arxiv.canonical.classic.abs module¶
Parse fields from a single arXiv abstract (.abs) file.
-
class
arxiv.canonical.classic.abs.
AbsData
(identifier, submitter, submitted_date, announced_month, updated_date, license, primary_classification, title, abstract, authors, size_kilobytes, submission_type, secondary_classification, source_type, journal_ref, report_num, doi, msc_class, acm_class, proxy, comments, previous_versions)[source]¶ Bases:
tuple
-
property
abstract
¶ Alias for field number 8
-
property
acm_class
¶ Alias for field number 18
-
property
announced_month
¶ Alias for field number 3
-
property
authors
¶ Alias for field number 9
-
property
comments
¶ Alias for field number 20
-
property
doi
¶ Alias for field number 16
-
property
identifier
¶ Alias for field number 0
-
property
journal_ref
¶ Alias for field number 14
-
property
license
¶ Alias for field number 5
-
property
msc_class
¶ Alias for field number 17
-
property
previous_versions
¶ Alias for field number 21
-
property
primary_classification
¶ Alias for field number 6
-
property
proxy
¶ Alias for field number 19
-
property
report_num
¶ Alias for field number 15
-
property
secondary_classification
¶ Alias for field number 12
-
property
size_kilobytes
¶ Alias for field number 10
-
property
source_type
¶ Alias for field number 13
-
property
submission_type
¶ Alias for field number 11
-
property
submitted_date
¶ Alias for field number 2
-
property
submitter
¶ Alias for field number 1
-
property
title
¶ Alias for field number 7
-
property
updated_date
¶ Alias for field number 4
-
class
arxiv.canonical.classic.abs.
AbsRef
(identifier, submitted_date, announced_month, source_type, size_kilobytes)[source]¶ Bases:
tuple
-
property
announced_month
¶ Alias for field number 2
-
property
identifier
¶ Alias for field number 0
-
property
size_kilobytes
¶ Alias for field number 4
-
property
source_type
¶ Alias for field number 3
-
property
submitted_date
¶ Alias for field number 1
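The "Alias for field number N" entries mean that each attribute is a named view onto position N of the underlying tuple. A minimal sketch of that behavior follows, using a locally defined stand-in rather than the real class: `AbsRefSketch` and all of the values are made up for illustration, and the actual field types are not stated in this reference.

```python
from collections import namedtuple

# Local stand-in that mirrors the documented field order of AbsRef.
# It is not imported from arxiv.canonical; the values below are made up.
AbsRefSketch = namedtuple(
    "AbsRefSketch",
    ["identifier", "submitted_date", "announced_month",
     "source_type", "size_kilobytes"],
)

ref = AbsRefSketch("hep-th/9901001v1", "1999-01-01", "1999-01", "", 12)

# "Alias for field number N": attribute access and index N agree.
assert ref.identifier == ref[0]
assert ref.announced_month == ref[2]
assert ref.size_kilobytes == ref[4]

# Tuples are immutable, so _replace returns a modified copy.
bigger = ref._replace(size_kilobytes=20)
```

Because the fields are positional, reordering them would silently change what each index means, which is why the reference lists the field number for every property.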
-
arxiv.canonical.classic.abs.
NAMED_FIELDS
= ['Title', 'Authors', 'Categories', 'Comments', 'Proxy', 'Report-no', 'ACM-class', 'MSC-class', 'Journal-ref', 'DOI', 'License']¶ Fields that may be parsed from the key-value pairs in the second major component of an .abs string. Field names are not normalized.
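As an illustration of what parsing these key-value pairs might look like, here is a rough sketch. It is not the module's parser: the function name `parse_named_fields`, the sample text, and the assumption that indented lines continue the previous value are all hypothetical. Keys are kept exactly as written, in line with the note that field names are not normalized.

```python
from typing import Dict, Optional

NAMED_FIELDS = ['Title', 'Authors', 'Categories', 'Comments', 'Proxy',
                'Report-no', 'ACM-class', 'MSC-class', 'Journal-ref',
                'DOI', 'License']


def parse_named_fields(component: str) -> Dict[str, str]:
    """Collect 'Name: value' pairs for the names listed in NAMED_FIELDS."""
    fields: Dict[str, str] = {}
    current: Optional[str] = None
    for line in component.splitlines():
        name, sep, value = line.partition(":")
        if sep and name in NAMED_FIELDS:
            current = name
            fields[current] = value.strip()
        elif line[:1].isspace() and line.strip():
            # Assumption: an indented line continues the previous value.
            if current is not None:
                fields[current] += " " + line.strip()
        else:
            # A blank or unrecognized line ends the current field.
            current = None
    return fields


# Made-up sample; not taken from a real .abs file.
sample = (
    "Title: An example title that wraps\n"
    "  onto a second line\n"
    "Authors: A. Author, B. Author\n"
    "Categories: hep-th math.MP\n"
    "Unknown-field: ignored\n"
)
parsed = parse_named_fields(sample)
```

Splitting on the first colon only keeps values that themselves contain colons, such as a title with a subtitle, intact.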
-
exception
arxiv.canonical.classic.abs.
NoSuchAbs
[source]¶ Bases:
RuntimeError
-
arxiv.canonical.classic.abs.
REQUIRED_FIELDS
= ['title', 'authors', 'abstract']¶ Required parsed fields with normalized field names.
Note the absence of ‘categories’ as a required field. A subset of version-affixed .abs files with the old identifiers predate the introduction of categories and therefore do not have a “Categories:” line; only the (higher-level) archive and group can be inferred, and this must be done via the identifier itself.
The latest versions of these papers should always have the “Categories:” line.
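The two ideas above, checking for required fields and falling back to the identifier when there is no “Categories:” line, can be sketched as follows. Both helper names are hypothetical and are not part of this module. The sketch relies only on the fact that old-style identifiers put the archive before a slash (for example `hep-th/9901001`); it does not attempt to infer the group, which would need a lookup table not shown here.

```python
from typing import Dict, List, Optional

REQUIRED_FIELDS = ['title', 'authors', 'abstract']


def missing_required(fields: Dict[str, str]) -> List[str]:
    """Return the required field names that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]


def archive_from_old_identifier(identifier: str) -> Optional[str]:
    """Return the archive part of an old-style identifier, else None."""
    archive, sep, _ = identifier.partition("/")
    return archive if sep else None


# Made-up parsed fields with normalized (lowercase) names.
fields = {"title": "An example", "authors": "A. Author", "abstract": ""}
missing = missing_required(fields)
archive = archive_from_old_identifier("hep-th/9901001v1")
```

Note that a record without ‘categories’ still passes the required-field check here, which matches the list above.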
-
arxiv.canonical.classic.abs.
iter_all
(data_path, from_id=None, to_id=None)[source]¶ List all of the identifiers for which we have abs files.
The “latest” section will have an abs file for every e-print, so that’s the only place we need look.
- Return type
-
arxiv.canonical.classic.abs.
latest_path_month
(data_path, identifier)[source]¶ Get the base path for the month block containing the “latest” e-prints.
This is where the most recent version of each e-print always lives.
- Return type
-
arxiv.canonical.classic.abs.
list_versions
(data_path, identifier)[source]¶ List all of the versions for an identifier from abs files.
This works by looking at the presence of abs files in both the “latest” and “original” locations.
- Return type
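A rough sketch of the idea behind combining the two locations is shown below. It is not the module's implementation: the function name, its parameters, and the file naming are assumptions made for illustration (originals named `<stem>vN.abs`, the latest copy named `<stem>.abs` with no version suffix). The directories are passed in explicitly so that no particular layout under `data_path` is implied.

```python
import os
import re
from typing import List


def list_versions_sketch(latest_dir: str, original_dir: str,
                         stem: str) -> List[int]:
    """Return sorted version numbers found for the file-name stem."""
    pattern = re.compile(re.escape(stem) + r"v(\d+)\.abs$")
    versions = set()
    if os.path.isdir(original_dir):
        for name in os.listdir(original_dir):
            match = pattern.match(name)
            if match:
                versions.add(int(match.group(1)))
    # Assumption: the "latest" file carries no version suffix, so treat it
    # as one version newer than anything found among the originals.
    if os.path.isfile(os.path.join(latest_dir, stem + ".abs")):
        versions.add(max(versions, default=0) + 1)
    return sorted(versions)
```

Under these assumptions, an e-print with only a "latest" file has a single version, and each file in the "original" location accounts for one earlier version.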
-
arxiv.canonical.classic.abs.
original_path_month
(data_path, identifier)[source]¶ Get the main base path for an abs file.
This is where all of the versions except for the most recent one live.
- Return type
-
arxiv.canonical.classic.abs.
parse_first
(data_path, identifier)[source]¶ Parse the abs for the first version of an e-print.
- Return type