arxiv.canonical.domain.content module
Core concepts for characterizing bitstream/version content.
class arxiv.canonical.domain.content.ContentType
    Bases: enum.Enum
    Characterization of the content type of an individual bitstream.
    abs = 'abs'
    dvi = 'dvi'
    property ext
        The preferred filename extension for this ContentType.
    html = 'html'
    json = 'json'
    make_filename(identifier, is_gzipped=False)
        Make a filename for a bitstream with this ContentType.
    property mime_type
        The MIME content type for this ContentType.
    pdf = 'pdf'
    ps = 'ps'
    tar = 'tar'
    tex = 'tex'
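The following is a minimal sketch of how an enum of this shape might expose ext, mime_type, and make_filename. It is not the library's implementation: the class name ContentTypeSketch, the MIME table, and the filename layout "<identifier>.<ext>[.gz]" are all assumptions made for illustration, and the real mappings may differ.

```python
from enum import Enum

# Hypothetical MIME lookup; the real mapping is not shown in this reference.
_MIME_TYPES = {
    "abs": "text/plain",
    "dvi": "application/x-dvi",
    "html": "text/html",
    "json": "application/json",
    "pdf": "application/pdf",
    "ps": "application/postscript",
    "tar": "application/x-tar",
    "tex": "application/x-tex",
}


class ContentTypeSketch(Enum):
    """Stand-in for ContentType, with the same member names and values."""

    abs = "abs"
    dvi = "dvi"
    html = "html"
    json = "json"
    pdf = "pdf"
    ps = "ps"
    tar = "tar"
    tex = "tex"

    @property
    def ext(self) -> str:
        # Assumption: the preferred extension equals the member value.
        return self.value

    @property
    def mime_type(self) -> str:
        return _MIME_TYPES[self.value]

    def make_filename(self, identifier: str, is_gzipped: bool = False) -> str:
        # Assumption: filenames are "<identifier>.<ext>", plus ".gz" if gzipped.
        name = f"{identifier}.{self.ext}"
        return f"{name}.gz" if is_gzipped else name


pdf_name = ContentTypeSketch.pdf.make_filename("2101.00001v1")
ps_name = ContentTypeSketch.ps.make_filename("2101.00001v1", is_gzipped=True)
```

Properties and methods defined on an Enum subclass do not become members, so iterating over ContentTypeSketch still yields only the eight content types.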
arxiv.canonical.domain.content.DISSEMINATION_FORMATS_BY_SOURCE_EXT = [
    ('.tar.gz', None),
    ('.tar', None),
    ('.dvi.gz', None),
    ('.dvi', None),
    ('.pdf', [<ContentType.pdf: 'pdf'>]),
    ('.ps.gz', [<ContentType.pdf: 'pdf'>, <ContentType.ps: 'ps'>]),
    ('.ps', [<ContentType.pdf: 'pdf'>, <ContentType.ps: 'ps'>]),
    ('.html.gz', [<ContentType.html: 'html'>]),
    ('.html', [<ContentType.html: 'html'>]),
    ('.gz', None)
]
    Dissemination formats that can be inferred from source file extension.
    Note: This is largely to support format discovery in classic. In the NG canonical record, this should all be explicit.
class arxiv.canonical.domain.content.SourceFileType
    Bases: enum.Enum
    Source file types are represented by single-character codes.
    Ancillary = 'A'
        Submission includes ancillary files in the /anc directory.
    DCPilot = 'B'
        Submission has associated data in the DC pilot system.
    DOCX = 'X'
        Submission in Microsoft DOCX (Office Open XML) format.
    HTML = 'H'
        Multi-file HTML submission.
    Ignore = 'I'
        All files are automatically ignored; no paper is available.
    ODF = 'O'
        Submission in Open Document Format.
    PDFLaTeX = 'D'
        A TeX submission that must be processed with PDFlatex.
    PDFOnly = 'F'
        PDF-only with .tar.gz package (likely because of anc files).
    PostscriptOnly = 'P'
        Multi-file PS submission.
        It is not necessary to indicate P for a single-file PS submission, since in that case the source file has a .ps.gz extension.
    SourceEncrypted = 'S'
        Source is encrypted and should not be made available.
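Because each member's value is a single character, standard Enum lookup by value is enough to decode a code. The sketch below redeclares a few of the members locally so that it runs on its own. The helper decode_source_codes, and the idea that several codes may be concatenated into one string, are assumptions for illustration and are not taken from the library.

```python
from enum import Enum
from typing import List


class SourceFileTypeSketch(Enum):
    """Subset of the documented codes, redeclared for a standalone example."""

    Ancillary = "A"
    PDFLaTeX = "D"
    PDFOnly = "F"
    HTML = "H"
    SourceEncrypted = "S"


def decode_source_codes(codes: str) -> List[SourceFileTypeSketch]:
    """Map each character of ``codes`` to its enum member (hypothetical helper).

    Enum lookup by value raises ValueError for a character that is not a
    known code, so malformed input fails loudly rather than being dropped.
    """
    return [SourceFileTypeSketch(code) for code in codes]


decoded = decode_source_codes("AD")
```

Here decoded holds the Ancillary and PDFLaTeX members, in the order the codes appear in the input string.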
class arxiv.canonical.domain.content.SourceType(value)
    Bases: str
    Characterizes a version source package.
    property available_formats
        List the available dissemination formats for this source type.
        Depending on the original source type, we may not be able to provide all supported formats.
        This does not include the source format. Note also that this does not enforce rules about what should be displayed as an option or provided to end users.
-
property
arxiv.canonical.domain.content.available_formats_by_ext(filename)
    Attempt to determine the available dissemination formats by file extension.
    It is sometimes (but not always) possible to infer the available dissemination formats based on the filename extension of the source package.
    Note: This is largely to support format discovery in classic. In the NG canonical record, this should all be explicit.
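One way such a lookup could work is a first-match scan over an ordered table of suffixes, as sketched below. The table mirrors the documented DISSEMINATION_FORMATS_BY_SOURCE_EXT entries, but the enum ContentTypeSketch, the function formats_by_ext_sketch, and the choice to return None for an unrecognised extension are assumptions; the library's actual behaviour is not shown in this reference.

```python
from enum import Enum
from typing import List, Optional, Tuple


class ContentTypeSketch(Enum):
    """Stand-in for the ContentType members used by the table."""

    pdf = "pdf"
    ps = "ps"
    html = "html"


# Order matters: longer suffixes such as ".tar.gz" must precede ".gz",
# otherwise the generic ".gz" entry would match first.
FORMATS_BY_EXT: List[Tuple[str, Optional[List[ContentTypeSketch]]]] = [
    (".tar.gz", None),
    (".tar", None),
    (".dvi.gz", None),
    (".dvi", None),
    (".pdf", [ContentTypeSketch.pdf]),
    (".ps.gz", [ContentTypeSketch.pdf, ContentTypeSketch.ps]),
    (".ps", [ContentTypeSketch.pdf, ContentTypeSketch.ps]),
    (".html.gz", [ContentTypeSketch.html]),
    (".html", [ContentTypeSketch.html]),
    (".gz", None),
]


def formats_by_ext_sketch(filename: str) -> Optional[List[ContentTypeSketch]]:
    """Return the formats for the first matching suffix (hypothetical helper).

    None means the formats cannot be inferred from the extension alone.
    """
    for ext, formats in FORMATS_BY_EXT:
        if filename.endswith(ext):
            return formats
    return None  # Assumption: unknown extensions are treated as "cannot infer".


ps_formats = formats_by_ext_sketch("paper.ps.gz")
tar_formats = formats_by_ext_sketch("paper.tar.gz")
```

In this sketch a None result does not mean that no formats exist, only that the filename is not enough to decide, which matches the "sometimes (but not always)" caveat in the description.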