arxiv.submission.services.plaintext.plaintext module
Provides integration with the plaintext extraction service.
This integration is focused on usage patterns required by the submission system. Specifically:
Must be able to request an extraction for a compiled submission.
Must be able to poll whether the extraction has completed.
Must be able to retrieve the raw binary content once the extraction has finished successfully.
Must raise an informative exception if something goes wrong.
This represents only a subset of the functionality provided by the plaintext service itself.
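The request, poll, and retrieve pattern described above can be sketched as follows. This is a minimal illustration, not the module's implementation: `FakePlainTextService` is a hypothetical in-memory stand-in that borrows only the three method names documented on this page, and `extract_plaintext`, `max_polls`, and `poll_interval` are names invented for the example. The real methods may take arguments beyond `source_id` that this reference does not show.

```python
import time


class FakePlainTextService:
    """Hypothetical in-memory stand-in for PlainTextService (not the real client)."""

    def __init__(self, polls_until_done: int = 2) -> None:
        self._remaining = {}
        self._polls_until_done = polls_until_done

    def request_extraction(self, source_id: str) -> None:
        # Start a pretend extraction that finishes after a few polls.
        self._remaining[source_id] = self._polls_until_done

    def extraction_is_complete(self, source_id: str) -> bool:
        # Each poll moves the pretend task one step closer to completion.
        self._remaining[source_id] = max(0, self._remaining[source_id] - 1)
        return self._remaining[source_id] == 0

    def retrieve_content(self, source_id: str) -> bytes:
        return f"plain text for {source_id}".encode("utf-8")


def extract_plaintext(service, source_id: str, max_polls: int = 10,
                      poll_interval: float = 0.0) -> bytes:
    """Request an extraction, poll until it completes, then fetch the content."""
    service.request_extraction(source_id)
    for _ in range(max_polls):
        if service.extraction_is_complete(source_id):
            return service.retrieve_content(source_id)
        time.sleep(poll_interval)
    raise TimeoutError(f"extraction for {source_id} did not complete in time")
```

Bounding the number of polls keeps a stalled extraction from blocking the caller indefinitely; a real caller would likely use a non-zero `poll_interval`.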

exception arxiv.submission.services.plaintext.plaintext.ExtractionFailed(msg, response)
Bases: arxiv.integration.api.exceptions.RequestFailed
The plain text extraction service failed to extract text.

exception arxiv.submission.services.plaintext.plaintext.ExtractionInProgress(msg, response)
Bases: arxiv.integration.api.exceptions.RequestFailed
An extraction is already in progress.

class arxiv.submission.services.plaintext.plaintext.PlainTextService(verify=True, headers={})
Bases: arxiv.integration.api.service.HTTPIntegration
Represents an interface to the plain text extraction service.

class Status
Bases: enum.Enum
Task statuses.
FAILED = 'failed'
IN_PROGRESS = 'in_progress'
SUCCEEDED = 'succeeded'
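As an illustration of how these values might be used, the sketch below maps a status string to an enum member by value. The `Status` class here is a stand-in that mirrors the three documented members, and both `parse_status` and the `'status'` payload key are assumptions made for the example; this reference does not show how the real client reads the status from a response.

```python
from enum import Enum


class Status(Enum):
    """Stand-in mirroring the documented members of PlainTextService.Status."""

    FAILED = 'failed'
    IN_PROGRESS = 'in_progress'
    SUCCEEDED = 'succeeded'


def parse_status(payload: dict) -> Status:
    """Look up the enum member whose value matches the payload's status string."""
    # The 'status' key is an assumption; the real response layout is not documented here.
    # Enum lookup by value raises ValueError for an unrecognised string.
    return Status(payload['status'])
```

Looking up by value rather than comparing raw strings means an unexpected status surfaces as a `ValueError` instead of being silently ignored.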

VERSION = 0.3
Version of the service for which this module is implemented.

extraction_is_complete()
Check the status of an extraction task by submission upload ID.
- Parameters
source_id (str) – ID of the submission upload workspace.
- Return type
bool
- Raises
ExtractionFailed – Raised if the task is in a failed state, or an unexpected condition is encountered.

request_extraction()
Make a request for plaintext extraction using the submission upload ID.
- Parameters
source_id (str) – ID of the submission upload workspace.
- Return type
None

retrieve_content()
Retrieve plain text content by submission upload ID.
- Parameters
source_id (str) – ID of the submission upload workspace.
- Return type
bytes
- Returns
Raw text content.
- Raises
RequestFailed – Raised if an unexpected status was encountered.
ExtractionInProgress – Raised if an extraction is currently in progress.
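Because `ExtractionInProgress` derives from `RequestFailed`, a caller that wants to treat "not ready yet" differently from other failures needs to catch the subclass first. The sketch below shows one way to do that. The exception classes are local stand-ins that copy only the `(msg, response)` signature and base class given on this page, and `StubService` and `content_if_ready` are hypothetical names introduced for the example.

```python
from typing import Optional


class RequestFailed(Exception):
    """Stand-in for arxiv.integration.api.exceptions.RequestFailed."""

    def __init__(self, msg: str, response: object) -> None:
        super().__init__(msg)
        self.response = response


class ExtractionInProgress(RequestFailed):
    """Stand-in for the ExtractionInProgress exception documented above."""


class StubService:
    """Hypothetical service whose content is either ready or still in progress."""

    def __init__(self, ready: bool) -> None:
        self.ready = ready

    def retrieve_content(self, source_id: str) -> bytes:
        if not self.ready:
            raise ExtractionInProgress("extraction still running", response=None)
        return b"extracted text"


def content_if_ready(service, source_id: str) -> Optional[bytes]:
    """Return the content, or None if the extraction has not finished yet."""
    try:
        return service.retrieve_content(source_id)
    except ExtractionInProgress:
        # Not an error for the caller: the content is simply not available yet.
        return None
    # Any other RequestFailed propagates to the caller unchanged.
```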
-
class