search.agent.consumer module

Provides a record processor for MetadataIsAvailable notifications.

exception search.agent.consumer.DocumentFailed

Bases: RuntimeError

Raised when an arXiv paper could not be added to the search index.

exception search.agent.consumer.IndexingFailed

Bases: RuntimeError

Raised when indexing failed such that future success is unlikely.
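The difference between the two exceptions matters mostly at the call site. The following is a minimal sketch of how a caller might treat them, not code from this module: the exception classes are local stand-ins, and `index_in_batches` and its `index_papers` argument are hypothetical names introduced for illustration.

```python
from typing import Callable, List


class DocumentFailed(RuntimeError):
    """Stand-in for search.agent.consumer.DocumentFailed."""


class IndexingFailed(RuntimeError):
    """Stand-in for search.agent.consumer.IndexingFailed."""


def index_in_batches(
    batches: List[List[str]],
    index_papers: Callable[[List[str]], None],
) -> List[List[str]]:
    """Index each batch; return the batches that raised DocumentFailed."""
    failed: List[List[str]] = []
    for batch in batches:
        try:
            index_papers(batch)
        except DocumentFailed:
            # Per-document trouble: later batches may still succeed.
            failed.append(batch)
        # IndexingFailed is deliberately not caught, so it stops the loop
        # and propagates to the caller.
    return failed
```

Because both exceptions derive from RuntimeError, a broad `except RuntimeError` would swallow the distinction; catching DocumentFailed by name keeps the unrecoverable case visible.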

class search.agent.consumer.MetadataRecordProcessor(*args, **kwargs)

Bases: arxiv.base.agent.BaseConsumer

Consumes MetadataIsAvailable notifications and updates the search index accordingly.

MAX_ERRORS = 5

Maximum number of individual document failures tolerated before processing is aborted entirely.

index_paper(arxiv_id)

Index a single paper, including its previous versions.

Parameters:

arxiv_id (str) – A versionless arXiv e-print identifier.

Return type:

None

index_papers(arxiv_ids)

Index multiple papers, including their previous versions.

Parameters:

arxiv_ids (List[str]) – A list of versionless arXiv e-print identifiers.

Raises:
  • DocumentFailed – Indexing of the documents failed. The failure may have no bearing on whether subsequent papers succeed.
  • IndexingFailed – Indexing of the documents failed in a way that indicates recovery is unlikely for subsequent papers.
Return type:

None
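Both index_paper and index_papers expect versionless identifiers. As a minimal sketch, assuming the usual arXiv convention of a trailing `vN` version suffix, identifiers could be normalized before being passed in. The helpers `strip_version` and `versionless_ids` are hypothetical and not part of this module.

```python
import re
from typing import List

# Trailing version suffix such as "v2" (assumes the standard "vN" form).
_VERSION_SUFFIX = re.compile(r"v\d+$")


def strip_version(arxiv_id: str) -> str:
    """Return the identifier without a trailing version suffix, if any."""
    return _VERSION_SUFFIX.sub("", arxiv_id)


def versionless_ids(arxiv_ids: List[str]) -> List[str]:
    """Strip versions and drop duplicates while preserving input order."""
    return list(dict.fromkeys(strip_version(i) for i in arxiv_ids))
```

For example, `versionless_ids(["1710.01597v1", "1710.01597v2"])` yields a single entry, which avoids indexing the same paper twice when several versions appear in one batch.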

process_record(record)

Called for each record that is passed to process_records.

Parameters:
  • data (bytes)
  • partition_key (bytes)
  • sequence_number (int)
  • sub_sequence_number (int)
Raises:

IndexingFailed – Indexing of the document failed in a way that indicates recovery is unlikely for subsequent papers, or too many individual documents failed (see MAX_ERRORS).

Return type:

None
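The "too many individual documents failed" condition can be pictured as an error budget. The sketch below is an illustration under stated assumptions, not the module's implementation: `ErrorBudgetProcessor`, the injected `index_paper` callable, the `"arxiv_id"` record key, and the choice to abort once the count exceeds MAX_ERRORS are all hypothetical.

```python
from typing import Callable


class DocumentFailed(RuntimeError):
    """Stand-in for search.agent.consumer.DocumentFailed."""


class IndexingFailed(RuntimeError):
    """Stand-in for search.agent.consumer.IndexingFailed."""


class ErrorBudgetProcessor:
    """Hypothetical stand-in; not MetadataRecordProcessor itself."""

    MAX_ERRORS = 5

    def __init__(self, index_paper: Callable[[str], None]) -> None:
        self._index_paper = index_paper
        self._error_count = 0

    def process_record(self, record: dict) -> None:
        # Assumption: the identifier is available under a made-up key.
        arxiv_id = record["arxiv_id"]
        try:
            self._index_paper(arxiv_id)
        except DocumentFailed as exc:
            # A single bad document is tolerated until the budget is spent.
            self._error_count += 1
            if self._error_count > self.MAX_ERRORS:
                raise IndexingFailed("Too many document failures") from exc
        # An IndexingFailed raised by index_paper propagates unchanged.
```

Counting failures on the instance rather than per call means the budget spans the whole run, which matches the idea of aborting entirely once individual failures pile up.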