arxiv.canonical.services.store module

Persist changes to the canonical record.

Provides a CanonicalStore that stores resources in S3, using serialize.record to serialize and deserialize resources.

class arxiv.canonical.services.store.CanonicalStore(bucket, verify=False, region_name=None, endpoint_url=None, aws_access_key_id=None, aws_secret_access_key=None, read_only=True)[source]

Bases: arxiv.canonical.core.ICanonicalStorage

Persists the canonical record in S3.

The intended pattern for working with the canonical record is to use the domain.CanonicalRecord as the primary entrypoint for all operations. Consequently, this service offers only a single public instance method, load_record().

Persistence is achieved by attaching members to domain.CanonicalRecord, domain.Month, and domain.Listing instances that implement reads/writes to S3. In this way, consumers of arxiv.canonical.domain can largely work directly with domain.CanonicalRecord, and persistence is handled transparently.

can_resolve(uri)[source]

Indicate whether or not the implementation can resolve a URI.

Parameters

uri (D.URI) –

Return type

bool

inititalize()[source]

Return type

None

is_available(retries=0, read_timeout=5, connect_timeout=5)[source]

Determine whether or not we can read from/write to the store.

Return type

bool

list_subkeys(key)[source]

List all of the subkeys (direct descendants) of key in the record.

Parameters

key (URI) –

Returns

Items are the relative names of the descendants of key. For filesystem-based storage, this may be equivalent to os.listdir.

Return type

List[str]
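The idea of "direct descendants" can be illustrated over a flat key space such as the one S3 exposes. The following is a minimal sketch, not the library's implementation: the helper name direct_subkeys and the example key layout are hypothetical, and the real method takes a URI rather than a plain string.

```python
from typing import Iterable, List


def direct_subkeys(keys: Iterable[str], prefix: str) -> List[str]:
    """Return the relative names of the direct descendants of ``prefix``.

    Hypothetical helper: ``keys`` stands in for the flat key listing of a
    bucket, and ``prefix`` plays the role of the ``key`` parameter.
    """
    base = prefix.rstrip("/") + "/"
    names = set()
    for key in keys:
        if not key.startswith(base):
            continue
        # Keep only the first path segment below the prefix, so that deeper
        # descendants collapse into their top-level ancestor.
        first_segment = key[len(base):].split("/", 1)[0]
        if first_segment:
            names.add(first_segment)
    return sorted(names)


# Illustrative keys only; the actual record layout may differ.
example_keys = [
    "e-prints/2020/01/2001.00001/v1/content.pdf",
    "e-prints/2020/01/2001.00002/v1/content.pdf",
    "e-prints/2020/02/2002.00001/v1/content.pdf",
]
months = direct_subkeys(example_keys, "e-prints/2020")
```

As with os.listdir, the result contains names relative to the prefix rather than full keys.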

load(key)[source]

Make an IO that waits to load from the record until it is read().

Parameters

key (D.URI) –

Returns

Yields bytes when read. This may be a lazy IO object, so that reading is deferred until the latest possible time.

Return type

IO[bytes]
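Deferred loading of this kind can be sketched with a small file-like wrapper that only fetches its content on the first read(). This is an illustration under stated assumptions, not the class the store actually returns: DeferredReader and the fetch callable are hypothetical names.

```python
import io
from typing import Callable, List, Optional


class DeferredReader:
    """File-like wrapper that calls ``loader`` only on the first read()."""

    def __init__(self, loader: Callable[[], bytes]) -> None:
        self._loader = loader
        self._buffer: Optional[io.BytesIO] = None

    def read(self, size: int = -1) -> bytes:
        # The (potentially expensive) load happens here, not in __init__.
        if self._buffer is None:
            self._buffer = io.BytesIO(self._loader())
        return self._buffer.read(size)


fetch_calls: List[str] = []


def fetch() -> bytes:
    """Stand-in for a remote read, e.g. a GET against object storage."""
    fetch_calls.append("fetch")
    return b"hello world"


reader = DeferredReader(fetch)  # nothing has been fetched at this point
```

Creating many such readers is cheap; the cost of fetching is paid only for the ones that are actually read.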

load_entry(key)[source]

Load a bitstream entry.

Parameters

key (URI) – Key that identifies the bitstream in the record.

Returns

  • RecordStream – The bitstream resource.

  • str – Checksum of the bitstream (URL-safe base64-encoded md5 hash).

Return type

Tuple[RecordStream, str]
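A checksum in the format described above can be computed with the standard library alone. This is a minimal sketch: the function name urlsafe_md5 is hypothetical, and whether the library keeps the trailing "=" padding is an assumption here.

```python
import base64
import hashlib


def urlsafe_md5(content: bytes) -> str:
    """Return the URL-safe base64 encoding of the MD5 digest of ``content``."""
    digest = hashlib.md5(content).digest()  # 16 raw bytes
    # urlsafe_b64encode uses "-" and "_" in place of "+" and "/".
    # Padding is kept as-is (assumption; the library may strip it).
    return base64.urlsafe_b64encode(digest).decode("ascii")


checksum = urlsafe_md5(b"example bitstream content")
```

A caller can compare the returned string against a checksum computed from the loaded content to detect corruption.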

load_manifest(key)[source]

Load an integrity manifest.

Parameters

key (Key) – Key used to identify manifest in storage.

Return type

Manifest

property read_only

Determine whether or not this is a read-only session.

This is a read-only property, to discourage users of this class from changing it in runtime code. It should only be set via application configuration.

Return type

bool
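The general pattern of a flag that is set once at construction and exposed through a getter-only property looks roughly like the following. This is a sketch of the Python idiom, not the CanonicalStore source: ExampleStore and its put method are hypothetical.

```python
class ExampleStore:
    """Hypothetical store with a read-only flag fixed at construction."""

    def __init__(self, read_only: bool = True) -> None:
        self._read_only = read_only
        self._data = {}

    @property
    def read_only(self) -> bool:
        # No setter is defined, so assigning to ``store.read_only`` raises
        # AttributeError.
        return self._read_only

    def put(self, key: str, value: bytes) -> None:
        if self.read_only:
            raise RuntimeError("This is a read-only session")
        self._data[key] = value


store = ExampleStore(read_only=True)
```

Because the flag can only be passed to the constructor, it is naturally driven by application configuration rather than changed midway through a session.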

store_entry(ri)[source]

Store a bitstream entry in the record.

This method MUST decompress the content of the entry if it is gzipped (as is sometimes the case in the classic system) and update the CanonicalFile (ri.record.stream.domain).

Parameters

ri (IStorableEntry) – A storable bitstream.

Return type

None
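The decompression requirement can be sketched as follows. This is an assumption-laden illustration rather than the library's code: decompress_if_gzipped is a hypothetical helper, gzip detection by magic bytes is one possible approach, and updating the CanonicalFile metadata is not shown.

```python
import gzip
from typing import Tuple

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def decompress_if_gzipped(content: bytes) -> Tuple[bytes, bool]:
    """Return ``(content, was_gzipped)``, decompressing gzip data if present."""
    if content[:2] == GZIP_MAGIC:
        return gzip.decompress(content), True
    return content, False


raw = b"%PDF-1.5 example content"
stored, was_gzipped = decompress_if_gzipped(gzip.compress(raw))
```

When was_gzipped is true, the caller would also need to update the entry's metadata (for example its size and whether it is marked as gzipped) so that it describes the decompressed content.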

store_manifest(key, manifest)[source]

Store an integrity manifest.

Parameters

  • key (Key) – Key used to identify manifest in storage.

  • manifest (Manifest) – The manifest record to store.

Return type

None

exception arxiv.canonical.services.store.DoesNotExist[source]

Bases: Exception

The requested resource does not exist.

class arxiv.canonical.services.store.InMemoryStorage[source]

Bases: arxiv.canonical.core.ICanonicalStorage

can_resolve(uri)[source]

Indicate whether or not the implementation can resolve a URI.

Parameters

uri (D.URI) –

Return type

bool

list_subkeys(key)[source]

List all of the subkeys (direct descendants) of key in the record.

Parameters

key (URI) –

Returns

Items are the relative names of the descendants of key. For filesystem-based storage, this may be equivalent to os.listdir.

Return type

List[str]

load(key)[source]

Make an IO that waits to load from the record until it is read().

Parameters

key (D.URI) –

Returns

Yields bytes when read. This may be a lazy IO object, so that reading is deferred until the latest possible time.

Return type

IO[bytes]

load_entry(key)[source]

Load a bitstream entry.

Parameters

key (URI) – Key that identifies the bitstream in the record.

Returns

  • RecordStream – The bitstream resource.

  • str – Checksum of the bitstream (URL-safe base64-encoded md5 hash).

Return type

Tuple[RecordStream, str]

load_manifest(key)[source]

Load an integrity manifest.

Parameters

key (Key) – Key used to identify manifest in storage.

Return type

Manifest

store_entry(ri)[source]

Store a bitstream entry in the record.

This method MUST decompress the content of the entry if it is gzipped (as is sometimes the case in the classic system) and update the CanonicalFile (ri.record.stream.domain).

Parameters

ri (IStorableEntry) – A storable bitstream.

Return type

None

store_manifest(key, manifest)[source]

Store an integrity manifest.

Parameters

  • key (Key) – Key used to identify manifest in storage.

  • manifest (Manifest) – The manifest record to store.

Return type

None