arxiv.canonical.register package

Register for the canonical record.

This module implements the high-level API for the arXiv canonical record. It orchestrates the classes in arxiv.canonical.domain, arxiv.canonical.record, and arxiv.canonical.integrity to implement reading from and writing to the record.

class arxiv.canonical.register.Base(name, domain, record, integrity, members=None)[source]

Bases: typing.Generic

Generic base class for all register classes.

This defines the abstract structure of a register class. Specifically, it specifies that instances of a register class are composed of a domain object, a record object, an integrity object, and a set of members. This allows us to define register classes that align domain, record, and integrity classes at a specific level of the record hierarchy.
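The composition described here can be illustrated with a minimal sketch. SketchRegister and its type variables are hypothetical stand-ins, not the real Base class; only the constructor arguments (name, domain, record, integrity, members) and the members/iter_members accessors mirror the signatures documented on this page. Storage, events, and manifests are left out.

```python
from typing import Dict, Generic, Iterator, Optional, TypeVar

# Type variables for the four parts of a register (names are illustrative).
_Domain = TypeVar("_Domain")
_Record = TypeVar("_Record")
_Integrity = TypeVar("_Integrity")
_Member = TypeVar("_Member")


class SketchRegister(Generic[_Domain, _Record, _Integrity, _Member]):
    """Hypothetical stand-in for a register class; not the real Base."""

    def __init__(
        self,
        name: str,
        domain: _Domain,
        record: _Record,
        integrity: _Integrity,
        members: Optional[Dict[str, _Member]] = None,
    ) -> None:
        self.name = name
        self.domain = domain
        self.record = record
        self.integrity = integrity
        self._members: Dict[str, _Member] = dict(members or {})

    @property
    def members(self) -> Dict[str, _Member]:
        """Accessor for the members of this register."""
        return self._members

    def iter_members(self) -> Iterator[_Member]:
        """Iterate over members in insertion order."""
        return iter(self._members.values())


# Two levels of a toy hierarchy: a "day" that contains one "e-print" register.
# Plain strings stand in for the domain, record, and integrity objects.
eprint = SketchRegister("2001.00001", "domain", "record", "integrity")
day = SketchRegister(
    "2020-01-01", "domain", "record", "integrity",
    members={eprint.name: eprint},
)
member_names = [member.name for member in day.iter_members()]
```

Each level holds the same four parts, so a collection register and its members can be handled uniformly.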

add_events(s, sources, *events)[source]

Add events to this register.

Return type

None

domain = None

The domain object on a register instance.

domain_type = None

The type of the domain object on a register instance.

integrity = None

The integrity object on a register instance.

integrity_type = None

The type of the integrity object on a register instance.

iter_members()[source]

Get an iterator over members in this register.

Return type

Iterable[~_Member]

classmethod load(s, sources, name, checksum=None)[source]

Load an instance of the register class from storage.

Return type

~_Self

member_type = None

The type of members contained by an instance of a register class.

property members

Accessor for the members of a register instance.

Return type

MutableMapping[~_MemberName, ~_Member]

property number_of_events

Number of events contained within a register instance.

Return type

int

property number_of_versions

Number of e-print versions contained within a register instance.

Return type

int

record = None

The record object on a register instance.

record_type = None

The type of the record object on a register instance.

save(s)[source]

Store changes to the integrity manifest for this register.

Return type

str

save_members(s, members)[source]

Save members that have changed, and update our manifest.

Return type

None

exception arxiv.canonical.register.ConsistencyError[source]

Bases: Exception

Operation was attempted that would violate consistency of the record.

class arxiv.canonical.register.IRegisterAPI(*args, **kwargs)[source]

Bases: typing_extensions.Protocol

Interface for the canonical register API.

add_events(*events)[source]

Add new events to the register.

Return type

None

load_eprint(identifier)[source]

Load an EPrint from the record.

Return type

EPrint

load_event(identifier)[source]

Load an Event by identifier.

Return type

Event

load_events(selector)[source]

Load all Event instances for a day, month, or year.

Return type

Tuple[Iterable[Event], int]

load_history(identifier)[source]

Load the event history of an EPrint.

Return type

Iterable[EventSummary]

load_listing(date, shard='listing')[source]

Load a Listing for a particular date.

Return type

Listing

load_version(identifier)[source]

Load an e-print Version from the record.

Return type

Version

class arxiv.canonical.register.ICanonicalStorage(*args, **kwargs)[source]

Bases: arxiv.canonical.core.ICanonicalSource, arxiv.canonical.core.IManifestStorage, typing_extensions.Protocol

Interface for services that store the canonical record.

list_subkeys(key)[source]

List all of the subkeys (direct descendants) of key in the record.

Parameters

key (URI) –

Returns

Items are the relative names of the descendants of key. For filesystem-based storage, this may be equivalent to os.listdir.

Return type

List[str]

load_entry(key)[source]

Load a bitstream entry.

Parameters

key (URI) – Key that identifies the bitstream in the record.

Returns

  • RecordStream – The bitstream resource.

  • str – Checksum of the bitstream (URL-safe base64-encoded md5 hash).

Return type

Tuple[RecordStream, str]
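A checksum of the kind described above (URL-safe base64-encoded md5 hash) can be computed with the standard library alone. This is a sketch: the function name urlsafe_md5_checksum is hypothetical, and whether the real implementation keeps the trailing "=" padding is an assumption.

```python
import base64
import hashlib


def urlsafe_md5_checksum(content: bytes) -> str:
    """Return the URL-safe base64 encoding of the md5 digest of content."""
    digest = hashlib.md5(content).digest()  # 16 raw bytes
    # urlsafe_b64encode uses "-" and "_" instead of "+" and "/".
    return base64.urlsafe_b64encode(digest).decode("ascii")


empty_checksum = urlsafe_md5_checksum(b"")
```

Because the digest is always 16 bytes, the encoded checksum is always 24 characters long when padding is kept.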

store_entry(ri)[source]

Store a bitstream entry in the record.

This method MUST decompress the content of the entry if it is gzipped (as is sometimes the case in the classic system) and update the CanonicalFile (ri.record.stream.domain).

Parameters

ri (IStorableEntry) – A storable bitstream.

Return type

None
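The decompression requirement for store_entry could be met along the following lines. This is a sketch under assumptions: the helper maybe_decompress is hypothetical, and detecting gzip by its two-byte magic number is one possible approach, not necessarily what any real storage service does. Updating the CanonicalFile afterwards is not shown.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def maybe_decompress(content: bytes) -> bytes:
    """Return content decompressed if it looks gzipped, else unchanged."""
    if content[:2] == GZIP_MAGIC:
        return gzip.decompress(content)
    return content


plain = b"\\documentclass{article}"
from_gzipped = maybe_decompress(gzip.compress(plain))
from_plain = maybe_decompress(plain)
```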

class arxiv.canonical.register.ICanonicalSource(*args, **kwargs)[source]

Bases: typing_extensions.Protocol

Interface for source services, used to dereference URIs.

can_resolve(uri)[source]

Indicate whether or not the implementation can resolve a URI.

Parameters

uri (D.URI) –

Return type

bool

load(key)[source]

Make an IO that waits to load from the record until it is read().

Parameters

key (D.URI) –

Returns

Yields bytes when read. This may be a lazy IO object, so that reading is deferred until the latest possible time.

Return type

IO[bytes]
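The deferred loading described for load() can be sketched as a small file-like object that only touches storage on the first read(). DeferredReader and load_from_storage are hypothetical names used for illustration; the real implementation may return a different kind of IO object.

```python
import io
from typing import Callable, List, Optional


class DeferredReader:
    """Hypothetical file-like object that loads its content on first read()."""

    def __init__(self, loader: Callable[[], bytes]) -> None:
        self._loader = loader
        self._buffer: Optional[io.BytesIO] = None

    def read(self, size: int = -1) -> bytes:
        if self._buffer is None:
            # The (possibly expensive) load happens here, not at construction.
            self._buffer = io.BytesIO(self._loader())
        return self._buffer.read(size)


# Records each time the stand-in storage is actually accessed.
calls: List[str] = []


def load_from_storage() -> bytes:
    calls.append("load")
    return b"content of the bitstream"


reader = DeferredReader(load_from_storage)
calls_before_read = len(calls)  # nothing has been loaded yet
head = reader.read(7)
rest = reader.read()
```

Constructing the reader is cheap, so many readers can be created up front and only those that are actually consumed incur a load.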

class arxiv.canonical.register.IStorableEntry(*args, **kwargs)[source]

Bases: typing_extensions.Protocol

Minimal interface for a bitstream entry that can be stored.

Services that implement ICanonicalStorage can assume that the attributes of this interface are available on objects passed for storing.

property checksum

URL-safe b64-encoded md5 hash.

Return type

str

name = None

Name of the entry.

property record

Reference to a RecordEntry.

Return type

RecordEntry[~_EDomain]

update_checksum()[source]

Update the integrity checksum for this entry.

Return type

None

exception arxiv.canonical.register.NoSuchResource[source]

Bases: Exception

Operation was attempted on a nonexistent resource.

class arxiv.canonical.register.RegisterAPI(storage, sources, name='all')[source]

Bases: arxiv.canonical.core.IRegisterAPI

The main public API for the register.

add_events(*events)[source]

Add new events to the register.

Return type

None

load_eprint(identifier)[source]

Load an EPrint from the record.

Return type

EPrint

load_event(identifier)[source]

Load an Event by identifier.

Return type

Event

load_events(selector)[source]

Load all Event instances for a day, month, or year.

Returns an Event generator that loads event data lazily from the underlying storage, so that in general we are loading only the data that we are actually consuming. Events are generated in order.

Be warned: evaluating the entire generator at once (e.g. by coercing it to a list) may load a considerable amount of data into memory (and use a lot of I/O), especially if events for an entire year are loaded.

Parameters

selector (int, tuple, or datetime.date) – Indicates the year (int), month (Tuple[int, int]), or day for which events should be loaded.

Returns

  • generator – Yields Event instances in chronological order.

  • int – An estimate of the number of events that will be generated. Note that the actual number may change (especially for large selections) because the record may be updated while the generator is being consumed.

Return type

Tuple[Iterable[Event], int]
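The selector forms and the lazy consumption pattern described above can be sketched without the real register. Everything here is a stand-in: fake_load_events only mimics the documented (generator, estimate) return shape, and selector_granularity is a hypothetical helper showing one way to distinguish the three selector types.

```python
import datetime
from itertools import islice
from typing import Iterator, List, Tuple, Union

Selector = Union[int, Tuple[int, int], datetime.date]


def selector_granularity(selector: Selector) -> str:
    """Classify a selector as 'year', 'month', or 'day'."""
    if isinstance(selector, datetime.date):
        return "day"
    if isinstance(selector, tuple) and len(selector) == 2:
        return "month"
    if isinstance(selector, int):
        return "year"
    raise TypeError(f"Unsupported selector: {selector!r}")


# Records which items the stand-in loader has actually produced.
loaded: List[int] = []


def fake_load_events(selector: Selector) -> Tuple[Iterator[str], int]:
    """Hypothetical stand-in that mimics the (generator, estimate) shape."""
    estimate = 1000

    def generate() -> Iterator[str]:
        for i in range(estimate):
            loaded.append(i)  # simulates reading one event from storage
            yield f"event-{i}"

    return generate(), estimate


events, estimate = fake_load_events((2019, 4))
first_three = list(islice(events, 3))  # consumes only three events
```

Using islice (or a plain for loop with an early break) keeps memory and I/O proportional to the events actually consumed, whereas list(events) would pull everything at once.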

load_history(identifier)[source]

Load the event history of an EPrint.

Return type

Iterable[EventSummary]

load_listing(date, shard='listing')[source]

Load a Listing for a particular date.

Return type

Listing

load_render(identifier)[source]

Return type

Tuple[CanonicalFile, IO[bytes]]

load_source(identifier)[source]

Return type

Tuple[CanonicalFile, IO[bytes]]

load_version(identifier)[source]

Load an e-print Version from the record.

Return type

Version

class arxiv.canonical.register.RegisterDay(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

Representation of a day-block of e-prints in the canonical register.

domain_type

alias of arxiv.canonical.domain.block.EPrintDay

integrity_type

alias of arxiv.canonical.integrity.version.IntegrityDay

member_type

alias of RegisterEPrint

record_type

alias of arxiv.canonical.record.version.RecordDay

class arxiv.canonical.register.RegisterEPrint(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

Representation of an e-print in the canonical register.

Organizes a series of one or more RegisterVersion instances.

add_event_cross(s, sources, event)[source]

Add a cross-list event.

Return type

List[RegisterVersion]

add_event_migrate(s, sources, event)[source]

Add a data-migration event.

Return type

List[RegisterVersion]

add_event_migrate_metadata(s, sources, event)[source]

Add a metadata-migration event.

Return type

List[RegisterVersion]

add_event_new(s, sources, event)[source]

Add an event that results in a new version.

Return type

List[RegisterVersion]

add_event_replace(s, sources, event)[source]

Add an event that generates a replacement version.

Return type

List[RegisterVersion]

add_event_update(s, sources, event)[source]

Add an event that results in an update to a version.

Return type

List[RegisterVersion]

add_event_update_metadata(s, sources, event)[source]

Add an event that results in an update to metadata of a version.

Return type

List[RegisterVersion]

add_event_withdraw(s, sources, event)[source]

Add an event that withdraws an e-print.

Return type

List[RegisterVersion]

domain_type

alias of arxiv.canonical.domain.eprint.EPrint

integrity_type

alias of arxiv.canonical.integrity.version.IntegrityEPrint

member_type

alias of arxiv.canonical.register.version.RegisterVersion

record_type

alias of arxiv.canonical.record.version.RecordEPrint

class arxiv.canonical.register.RegisterEPrints(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

Representation of the complete set of e-prints in the register.

domain_type

alias of arxiv.canonical.domain.block.AllEPrints

integrity_type

alias of arxiv.canonical.integrity.version.IntegrityEPrints

member_type

alias of RegisterYear

record_type

alias of arxiv.canonical.record.version.RecordEPrints

class arxiv.canonical.register.RegisterListing(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

add_events(_, sources, *events)[source]

Add events to the terminal listing R.

Overrides the base method since this is a terminal record, not a collection.

Return type

None

classmethod create(s, sources, d)[source]

Return type

RegisterListing

delete(s)[source]

Return type

None

domain_type

alias of arxiv.canonical.domain.listing.Listing

integrity_type

alias of arxiv.canonical.integrity.listing.IntegrityListing

classmethod load(s, sources, identifier, checksum=None)[source]

Load an instance of the register class from storage.

Return type

~_Self

member_type

alias of builtins.NoneType

property number_of_events

Number of events contained within a register instance.

Return type

int

property number_of_versions

Number of e-print versions contained within a register instance.

Return type

int

record_type

alias of arxiv.canonical.record.listing.RecordListing

save(s)[source]

Save this file.

Overrides the base method since this is a terminal record, not a collection.

Return type

str

class arxiv.canonical.register.RegisterListings(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

domain_type

alias of arxiv.canonical.domain.listing.AllListings

integrity_type

alias of arxiv.canonical.integrity.listing.IntegrityListings

member_type

alias of RegisterListingYear

record_type

alias of arxiv.canonical.record.listing.RecordListings

class arxiv.canonical.register.RegisterListingDay(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

add_listing(s, sources, d)[source]

Return type

None

domain_type

alias of arxiv.canonical.domain.listing.ListingDay

integrity_type

alias of arxiv.canonical.integrity.listing.IntegrityListingDay

classmethod load_event(s, sources, identifier)[source]

Return type

Event

member_type

alias of RegisterListing

record_type

alias of arxiv.canonical.record.listing.RecordListingDay

class arxiv.canonical.register.RegisterListingMonth(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

domain_type

alias of arxiv.canonical.domain.listing.ListingMonth

integrity_type

alias of arxiv.canonical.integrity.listing.IntegrityListingMonth

member_type

alias of RegisterListingDay

record_type

alias of arxiv.canonical.record.listing.RecordListingMonth

class arxiv.canonical.register.RegisterListingYear(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

domain_type

alias of arxiv.canonical.domain.listing.ListingYear

integrity_type

alias of arxiv.canonical.integrity.listing.IntegrityListingYear

member_type

alias of RegisterListingMonth

record_type

alias of arxiv.canonical.record.listing.RecordListingYear

class arxiv.canonical.register.RegisterMetadata(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

delete(s)[source]

Return type

None

domain_type

alias of arxiv.canonical.domain.version.Version

integrity_type

alias of arxiv.canonical.integrity.metadata.IntegrityMetadata

member_type

alias of builtins.NoneType

record_type

alias of arxiv.canonical.record.metadata.RecordMetadata

save(s)[source]

Save this file.

Overrides the base method since this is a terminal record, not a collection.

Return type

str

class arxiv.canonical.register.RegisterMonth(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

Representation of a month-block in the canonical register.

domain_type

alias of arxiv.canonical.domain.block.EPrintMonth

integrity_type

alias of arxiv.canonical.integrity.version.IntegrityMonth

member_type

alias of RegisterDay

record_type

alias of arxiv.canonical.record.version.RecordMonth

class arxiv.canonical.register.RegisterVersion(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

classmethod create(s, sources, d, save_members=True)[source]

Return type

RegisterVersion

domain_type

alias of arxiv.canonical.domain.version.Version

integrity_type

alias of arxiv.canonical.integrity.version.IntegrityVersion

classmethod load(s, sources, identifier, checksum=None)[source]

Load an e-print Version from s.

This method is overridden since it uses a different member mapping struct than higher-level collection types.

Return type

~_Self

property member_names

Return type

Set[str]

member_type

alias of arxiv.canonical.register.file.RegisterFile

property number_of_events

Number of events contained within a register instance.

Return type

int

property number_of_versions

Number of e-print versions contained within a register instance.

Return type

int

record_type

alias of arxiv.canonical.record.version.RecordVersion

save_members(s, members)[source]

Save members that have changed, and update our manifest.

Return type

None

update(s, sources, version)[source]

Update a version in place.

Removes any members (files) not in the passed Version, and retains and ignores members without any content (assumes that this is a partial update). Saves any new/changed members, and updates the manifest.

Return type

None
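The partial-update rule described for update() can be sketched with plain dictionaries. This is one reading of the description, not the actual implementation: plan_update is a hypothetical helper, member content is modelled as bytes, and None stands for "member without any content".

```python
from typing import Dict, List, Optional, Tuple


def plan_update(
    current: Dict[str, bytes],
    incoming: Dict[str, Optional[bytes]],
) -> Tuple[List[str], List[str], List[str]]:
    """Return (removed, retained, saved) member names for a partial update."""
    # Members absent from the passed version are removed.
    removed = sorted(name for name in current if name not in incoming)
    # Members passed without content are kept as they are.
    retained = sorted(
        name for name, content in incoming.items()
        if content is None and name in current
    )
    # Members with new or changed content are saved.
    saved = sorted(
        name for name, content in incoming.items()
        if content is not None and current.get(name) != content
    )
    return removed, retained, saved


current = {"metadata": b"old", "source": b"tex", "render": b"pdf"}
incoming = {"metadata": b"new", "source": None}
removed, retained, saved = plan_update(current, incoming)
```

In this toy case the render is dropped because it is missing from the passed version, the source is left untouched because it carries no content, and only the changed metadata is written.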

class arxiv.canonical.register.RegisterYear(name, domain, record, integrity, members=None)[source]

Bases: arxiv.canonical.register.core.Base

Representation of a year-block in the canonical register.

domain_type

alias of arxiv.canonical.domain.block.EPrintYear

integrity_type

alias of arxiv.canonical.integrity.version.IntegrityYear

member_type

alias of RegisterMonth

record_type

alias of arxiv.canonical.record.version.RecordYear
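The member_type aliases documented for each register class on this page define two containment chains, one for e-prints and one for listings. The sketch below transcribes those aliases into a plain dictionary of class names and walks it; MEMBER_TYPE and member_chain are illustrative names, and the real classes are not imported. The member type of RegisterFile is not documented on this page, so the e-print chain stops there.

```python
from typing import Dict, List, Optional

# member_type aliases as documented on this page (None means NoneType).
MEMBER_TYPE: Dict[str, Optional[str]] = {
    "RegisterEPrints": "RegisterYear",
    "RegisterYear": "RegisterMonth",
    "RegisterMonth": "RegisterDay",
    "RegisterDay": "RegisterEPrint",
    "RegisterEPrint": "RegisterVersion",
    "RegisterVersion": "RegisterFile",
    "RegisterListings": "RegisterListingYear",
    "RegisterListingYear": "RegisterListingMonth",
    "RegisterListingMonth": "RegisterListingDay",
    "RegisterListingDay": "RegisterListing",
    "RegisterListing": None,   # terminal record
    "RegisterMetadata": None,  # terminal record
}


def member_chain(name: str) -> List[str]:
    """Follow member_type links from name down to the last known level."""
    chain = [name]
    next_name = MEMBER_TYPE.get(name)
    while next_name is not None:
        chain.append(next_name)
        next_name = MEMBER_TYPE.get(next_name)
    return chain


eprint_chain = member_chain("RegisterEPrints")
listing_chain = member_chain("RegisterListings")
```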