compiler.services.store package

Content store for compiled representations of papers.

Uses S3 as the underlying storage facility.

The intended use pattern is that a client (e.g. an API controller) checks for a compilation using the source ID (e.g. the file manager source_id), the format, and the checksum of the source package (as reported by the FM service) before taking any IO-intensive actions. See Store.get_status().

Similarly, a client that needs to verify that a compilation product is available for a specific source checksum should call Store.get_status() before calling Store.retrieve(). For that reason, Store.retrieve() is agnostic about checksums. This saves an extra GET request to S3 each time a compiled resource is retrieved.
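
The check-then-retrieve pattern described above can be sketched as follows. This is only an illustration under stated assumptions: InMemoryStore, the simplified Task record (including its completed field), and fetch_if_compiled are hypothetical stand-ins written for this example, not part of the package. Only the method names get_status() and retrieve() and the DoesNotExist exception mirror the documented API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


class DoesNotExist(RuntimeError):
    """Stand-in for compiler.services.store.DoesNotExist."""


@dataclass
class Task:
    """Hypothetical, simplified status record (the real Task may differ)."""
    src_id: str
    chk: str
    out_fmt: str
    completed: bool


class InMemoryStore:
    """Hypothetical stand-in for Store that keeps everything in dicts."""

    def __init__(self) -> None:
        self.statuses: Dict[Tuple[str, str, str], Task] = {}
        self.products: Dict[Tuple[str, str, str], bytes] = {}

    def get_status(self, src_id: str, chk: str, out_fmt: str) -> Task:
        try:
            return self.statuses[(src_id, chk, out_fmt)]
        except KeyError as exc:
            raise DoesNotExist("No status for these parameters") from exc

    def retrieve(self, src_id: str, chk: str, out_fmt: str) -> bytes:
        try:
            return self.products[(src_id, chk, out_fmt)]
        except KeyError as exc:
            raise DoesNotExist("No product for these parameters") from exc


def fetch_if_compiled(store: InMemoryStore, src_id: str, chk: str,
                      out_fmt: str) -> Optional[bytes]:
    """Return the product only if a completed compilation exists for chk."""
    try:
        task = store.get_status(src_id, chk, out_fmt)
    except DoesNotExist:
        return None  # No status for this checksum, so skip the retrieval.
    if not task.completed:
        return None  # A compilation is known but has not finished.
    return store.retrieve(src_id, chk, out_fmt)


# Usage with illustrative values.
store = InMemoryStore()
key = ("1234", "abc123", "pdf")
store.statuses[key] = Task(*key, completed=True)
store.products[key] = b"%PDF-1.5"
print(fetch_if_compiled(store, *key))                   # b'%PDF-1.5'
print(fetch_if_compiled(store, "1234", "stale", "pdf"))  # None
```

Because the status lookup is the only step that depends on the checksum, the retrieval is attempted only once the caller already knows that a product should exist.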

exception compiler.services.store.DoesNotExist[source]

Bases: RuntimeError

The requested content does not exist.

class compiler.services.store.Store(buckets, verify=False, region_name=None, endpoint_url=None, aws_access_key_id=None, aws_secret_access_key=None)[source]

Bases: object

Represents an object store session.

KEY = '{src_id}/{chk}/{out_fmt}/{src_id}.{ext}'
LOG_KEY = '{src_id}/{chk}/{out_fmt}/{src_id}.{ext}.log'
STATUS_KEY = '{src_id}/{chk}/{out_fmt}/status.json'
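
As a rough illustration of how these templates map to object keys, the sketch below fills them with str.format. It assumes that the placeholders are filled this way and that ext is the file extension of the output format; the parameter values are hypothetical.

```python
# Templates copied from the Store class attributes.
KEY = '{src_id}/{chk}/{out_fmt}/{src_id}.{ext}'
LOG_KEY = '{src_id}/{chk}/{out_fmt}/{src_id}.{ext}.log'
STATUS_KEY = '{src_id}/{chk}/{out_fmt}/status.json'

# Hypothetical values, chosen only for illustration.
params = {'src_id': '1234', 'chk': 'abc123', 'out_fmt': 'pdf', 'ext': 'pdf'}

product_key = KEY.format(**params)
log_key = LOG_KEY.format(**params)
# STATUS_KEY has no {ext} placeholder; str.format ignores the unused keyword.
status_key = STATUS_KEY.format(**params)

print(product_key)  # 1234/abc123/pdf/1234.pdf
print(log_key)      # 1234/abc123/pdf/1234.pdf.log
print(status_key)   # 1234/abc123/pdf/status.json
```

All three keys share the same src_id/chk/out_fmt prefix, so the product, its log, and its status sit side by side for a given source checksum and format.
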
create_bucket()[source]

Create S3 buckets. This is just for testing.

Return type

None

classmethod current_session()[source]

Get the current store session for this application.

Return type

Store

classmethod get_session()[source]

Create a new botocore.client.S3 session.

Return type

Store

get_status(src_id, chk, out_fmt, bucket='arxiv')[source]

Get the status of a compilation.

Parameters
  • src_id (str) – The unique identifier of the source package.

  • chk (str) – Base64-encoded MD5 hash of the source package.

  • out_fmt (str) – Compilation format. See Format.

  • bucket (str) – Default is 'arxiv'.

Return type

Task

Raises

DoesNotExist – Raised if no status exists for the provided parameters.

classmethod init_app(app)[source]

Set defaults for required configuration parameters.

Return type

None

is_available()[source]

Check whether we can write to the S3 buckets.

Return type

bool

retrieve(src_id, chk, out_fmt, bucket='arxiv')[source]

Retrieve a compilation product.

Parameters
  • src_id (str) – The unique identifier of the source package.

  • chk (str) – Base64-encoded MD5 hash of the source package.

  • out_fmt (enum) – One of Format.

  • bucket (str) – Default is 'arxiv'. Used in conjunction with buckets to determine the S3 bucket from which the content should be retrieved.

Return type

Product

retrieve_log(src_id, chk, out_fmt, bucket='arxiv')[source]

Retrieve a compilation log.

Parameters
  • src_id (str) – The unique identifier of the source package.

  • chk (str) – Base64-encoded MD5 hash of the source package.

  • out_fmt (enum) – One of Format.

  • bucket (str) – Default is 'arxiv'. Used in conjunction with buckets to determine the S3 bucket from which the content should be retrieved.

Return type

Product

set_status(task, bucket='arxiv')[source]

Update the status of a compilation.

Parameters
  • task (Task) – The task whose status is being updated.

  • bucket (str) – Default is 'arxiv'.

Return type

None

store(product, bucket='arxiv')[source]

Store a compilation product.

Parameters
  • product (Product) –

  • bucket (str) – Default is 'arxiv'. Used in conjunction with buckets to determine the S3 bucket where this content should be stored.

Return type

None

store_log(product, bucket='arxiv')[source]

Store a compilation log.

Parameters
  • product (Product) – Stream should be log content.

  • bucket (str) – Default is 'arxiv'. Used in conjunction with buckets to determine the S3 bucket where this content should be stored.

Return type

None

compiler.services.store.hash_content(body)[source]

Generate an encoded MD5 hash of a bytes object.

Return type

str
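
A minimal sketch of such a hash, assuming that "encoded" means Base64 as in the description of chk under Store.get_status(). The function name hash_content_sketch is hypothetical, and the actual implementation may differ.

```python
import base64
import hashlib


def hash_content_sketch(body: bytes) -> str:
    """Return the Base64-encoded MD5 digest of body as a str."""
    digest = hashlib.md5(body).digest()  # 16 raw digest bytes
    return base64.b64encode(digest).decode('utf-8')


print(hash_content_sketch(b''))  # 1B2M2Y8AsgTpgAmY7PhCfg==
```

Encoding the raw 16-byte digest, rather than its hexadecimal form, always yields a 24-character string.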