.. _development-practices:

Development Practices
*********************

This section describes development practices used across all arXiv-NG
projects, including code management, application versioning, QA/QC, CI/CD, and
documentation.

Code management
===============

Source code and attendant documentation for each service will be kept under
version control in its own Git repository, hosted on GitHub.

Except where impracticable, new GitHub repositories should be public, and
should include a copy of the MIT license (e.g. see
https://github.com/arxiv/arxiv-zero/blob/master/LICENSE).

Repositories containing code for the classic arXiv system
**must remain private** due to security concerns.

Branch Management
-----------------

We use the `Gitflow branching model `_ to manage concurrent work within each
repository. In brief, each repository has a ``master`` branch that contains
the latest stable version of the application, a ``develop`` branch that
contains additional work not yet released, and feature branches that contain
work on a specific task or story.

Feature branches are named based on the corresponding ticket in the ARXIVNG or
ARXIVDEV JIRA project: ``[type]/[project]-[number]``. For example,
``story/ARXIVDEV-4092``.

JIRA ticket numbers (e.g. ``ARXIVDEV-1234``) should also be included in commit
messages, especially when the ticket number is different from that named in
the feature branch.

Delivering Work
---------------

Feature branches are not merged directly into the develop branch, nor is the
develop branch merged directly into the master branch. Instead, the developer
responsible for delivering the changes in question raises a pull request,
which (except in rare cases) is subject to review by at least one other member
of the dev team.

Pull requests must also pass all automated tests and quality checks; see
:ref:`testing-and-qa`.

Tagging
-------

Application versions are commemorated using `annotated tags `_ on the master
branch.
A tag is applied only after the prospective release has been staged and
verified for deployment. Tags should contain only the version number. See
:ref:`versioning` for details.

.. _versioning:

Versioning
==========

arXiv-NG services are versioned independently, using `semantic versioning `_.
In brief:

- Major versions commemorate incompatible API changes;
- Minor versions commemorate new functionality;
- Patch versions commemorate bug fixes.

Version numbers used in tags are "bare"; i.e. they contain only the version
number itself without any prefixes. For example: ``1.4.3``.

JIRA Releases
-------------

Work tickets are added to a release in JIRA using the "Fix/Version" field.
JIRA releases are labeled with both the service slug and the semantic version.
For example, JIRA releases for the ``arxiv-fulltext`` service are labeled with
``fulltext-MAJOR.MINOR.PATCH``.

Versioning arXiv-NG as a whole
------------------------------

We may decide to version the entire arXiv-NG system. This will implement some
kind of romantic versioning scheme.

.. todo:: We should decide how we want to do this.

.. _releases:

Release process
---------------

1. At a sprint meeting, the arXiv team decides what constitutes a versionable
   release. Those tickets are added to a JIRA release using the
   ``Fix/Version`` metadata field.
2. When all of the tickets in the release are ``Done``, a PR is raised from
   the develop branch to the master branch.
3. In addition to automated tests, the release candidate is staged in the
   ``staging`` namespace of the Kubernetes cluster:

   - Travis-CI automatically builds Docker image(s) for the service, tags them
     with ``develop``, and pushes them to the Docker registry (ECR).
   - Travis-CI patches the Kubernetes/Helm deployment(s) in the ``staging``
     namespace with the new Docker image(s).
   - Automated and manual tests are performed.
   - If tests fail, additional commits are added to the PR. This process is
     repeated until all tests pass.

4. When all tests pass:

   - The PR is merged.
   - An annotated tag is added to the merge commit on the master branch (bare
     version number only).
   - The JIRA version is "released", and the release notes are added to the
     tag/release comments on GitHub.
   - Upon pushing the tag on the master branch, Travis-CI builds the Docker
     image(s) for the service, adds version tags to the image(s) (``M.m.p``,
     ``M.m``, ``M``, and ``latest``), pushes the images to the Docker
     repository, and patches the Kubernetes/Helm deployment in the production
     namespace via the Kubernetes API.

5. The production deployment is verified with automated and/or manual tests.
   In the regrettable (and hopefully rare) case that a production deployment
   fails, it is rolled back via the Kubernetes API.

.. _testing-and-qa:

Testing & QA
============

This section describes testing practices and procedures to be implemented
across all arXiv-NG projects.

.. _unit-tests:

Unit tests
----------

Unit, integration, and end-to-end tests should be written using the built-in
`unittest module `_.

`Nose2 `_ is the preferred test runner. `Coverage `_ should also be installed
to check test coverage.

Nose2 should automatically discover your unit tests. From the root of the
project repository, run:

.. code-block:: bash

    nose2 --with-coverage

The minimum test coverage target is 90%. Test coverage should not decrease by
more than 5% in a given PR.

.. _integration-tests:

Integration tests
-----------------

Integration tests should be used to test :ref:`service modules `. Ideally,
these tests will use the "real" service with which the module integrates.

For integration tests involving AWS resources, the `localstack `_ project is
invaluable. To test integrations with other arXiv-NG services, the latest
Docker image for that service can be pulled from the Docker registry (ECR).

For an example of an integration test that uses localstack via Docker, see
`this test module `_.

.. todo:: Consider how we can leverage Swagger with Flex for testing
   integrations with other arXiv-NG services.

End-to-end tests
----------------

End-to-end tests should make use of `Docker Compose `_. If the
subsystem/project involves multiple constituent services, the docker-compose
configuration should build and start all of those constituents, and pull in
images for any additional services needed for integrations. See also
`this example docker-compose.yml `_.

A separate image (e.g. defined by a separate dockerfile, Dockerfile-tests) can
then be used to effect the e2e tests, for example by exercising service APIs
or generating notifications. See `this example Dockerfile `_.

.. _typing:

Static analysis & type annotations
----------------------------------

We use `type hint annotations `_ throughout the codebase. Although Python is
not a statically typed language, type hints emerged in Python 3 as a mechanism
for documenting code (specifically, function behavior) and to introduce some
of the benefits of static typing -- specifically, the ability to analyze code
for programming errors without having to actually execute that code. While
this does not obviate the need for comprehensive unit tests, it does provide
another layer of quality assurance to supplement those tests.

Type hints may be especially useful when defining the `core data domain ` of a
service, in cases where full-fledged classes would be overkill.

We use `mypy `_ for static analysis. Ideally mypy will pass without any
errors.

Use judgment when excluding code from type checking by mypy. mypy chokes on
dynamic base classes and proxy objects (which you're likely to encounter using
Flask); it's perfectly fine to disable checking on those offending lines using
"``# type: ignore``". For example:

.. code-block:: python

    g.baz = get_session(app)  # type: ignore

See `this issue `_ for more information.

A ``mypy.ini`` file may be included in the root of the repository.
See for example: https://github.com/arxiv/arxiv-zero/blob/develop/mypy.ini

.. _code-quality-linting:

Code quality & linting
----------------------

All new code should adhere as closely as possible to PEP 8.

Use the `Numpy style `_ for docstrings.

Use `Pylint `_ to check your code prior to raising a pull request. The
parameters below will be used when checking code cleanliness on commits, PRs,
and tags, with a target score of >= 9/10.

If you're using Atom as your text editor, consider using the
`linter-pylama `_ package for real-time feedback.

We currently ignore the following flags (subject to change):

- W0622: Redefining built-in %r
- W0611: Unused import %s
- F0401: Unable to import %s
- R0914: Too many local variables (%s/%s)
- W0221: Arguments number differs from %s method
- W0222: Signature differs from %s method
- W0142: Used * or ** magic
- F0010: error while code parsing: %s
- W0703: Catching too general exception %s
- R0911: Too many return statements (%s/%s)
- C0103: Invalid %s name "%s"
- R0913: Too many arguments (%s/%s)

.. code-block:: bash

    $ pylint --disable=W0622,W0611,F0401,R0914,W0221,W0222,W0142,F0010,W0703,R0911,C0103,R0913 -f parseable zero
    No config file found, using default configuration
    ************* Module zero.context
    zero/context.py:10: [W0212(protected-access), get_application_config] Access to a protected member _Environ of a client class
    ************* Module zero.encode
    zero/encode.py:11: [E0202(method-hidden), ISO8601JSONEncoder.default] An attribute defined in json.encoder line 158 hides this method
    ************* Module zero.controllers.baz
    zero/controllers/baz.py:1: [C0102(blacklisted-name), ] Black listed name "baz"
    ************* Module zero.services.baz
    zero/services/baz.py:1: [C0102(blacklisted-name), ] Black listed name "baz"
    ************* Module zero.services.things
    zero/services/things.py:11: [R0903(too-few-public-methods), Thing] Too few public methods (0/2)
    zero/services/things.py:49: [E1101(no-member), get_a_thing] Instance of 'scoped_session' has no 'query' member

    ------------------------------------------------------------------
    Your code has been rated at 9.49/10 (previous run: 9.41/10, +0.07)

.. _travis-ci-cd:

Continuous Integration/Continuous Delivery
------------------------------------------

We use `Travis-CI `_ to perform automated tests and to automatically build and
deploy services in staging and production. Each project contains a
``.travis.yml`` configuration file that describes the build process.

For example, see: https://github.com/arxiv/arxiv-zero/blob/master/.travis.yml

.. todo:: Add build/deploy to arxiv-zero travis config.

Travis may also trigger pylint and mypy checks, using a script like
`this one `_.

.. todo:: Better documentation for the linstats.sh example, including config
   params that need to be set in Travis-CI.

Travis reports the success or failure of the build process to GitHub, for use
in pull requests.
Upon completion, Travis triggers test coverage analysis by `Coveralls `_,
which evaluates test coverage targets and reports the result to GitHub.

When a PR is raised from develop to master, Travis builds and pushes the
Docker image(s) for the service, and deploys to staging using the Kubernetes
API.

When a tag is pushed on the master branch, Travis builds and pushes the Docker
image(s) for the service, and deploys to production using the Kubernetes API.

See also :ref:`releases`.

Documentation
=============

Most documentation (including this document) is written using
`reStructuredText `_ markup, which we build to HTML and/or PDF with
`Sphinx `_.

Documentation for each project/service should be stored in a ``docs`` folder
in the repository root.

.. _architectural-documentation:

Architectural documentation
---------------------------

Each service/project should include an ``architecture.rst`` file that
describes what the service does, how it's built, and any significant technical
decisions that have been made in the course of its development.

This architecture documentation is based on the
`arc42 documentation model `_, and also draws heavily on the
`C4 software architecture model `_. The C4 model describes an architecture at
four hierarchical levels, from the business context of the system to the
internal architecture of small parts of the system.

For example, see:
https://github.com/arxiv/arxiv-zero/blob/master/docs/source/architecture.rst

In documentation for arXiv-NG services, we have departed slightly from the
original language of C4 in order to avoid collision with names in adjacent
domains. Specifically, we describe the system at three levels:

- Context: This includes both the business and technical contexts in the arc42
  model. It describes the interactions between a service and other services
  and systems.
- Building block: This is similar to the "container" concept in the C4 model.
  A building block is a part of the system that is developed, tested, and
  deployed quasi-independently. This might be a single application, or a data
  store.
- Component: A component is an internal part of a building block. In the case
  of a Flask application, this might be a module or submodule that has
  specific responsibilities, behaviors, and interactions.

.. _code-api-documentation:

Code API documentation
----------------------

Documentation for the (code) API is generated automatically with
`sphinx-apidoc `_, and lives in ``docs/source/api``.

sphinx-apidoc generates references to modules in the code, which are followed
at build time to retrieve docstrings and other details. This means that you
won't need to run sphinx-apidoc unless the structure of the project changes
(e.g. you add/rename a module).

To rebuild the API docs, run (from the project root):

.. code-block:: bash

    sphinx-apidoc -M -f -o docs/source/api/ foo

Docstrings should be written in the `Numpy style `_.

.. _rest-api-documentation:

REST API documentation
----------------------

Both internal and external APIs should be documented using the `OpenAPI `_
specification (aka Swagger). A separate API description should be provided for
the internal and external APIs.

In addition, `JSON Schema `_ should be provided for each endpoint and
referenced from the OpenAPI/Swagger description. These documents should
describe both response and (as appropriate) request payloads.

API documentation will also be aggregated across subsystems for inclusion in
the API consumer portal. See :ref:`schema`.
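
As an illustration of the kind of JSON Schema that might accompany an
endpoint, the sketch below defines a schema for a hypothetical response
payload and checks a document against it. The endpoint, the field names
(``paper_id``, ``version``, ``title``), and the ``check_payload`` helper are
assumptions made for this example and are not part of any arXiv-NG service.
The helper is hand-rolled and covers only the ``required`` and ``type``
keywords for top-level properties; a real project would more likely rely on a
full JSON Schema validator library.

.. code-block:: python

    """Minimal sketch: check a payload against a small subset of JSON Schema."""

    import json
    from typing import Any, Dict, List

    # Hypothetical schema for the response of a hypothetical endpoint.
    RESPONSE_SCHEMA: Dict[str, Any] = json.loads("""
    {
      "title": "Example response",
      "type": "object",
      "required": ["paper_id", "version"],
      "properties": {
        "paper_id": {"type": "string"},
        "version": {"type": "integer"},
        "title": {"type": "string"}
      }
    }
    """)

    # Mapping from JSON Schema type names to Python types (subset only).
    JSON_TYPES = {
        "string": str,
        "integer": int,
        "number": (int, float),
        "boolean": bool,
        "object": dict,
        "array": list,
    }


    def check_payload(payload: Dict[str, Any],
                      schema: Dict[str, Any]) -> List[str]:
        """Return a list of problems; an empty list means the payload passed."""
        problems: List[str] = []
        # Every property named in "required" must be present.
        for name in schema.get("required", []):
            if name not in payload:
                problems.append(f"missing required property: {name}")
        # Properties that are present must match their declared type.
        for name, rule in schema.get("properties", {}).items():
            if name not in payload or "type" not in rule:
                continue
            expected = JSON_TYPES[rule["type"]]
            value = payload[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            bool_misuse = isinstance(value, bool) and rule["type"] != "boolean"
            if bool_misuse or not isinstance(value, expected):
                problems.append(
                    f"property {name} should be of type {rule['type']}"
                )
        return problems

With this sketch, ``check_payload({"version": "2"}, RESPONSE_SCHEMA)`` reports
both the missing ``paper_id`` and the wrongly typed ``version``, while a
payload such as ``{"paper_id": "1234.5678", "version": 2}`` yields an empty
list.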