.. _development-practices:

Development Practices
*********************

This section describes development practices used across all arXiv-NG projects,
including code management, application versioning, QA/QC, CI/CD, and
documentation.

Code management
===============
Source code and attendant documentation for each service will be kept under
version control in its own Git repository, hosted on GitHub. Except where
impracticable, new GitHub repositories should be public, and should include a
copy of the MIT license (e.g. see
https://github.com/arxiv/arxiv-zero/blob/master/LICENSE). Repositories
containing code for the classic arXiv system **must remain private** due to
security concerns.

Branch Management
-----------------
We use the `Gitflow branching model
<https://www.atlassian.com/git/tutorials/comparing-workflows#gitflow-workflow>`_
to manage concurrent work within each repository. In brief, each repository
has a ``master`` branch that contains the latest stable version of the
application, a ``develop`` branch that contains additional work not yet
released, and feature branches that contain work on a specific task or story.

Feature branches are named based on the corresponding ticket in the ARXIVNG
or ARXIVDEV JIRA project: ``[type]/[project]-[number]``. For example,
``story/ARXIVDEV-4092``. JIRA ticket numbers (e.g. ``ARXIVDEV-1234``) should
also be included in commit messages, especially when the ticket number is
different from that named in the feature branch.
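
The naming convention above can be checked mechanically, for example in a
pre-push hook. The following is only a sketch: the function name
``parse_feature_branch`` is hypothetical, and the pattern assumes that the
``[type]`` part is a lowercase word, which the convention itself does not
specify.

.. code-block:: python

   import re
   from typing import Optional, Tuple

   # Assumption: [type] is a lowercase word; [project] is one of the two
   # JIRA projects named above; [number] is the numeric part of the ticket.
   BRANCH_PATTERN = re.compile(
       r"^(?P<type>[a-z]+)/(?P<project>ARXIVNG|ARXIVDEV)-(?P<number>\d+)$"
   )


   def parse_feature_branch(name: str) -> Optional[Tuple[str, str, int]]:
       """Return (type, project, number) for a feature branch, else None."""
       match = BRANCH_PATTERN.match(name)
       if match is None:
           return None
       return (
           match.group("type"),
           match.group("project"),
           int(match.group("number")),
       )

With this pattern, ``story/ARXIVDEV-4092`` is accepted, while ``develop`` and
``master`` are not treated as feature branches.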

Delivering Work
---------------
Feature branches are not merged directly into the develop branch, nor is the
develop branch merged directly into the master branch. Instead, the developer
responsible for delivering the changes in question raises a pull request, which
(except in rare cases) is subject to review by at least one other member of
the dev team. Pull requests must also pass all automated tests and quality
checks; see :ref:`testing-and-qa`.

Tagging
-------
Application versions are commemorated using `annotated tags
<https://git-scm.com/book/en/v1/Git-Basics-Tagging>`_ on the master branch. A
tag is applied only after the prospective release has been staged and verified
for deployment. Tags should contain only the version number. See
:ref:`versioning` for details.

.. _versioning:

Versioning
==========
arXiv-NG services are versioned independently, using `semantic versioning
<https://semver.org/>`_. In brief:

- Major versions commemorate incompatible API changes;
- Minor versions commemorate new, backwards-compatible functionality;
- Patch versions commemorate backwards-compatible bug fixes.

Version numbers used in tags are "bare"; i.e. they contain only the version
number itself without any prefixes. For example: ``1.4.3``.
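
As an illustration of these rules, the sketch below parses a bare version
number and computes the next version for each kind of change. The ``Version``
class and its ``parse`` and ``bump`` methods are hypothetical helpers written
for this example, not part of any arXiv-NG package.

.. code-block:: python

   import re
   from typing import NamedTuple

   # Bare version numbers only: three dot-separated integers, no prefix.
   BARE_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)")


   class Version(NamedTuple):
       """A semantic version as used in tags, e.g. 1.4.3."""

       major: int
       minor: int
       patch: int

       @classmethod
       def parse(cls, tag: str) -> "Version":
           """Parse a bare version number; reject prefixes such as 'v'."""
           match = BARE_VERSION.fullmatch(tag)
           if match is None:
               raise ValueError(f"not a bare version number: {tag!r}")
           return cls(*(int(part) for part in match.groups()))

       def bump(self, change: str) -> "Version":
           """Return the next version for a major, minor, or patch change."""
           if change == "major":
               return Version(self.major + 1, 0, 0)
           if change == "minor":
               return Version(self.major, self.minor + 1, 0)
           if change == "patch":
               return Version(self.major, self.minor, self.patch + 1)
           raise ValueError(f"unknown change type: {change!r}")

       def __str__(self) -> str:
           return f"{self.major}.{self.minor}.{self.patch}"

For example, ``str(Version.parse("1.4.3").bump("minor"))`` gives ``1.5.0``,
and ``Version.parse("v1.4.3")`` raises ``ValueError`` because of the prefix.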

JIRA Releases
-------------
Work tickets are added to a release in JIRA using the "Fix/Version" field. JIRA
releases are labeled with both the service slug and the semantic version.
For example, JIRA releases for the ``arxiv-fulltext`` service are labeled with
``fulltext-MAJOR.MINOR.PATCH``.

Versioning arXiv-NG as a whole
------------------------------
We may decide to version the entire arXiv-NG system. This will implement some
kind of romantic versioning scheme.

.. todo::

   We should decide how we want to do this.


.. _releases:

Release process
---------------
1. At a sprint meeting, the arXiv team decides what constitutes a versionable
   release. Those tickets are added to a JIRA release using the ``Fix/Version``
   metadata field.
2. When all of the tickets in the release are ``Done``, a PR is raised from
   the develop branch to the master branch.
3. In addition to automated tests, the release candidate is staged in the
   ``staging`` namespace of the Kubernetes cluster:

   - Travis-CI automatically builds Docker image(s) for the service, tags them
     with ``develop``, and pushes them to the Docker registry (ECR).
   - Travis-CI patches the Kubernetes/Helm deployment(s) in the ``staging``
     namespace with the new Docker image(s).
   - Automated and manual tests are performed.
   - If tests fail, additional commits are added to the PR. This process is
     repeated until all tests pass.

4. When all tests pass:

   - The PR is merged.
   - An annotated tag is added to the merge commit on the master branch (bare
     version number only).
   - The JIRA version is "released", and the release notes are added to the
     tag/release comments on GitHub.
   - Upon pushing the tag on the master branch, Travis-CI builds the Docker
     image(s) for the service, adds version tags to the image(s) (``M.m.p``,
     ``M.m``, ``M``, and ``latest``), pushes the images to the Docker
     registry (ECR), and patches the Kubernetes/Helm deployment in the production
     namespace via the Kubernetes API.

5. The production deployment is verified with automated and/or manual tests. In
   the regrettable (and hopefully rare) case that a production deployment
   fails, it is rolled back via the Kubernetes API.
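
The image tags applied in step 4 follow directly from the version number. As a
rough sketch (the function name ``image_tags`` is hypothetical, and the real
build is driven by the Travis-CI configuration rather than by this code):

.. code-block:: python

   from typing import List


   def image_tags(version: str) -> List[str]:
       """Return the Docker image tags for a release, most specific first."""
       # Expects a bare version number such as "1.4.3"; unpacking raises
       # ValueError if there are not exactly three dot-separated parts.
       major, minor, patch = version.split(".")
       return [f"{major}.{minor}.{patch}", f"{major}.{minor}", major, "latest"]

For release ``1.4.3`` this yields the tags ``1.4.3``, ``1.4``, ``1``, and
``latest``, so consumers can pin to whichever level of specificity they need.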


.. _testing-and-qa:

Testing & QA
============
This section describes testing practices and procedures to be implemented
across all arXiv-NG projects.

.. _unit-tests:

Unit tests
----------
Unit, integration, and end-to-end tests should be written using the built-in
`unittest module <https://docs.python.org/3/library/unittest.html>`_.
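
A minimal test module might look like the following sketch. The ``slugify``
function is a hypothetical stand-in for the code under test; in a real project
it would be imported from the application package rather than defined next to
the tests.

.. code-block:: python

   import unittest


   def slugify(title: str) -> str:
       """Hypothetical function under test: lowercase and hyphenate a title."""
       return "-".join(title.lower().split())


   class TestSlugify(unittest.TestCase):
       """Tests for the hypothetical slugify function."""

       def test_collapses_whitespace(self):
           """Runs of whitespace become a single hyphen."""
           self.assertEqual(
               slugify("Development   Practices"), "development-practices"
           )

       def test_empty_title(self):
           """An empty title yields an empty slug."""
           self.assertEqual(slugify(""), "")

Because the class subclasses ``unittest.TestCase`` and its methods start with
``test``, a unittest-compatible runner can discover it without extra
configuration.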

`Nose2 <http://nose2.readthedocs.io/en/latest/>`_ is the preferred test runner.
`Coverage <http://coverage.readthedocs.io/en/latest/>`_ should also be
installed to check test coverage. Nose2 should automatically discover your
unit tests. From the root of the project repository, run:

.. code-block:: bash

   nose2 --with-coverage

The minimum test coverage target is 90%. Test coverage should not decrease by
more than 5% in a given PR.
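
The two coverage rules can be expressed as a single check. This sketch is only
illustrative: the function name and parameters are hypothetical, it treats the
90% target as a hard floor, and it assumes that "5%" means five percentage
points, which the rule above does not spell out.

.. code-block:: python

   def coverage_acceptable(previous: float, current: float,
                           minimum: float = 90.0,
                           max_drop: float = 5.0) -> bool:
       """Check coverage (in percent) before and after a PR."""
       # Assumption: max_drop is measured in percentage points.
       meets_minimum = current >= minimum
       within_allowed_drop = (previous - current) <= max_drop
       return meets_minimum and within_allowed_drop

Under these assumptions, a PR that moves coverage from 97% to 93% passes, while
one that moves it from 97% to 91% does not, even though 91% is above the
minimum.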

.. _integration-tests:

Integration tests
-----------------
Integration tests should be used to test :ref:`service modules
<service-modules>`. Ideally, these tests will use the "real" service with
which the module integrates. For integration tests involving AWS resources,
the `localstack <https://github.com/localstack/localstack>`_ project is
invaluable. To test integrations with other arXiv-NG services, the latest
Docker image for that service can be pulled from the Docker registry (ECR).

For an example of an integration test that uses localstack via Docker, see
`this test module <https://github.com/arxiv/arxiv-references/blob/61b918cb71a99897b79ac1b3273bc5b246f5c3b2/tests/integration/test_datastore_integrations.py>`_.

.. todo::

   Consider how we can leverage Swagger with Flex for testing integrations
   with other arXiv-NG services.


End-to-end tests
----------------
End-to-end tests should make use of `Docker Compose
<https://docs.docker.com/compose/>`_.

If the subsystem/project involves multiple constituent services, the
docker-compose configuration should build and start all of those constituents,
and pull in images for any additional services needed for integrations.
See also `this example docker-compose.yml
<https://github.com/arxiv/arxiv-references/blob/61b918cb71a99897b79ac1b3273bc5b246f5c3b2/docker-compose.yml>`_.

A separate image (e.g. defined by a separate Dockerfile, ``Dockerfile-test``)
can then be used to run the e2e tests, for example by exercising service
APIs or generating notifications. See `this example Dockerfile
<https://github.com/arxiv/arxiv-references/blob/61b918cb71a99897b79ac1b3273bc5b246f5c3b2/Dockerfile-test>`_.


.. _typing:

Static analysis & type annotations
----------------------------------
We use `type hint annotations <https://docs.python.org/3/library/typing.html>`_
throughout the codebase. Although Python is not a statically typed language,
type hints emerged in Python 3 as a mechanism for documenting code
(specifically, function behavior) and to introduce some of the benefits of
static typing -- specifically, the ability to analyze code for programming
errors without having to actually execute that code. While this does not
obviate the need for comprehensive unit tests, it does provide another layer
of quality assurance to supplement those tests.

Type hints may be especially useful when defining the :ref:`core data domain
<data-domain-modules>` of a service, in cases where full-fledged classes would
be overkill.
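
For instance, a lightweight domain object can be declared with
``typing.NamedTuple``. The ``Reference`` type and ``describe`` function below
are hypothetical and exist only to illustrate the idea.

.. code-block:: python

   from typing import List, NamedTuple, Optional


   class Reference(NamedTuple):
       """Hypothetical domain object: a cited reference from a paper."""

       title: str
       authors: List[str]
       year: Optional[int] = None


   def describe(reference: Reference) -> str:
       """Render a short, human-readable citation string."""
       year = str(reference.year) if reference.year is not None else "n.d."
       return f"{', '.join(reference.authors)} ({year}): {reference.title}"

Given these annotations, a static checker can flag a call such as
``Reference(title=42, authors=[])`` without running the code, because ``42`` is
not a ``str``.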

We use `mypy <http://mypy-lang.org/>`_ for static analysis. Ideally, mypy will
pass without any errors, but use your judgment about when to exclude code from
type checking. mypy chokes on dynamic base classes and proxy objects (which
you're likely to encounter when using Flask); it's perfectly fine to disable
checking on the offending lines using ``# type: ignore``. For example:

.. code-block:: python

   g.baz = get_session(app)  # type: ignore


See `this issue <https://github.com/python/mypy/issues/500>`_ for more
information.

A ``mypy.ini`` file may be included in the root of the repository. See for
example: https://github.com/arxiv/arxiv-zero/blob/develop/mypy.ini


.. _code-quality-linting:

Code quality & linting
----------------------
All new code should adhere as closely as possible to `PEP 8
<https://peps.python.org/pep-0008/>`_.

Use the `Numpy style
<https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt>`_ for
docstrings.
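
As a brief illustration of that style, here is a hypothetical function with a
Numpy-style docstring. The function itself is only an example; the section
headings (``Parameters``, ``Returns``, ``Raises``) are the part that matters.

.. code-block:: python

   def mean(values):
       """
       Compute the arithmetic mean of a sequence of numbers.

       Parameters
       ----------
       values : list of float
           The numbers to average. Must not be empty.

       Returns
       -------
       float
           The sum of ``values`` divided by their count.

       Raises
       ------
       ValueError
           If ``values`` is empty.

       """
       if not values:
           raise ValueError("values must not be empty")
       return sum(values) / len(values)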

Use `Pylint <https://www.pylint.org/>`_ to check your code prior to raising a
pull request. The parameters below will be used when checking code cleanliness
on commits, PRs, and tags, with a target score of >= 9/10.

If you're using Atom as your text editor, consider using the `linter-pylama
<https://atom.io/packages/linter-pylama>`_ package for real-time feedback.

We currently ignore the following flags (subject to change):

- W0622: Redefining built-in %r
- W0611: Unused import %s
- F0401: Unable to import %s
- R0914: Too many local variables (%s/%s)
- W0221: Arguments number differs from %s method
- W0222: Signature differs from %s method
- W0142: Used * or ** magic
- F0010: error while code parsing: %s
- W0703: Catching too general exception %s
- R0911: Too many return statements (%s/%s)
- C0103: Invalid %s name "%s"
- R0913: Too many arguments (%s/%s)


.. code-block:: bash

   $ pylint --disable=W0622,W0611,F0401,R0914,W0221,W0222,W0142,F0010,W0703,R0911,C0103,R0913 -f parseable zero
   No config file found, using default configuration
   ************* Module zero.context
   zero/context.py:10: [W0212(protected-access), get_application_config] Access to a protected member _Environ of a client class
   ************* Module zero.encode
   zero/encode.py:11: [E0202(method-hidden), ISO8601JSONEncoder.default] An attribute defined in json.encoder line 158 hides this method
   ************* Module zero.controllers.baz
   zero/controllers/baz.py:1: [C0102(blacklisted-name), ] Black listed name "baz"
   ************* Module zero.services.baz
   zero/services/baz.py:1: [C0102(blacklisted-name), ] Black listed name "baz"
   ************* Module zero.services.things
   zero/services/things.py:11: [R0903(too-few-public-methods), Thing] Too few public methods (0/2)
   zero/services/things.py:49: [E1101(no-member), get_a_thing] Instance of 'scoped_session' has no 'query' member

   ------------------------------------------------------------------
   Your code has been rated at 9.49/10 (previous run: 9.41/10, +0.07)


.. _travis-ci-cd:

Continuous Integration/Continuous Delivery
------------------------------------------
We use `Travis-CI <https://travis-ci.org/>`_ to perform automated tests and to
automatically build and deploy services in staging and production.

Each project contains a ``.travis.yml`` configuration file that describes the
build process. For example, see: https://github.com/arxiv/arxiv-zero/blob/master/.travis.yml

.. todo::

   Add build/deploy to arxiv-zero travis config.

Travis may also trigger pylint and mypy checks, using a script like `this one
<https://github.com/arxiv/arxiv-zero/blob/master/lintstats.sh>`_.

.. todo::

   Better documentation for the lintstats.sh example, including config params
   that need to be set in Travis-CI.

Travis reports the success or failure of the build process to GitHub, for use
in pull requests.

Upon completion, Travis triggers test coverage analysis by `Coveralls
<https://coveralls.io/>`_, which evaluates test coverage targets and reports
the result to GitHub.

When a PR is raised from develop to master, Travis builds and pushes the Docker
image(s) for the service, and deploys to staging using the Kubernetes API.

When a tag is pushed on the master branch, Travis builds and pushes the Docker
image(s) for the service, and deploys to production using the Kubernetes API.

See also :ref:`releases`.

Documentation
=============
Most documentation (including this document) is written using `reStructuredText
<http://www.sphinx-doc.org/en/stable/rest.html>`_ markup, which we build to
HTML and/or PDF with `Sphinx
<http://www.sphinx-doc.org/en/stable/index.html>`_.

Documentation for each project/service should be stored in a ``docs`` folder
in the repository root.

.. _architectural-documentation:

Architectural documentation
---------------------------
Each service/project should include an ``architecture.rst`` file that describes
what the service does, how it's built, and any significant technical decisions
that have been made in the course of its development.

This architecture documentation is based on the `arc42 documentation model
<http://arc42.org/>`_, and also draws heavily on the `C4 software architecture
model <https://www.structurizr.com/help/c4>`_. The C4 model describes an
architecture at four hierarchical levels, from the business context of the
system to the internal architecture of small parts of the system.

For example, see: https://github.com/arxiv/arxiv-zero/blob/master/docs/source/architecture.rst

In documentation for arXiv-NG services, we have departed slightly from the original
language of C4 in order to avoid collision with names in adjacent domains.
Specifically, we describe the system at three levels:

- Context: This includes both the business and technical contexts in the arc42
  model. It describes the interactions between a service and other services and
  systems.
- Building block: This is similar to the "container" concept in the C4 model.
  A building block is a part of the system that is developed, tested, and
  deployed quasi-independently. This might be a single application, or a data
  store.
- Component: A component is an internal part of a building block. In the case
  of a Flask application, this might be a module or submodule that has specific
  responsibilities, behaviors, and interactions.


.. _code-api-documentation:

Code API documentation
----------------------
Documentation for the (code) API is generated automatically with `sphinx-apidoc
<http://www.sphinx-doc.org/en/stable/man/sphinx-apidoc.html>`_, and lives in
``docs/source/api``.

sphinx-apidoc generates references to modules in the code, which are followed
at build time to retrieve docstrings and other details. This means that you
won't need to run sphinx-apidoc unless the structure of the project changes
(e.g. you add/rename a module).

To rebuild the API docs, run (from the project root):

.. code-block:: bash

   sphinx-apidoc -M -f -o docs/source/api/ foo

Docstrings should be written in the `Numpy style
<https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt>`_.


.. _rest-api-documentation:

REST API documentation
----------------------
Both internal and external APIs should be documented using the
`OpenAPI <https://www.openapis.org/>`_ specification (aka Swagger). A separate
API description should be provided for the internal and external APIs.

In addition, `JSON Schema <http://json-schema.org/>`_ should be provided for
each endpoint and referenced from the OpenAPI/Swagger description. These
documents should describe both response and (as appropriate) request payloads.

API documentation will also be aggregated across subsystems for inclusion in
the API consumer portal.

See :ref:`schema`.