Simple Multi-stage Docker Builds

Cynnovative is committed to rapidly testing and deploying research prototypes. As such, we always look for ways to let our data scientists write code which is easily deployable and adheres to good engineering practices.

At Cynnovative, we use Docker for production deployments and model training on GPU-enabled hardware. Rather than moving code into a Docker container when a project is ready to be deployed or requires GPU support, it is better to design a project to run in a Docker container from the get-go. This reduces the headache of “Dockerizing” a project later, ensures that unit tests are fully reproducible, and makes PR reviews easier, since reviewers can just checkout a branch and run any Docker containers themselves. 

Containerizing projects early also provides a straightforward build step with the same success criterion across projects. While a Java project may be in a good state if it compiles, there is no direct analogy in Python. Dockerizing everything, even for development work, provides a great equalizer in this regard: if your image does not build, your PR will not be approved.

When every project is Dockerized, setting up an automation pipeline for each project is simple. For this, we use Jenkins pipelines to check that the Docker build succeeds on every PR and on every change to the master branch.

To make this process seamless for our data scientists, we use a cookiecutter template that initializes a new Python package with a corresponding Dockerfile, Jenkinsfile, and test.sh script which defines a set of tests which must be run for a Docker build to succeed. This template serves as a reference for company best practices, but since test.sh can be customized, developers can define their own set of passing tests for a Docker build. For example, we always write unit tests, but we don’t believe that every Python project must use type annotations or check them using mypy.
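For illustration, a minimal test.sh along these lines might look like the following (the exact tools, and whether mypy appears at all, are up to each project):

#!/bin/bash
set -e

# Every check below must pass for the Docker build to succeed
black --check src tests
flake8 src tests
mypy src  # optional: only for projects that use type annotations
pytest tests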

The result of all this boilerplate is that data scientists can begin work on a new project, and, within a few minutes, have a structure in place to enable unit tests, linting, static type checking, auto-formatting, and a custom Docker image, plus a CI/CD pipeline to build their project and push it to an internal registry if all tests pass.

A key part of this is that tests are run as part of the Docker build process. Below, assuming familiarity with the basics of writing a Dockerfile and building a Docker image, we explain how to write a Dockerfile which ensures the build fails if any tests do not succeed. Then, we demonstrate how we programmatically build these images as part of a Jenkins pipeline. Even if you have not used Jenkins pipelines before, Dockerizing everything and reusing a basic Jenkinsfile template makes it easy to set up a pipeline for a brand new project.

Multi-stage Docker Builds for Seamless Testing of Built Docker Images

In order to ensure that Docker images are only built when all of a project’s tests pass, we must run our test.sh script within a Docker container as a step in our Docker build process. It isn’t sufficient to run our tests in a second Dockerfile or in a running Docker container, since there is no guarantee developers will always build or run that second container before building the main image. More fundamentally, you should always test the artifact you aim to deploy, rather than a separate one. The obvious solution is to copy all test resources into the container, then simply run the tests as one step in the build process:

FROM python:3.7
WORKDIR /my-project

COPY tests tests
COPY src src
COPY requirements.txt setup.py setup.cfg test.sh entrypoint.py ./

RUN pip install .[dev]
RUN ./test.sh

ENTRYPOINT ["python", "entrypoint.py"]

This works when your test suite is small, but it bloats the final image: the deployed container ships with your entire test suite, its data and fixtures, and all of your development dependencies, none of which are needed at runtime.
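You can see this bloat for yourself by inspecting the image’s layers; the tag below is just for illustration:

# Each COPY of test data becomes a permanent layer in the final image
docker build -t my-project:single-stage .
docker history my-project:single-stage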

Thankfully, multi-stage builds provide an easy-to-use syntax for creating lightweight Docker images which avoid this type of bloat. In a multi-stage build, each FROM statement begins a new stage of the build process. A stage can copy artifacts from earlier stages, and only the target stage ends up in the resulting image, so test resources left behind in intermediate stages add no weight to the final image.

In our multi-stage builds, we typically only use three stages: base, test, and build.

In the base stage, we set our parent base image (e.g. some Python 3 image) and copy over our source code and other files required to install our project in the container¹:

# Base layer
FROM python:3.7 as base
WORKDIR /my-project
 
# Build base layer
COPY src src
COPY requirements.txt setup.py entrypoint.py ./
RUN pip install .

Next, we copy over our test suite and any other resources that are required, e.g. a configuration file that may dictate how your testing, linting, and type checking tools are run²:

# Build test layer, which may fail if tests fail
FROM base as test
COPY setup.cfg test.sh ./
COPY tests tests
RUN pip install .[dev]
RUN ./test.sh

At this point, if our tests fail, the entire Docker build process will fail.
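Since a failing RUN step aborts the build with a non-zero exit status, this is easy to check locally; the tag here is illustrative:

# The build stops at RUN ./test.sh if any test fails
docker build -t my-project:latest . || echo "Tests failed; no image was built"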

Lastly, we simply define a build stage which only inherits from the base stage. This ensures that the test suite and config files are not included in the final build:

# Start from the base layer, disregarding the test layer
FROM base as build
ENTRYPOINT ["python", "entrypoint.py"]

We have found this to be the most straightforward way to ensure that tests are run as part of the build process. Even when developing simple projects which are unlikely to fail in a different environment, it is useful to check that our Docker container builds locally, since it ensures that all of our tests run in a consistent environment and that we keep track of our project’s precise dependencies.

Recently, Docker 19.03 introduced BuildKit, which brings a number of changes to the Docker build process. Among them is the ability to detect and skip build stages that the target stage does not depend on. Since our test stage, above, is not a dependency of the build stage, systems with BuildKit enabled may skip it entirely and produce a final image even when the test suite fails.

To overcome this issue, you may choose to make the build stage trivially dependent on the test stage. In the test stage, create a small plain text file, then copy it over as a dependency of the build stage:

# Final step of the test stage: write a marker file once the tests pass
RUN ./test.sh
RUN echo "Tests successful!" > /test_stage_output

# Copying the marker makes build depend on test, so BuildKit cannot skip it
FROM base as build
COPY --from=test /test_stage_output /test_stage_output

We include build.sh and run.sh scripts to control how we build and run our Docker images locally. In practice, instead of making the build stage dependent on the test stage, we just use the --target command line option and make sure that we can build to the test stage before we build the full image:

#!/bin/bash
# Abort immediately if either build fails
set -e

# Build to the test stage first; this runs the full test suite
docker build --target test .
# Only if the tests pass do we build and tag the complete image
docker build -t cynnovative/my-project:latest .
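A corresponding run.sh can be equally small; this sketch assumes the image tag used in build.sh:

#!/bin/bash
set -e

# Run the freshly built image, removing the container when it exits
docker run --rm cynnovative/my-project:latest "$@"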

 

A Jenkins Pipeline for Automatically Testing, Building, and Deploying Docker Images

Even the best tests are useless if they aren’t run automatically upon proposed changes to the master branch of a project. In order to ensure code changes do not break the Docker build, we use Jenkins multibranch pipelines to test the Docker build when a new PR is opened. Defining these pipelines with a Jenkinsfile in the root of the project repository ensures that all project build steps are reproducible and tied to the Git branch in use.

Below is the setup section of our Jenkinsfile. We set the agent to none, which means each stage will define which Docker image it uses. We also set environment variables, in this case IMAGE_NAME and VERSION, which determine how our Docker image will be identified in our internal Docker registry. Defining the image name and version in the environment section allows this Jenkinsfile to be used as a template across projects.

pipeline {
   agent none
   environment {
       IMAGE_NAME = 'my-project'
       VERSION = '1.0.0'
   }
}

In the Jenkins Test stage below, the agent is set to any, as the step only requires Docker to be installed. The withDockerRegistry method points to our internal Docker registry using the global environment variables DOCKER_PROTOCOL and DOCKER_REGISTRY and uses our local-docker-registry Jenkins credentials for authentication. In order to guarantee the tests are run, we pass --target test so that the image is built only up to, and including, the test stage of the Dockerfile.

stage('Test') {
    agent any
    steps {
        script {
            withDockerRegistry([
                url: "${env.DOCKER_PROTOCOL}://${env.DOCKER_REGISTRY}",
                credentialsId: 'local-docker-registry'
            ]) {
                def testImage = docker.build(
                    "${env.DOCKER_REGISTRY}/${IMAGE_NAME}-test:${VERSION}.${env.BUILD_ID}",
                    '--target test -f Dockerfile .'
                )
            }
        }
    }
}

In the Deliver stage, we do not specify a target: this Jenkins stage builds the entire image. The BRANCH_NAME environment variable gives the name of the branch which triggered the pipeline. If the triggering branch is master, the newly built Docker image is pushed to our internal registry so that it may be used on any of our infrastructure. This same logic could also be used for pushing Python packages to a PyPI server or Java artifacts to a Maven repository.

stage('Deliver') {
    agent any
    steps {
        script {
            withDockerRegistry([
                url: "${env.DOCKER_PROTOCOL}://${env.DOCKER_REGISTRY}",
                credentialsId: 'local-docker-registry'
            ]) {
                def buildImage = docker.build(
                    "${env.DOCKER_REGISTRY}/${IMAGE_NAME}:${VERSION}.${env.BUILD_ID}"
                )
                if (env.BRANCH_NAME == 'master') {
                    buildImage.push()
                    buildImage.push('latest')
                }
            }
        }
    }
}
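For reference, the complete Jenkinsfile is simply the setup section shown earlier with these two stages nested inside a stages block:

pipeline {
    agent none
    environment {
        IMAGE_NAME = 'my-project'
        VERSION = '1.0.0'
    }
    stages {
        stage('Test') {
            // ... as defined above ...
        }
        stage('Deliver') {
            // ... as defined above ...
        }
    }
}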

 

¹ In practice, it is useful to COPY over your ‘requirements.txt’ and install its contents before you install your package. This is redundant, but it takes advantage of Docker’s layer caching: subsequent builds are much faster when the only thing that has changed in your project is some source code.
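Concretely, the top of the base stage might be rewritten along these lines:

# Base layer
FROM python:3.7 as base
WORKDIR /my-project

# Install third-party dependencies first so Docker caches this layer
COPY requirements.txt ./
RUN pip install -r requirements.txt

# Changes to source code only invalidate the layers from here down
COPY src src
COPY setup.py entrypoint.py ./
RUN pip install .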

² Our development dependencies are specified in ‘setup.py’ to ensure that anyone working on the project uses the same versions of all of our formatting, testing, and linting tools.