
Quite recently I discovered the power of using multi-stage Docker builds with the --output argument.

This was significant because it enabled me to remove a lot of duplication between my Dockerfile build process and the steps I ran on my build agents, helping towards a 75% time saving on a process where I wanted quicker feedback: the release and pull-request app build.

My process looked something like this:

Pipeline:
- Install third-party libraries
- Build or publish an application
- Lint
- Unit Test
- Generate sourcemaps
- Build and Publish Docker Images

Docker:
- Install third-party libraries
- Build or publish an application
- Copy assets into a runtime stage

Why I had duplication

I needed to run the unit tests on the build agent so that I could take the output files, which included code coverage and test results, and display them in the build results tabs (Coverage / Tests) in Azure DevOps.

I needed to generate sourcemaps on the build agent so that I could upload them to Application Insights to un-minify my front-end source when viewing errors, which was achieved using a build task executing PowerShell.
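As a rough sketch of that upload step (my real pipeline does this with a PowerShell build task; the storage account and container names below are placeholders): Application Insights can un-minify call stacks using source maps stored in a blob container linked to the resource, so the generated .map files just need to be pushed there, for example:

# Hypothetical example: push the generated .map files to the blob container
# that the Application Insights resource reads source maps from
az storage blob upload-batch \
  --account-name mystorageaccount \
  --destination sourcemaps \
  --source ./dist \
  --pattern "*.map"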

Both of the above required me to restore packages and build the applications in the pipeline to have access to the output files… Coincidentally, restoring and building the apps are the slowest steps.

I also needed to build Docker images that could be used to run my apps. However, the Docker images also needed to restore npm or NuGet packages and build a client-side app, or publish a .NET app.

I could have just done the build in the pipeline and copied the build assets into the Dockerfile, but this felt wrong; I wanted a Dockerfile that was self-contained and could be run locally with identical results to my build pipeline.

Introducing Multi-Stage Docker Builds

I introduced multi-stage Docker builds so that I could export the assets I needed. I added an export stage, which came after my build and run stages.

It used a scratch Docker image, which is a special, empty image. When you use --output, Docker outputs ALL of the files in the target stage; if that stage used something like a Linux base image, it would output all of the Linux files alongside my sourcemap files, which would be slow and unnecessary.
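As a rough illustration of the difference (using the stage names from the Dockerfile shown below and hypothetical output directories):

# Requires BuildKit, the default builder in current Docker versions.
# Exporting a stage based on a full image dumps that stage's entire filesystem:
docker build -f myapp.dockerfile --target build --output ./everything .

# Exporting the scratch-based stage writes out only the files copied into it:
docker build -f myapp.dockerfile --target export --output ./maps .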

So what does my Dockerfile look like now, and how does it work?

# 1. Build the app
FROM node:16.17-alpine as build
WORKDIR /usr/src/app

COPY package.json package-lock.json ./
RUN npm ci
COPY . .

RUN npm run lint
RUN npm run buildProd

# 2. Run the app
FROM nginx:1.23-alpine-slim as run
COPY --from=build /usr/src/app/dist /usr/share/nginx/html

# 3. Export sourcemaps
FROM scratch as export
COPY --from=build /usr/src/app/dist/*.map /maps/

This restores npm packages and builds the app in the build stage, and then there are two stages that depend on it: run and export. If I want to run the app, I build the run stage; if I want to export my sourcemap files, I build the export stage. In my scenario I want both, but won't this be slow because it will need to run the build stage twice?

It doesn’t, because I make use of Docker layer caching, which you’ll see in my Azure DevOps pipeline tasks below:

 1  - task: Docker@2
 2    displayName: "Build Runtime Docker Image"
 3    inputs:
 4      command: build
 5      containerRegistry: "my-registry"
 6      repository: "hello-world"
 7      tags: $(Build.BuildNumber)
 8      dockerfile: myapp.dockerfile
 9      buildContext: .
10      arguments: --target run
11  - task: Docker@2
12    displayName: "Export Sourcemaps"
13    inputs:
14      command: build
15      containerRegistry: "my-registry"
16      repository: "hello-world"
17      dockerfile: myapp.dockerfile
18      buildContext: .
19      arguments: >
20          --cache-from=hello-world:
21          --target export
22          --output my-output-maps-path

The important lines to note are: on line 7 I tag the image, and that tag is what I use as the --cache-from source on line 20, which ensures the app isn’t restored and built a second time.

On lines 10 and 21 I set the Docker --target, which tells Docker which stage I would like to build.

On line 22 I set the --output argument to a directory on my build agent, and the files from the export stage are written to that directory.
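The same flow can be reproduced locally, which was the other thing I wanted from a self-contained Dockerfile. A minimal sketch, assuming a hypothetical local tag and output path (the second build reuses the build stage from the local layer cache, so nothing is rebuilt):

# Build and tag the runtime image from the "run" stage
docker build -f myapp.dockerfile --target run -t hello-world:local .

# Export only the sourcemaps from the "export" stage into ./maps
docker build -f myapp.dockerfile --target export --output ./maps .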

Making this change, combined with running parts of my pipeline in parallel, cut the build time down from 30 minutes to around 8 minutes.