This is my dilemma:
- Implemented layer caching in our Azure DevOps Pipelines using multi-stage Dockerfiles (e.g., development:latest is a dependency of the unit-test and integration-test stages).
- To get this to work properly, I had to add image tags to images that are a dependency of another stage; otherwise, layer caching would ignore them and rebuild the stage. ADO won't let you build an image without a tag, and if you don't provide one it defaults to $(Build.BuildId).
- Adding the tags is now breaking the local dev environments that use devspace: because the development stage now has a dependency like COPY --from=builder-base:latest, devspace treats builder-base:latest as an external image (it no longer matches a stage name) and tries to pull it from a remote repo instead of just building it locally. Changing it to COPY --from=builder-base fixes that issue, but then layer caching quits working.
Here is my Dockerfile:
...
# creating a python base with shared environment variables
FROM python:3.9-slim as python-base
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
POETRY_HOME="/opt/poetry" \
POETRY_VIRTUALENVS_IN_PROJECT=true \
POETRY_NO_INTERACTION=1 \
PYSETUP_PATH="/opt/pysetup" \
VENV_PATH="/opt/pysetup/.venv"
ENV PATH="$POETRY_HOME/bin:$VENV_PATH/bin:$PATH"
# triggering a recache in the pipelines
# builder-base is used to build dependencies
FROM python-base as builder-base
RUN apt-get update \
&& apt-get install --no-install-recommends -y \
curl \
build-essential
# Install Poetry - respects $POETRY_VERSION & $POETRY_HOME
ENV POETRY_VERSION=1.4.1 GET_POETRY_IGNORE_DEPRECATION=1
RUN curl -sSL https://install.python-poetry.org | python3 -
# We copy our Python requirements here to cache them
# and install only runtime deps using poetry
WORKDIR $PYSETUP_PATH
COPY ./poetry.lock ./pyproject.toml ./
RUN poetry install --no-dev
# 'development' stage installs all dev deps and can be used to develop code.
# For example using docker-compose to mount local volume under /app
FROM python-base as development
# Copying poetry and venv into image
COPY --from=builder-base:latest $POETRY_HOME $POETRY_HOME
COPY --from=builder-base:latest $PYSETUP_PATH $PYSETUP_PATH
# Copying in our entrypoint
# COPY ./docker/docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x /opt/pysetup/.venv/bin/activate
# venv already has runtime deps installed we get a quicker install
WORKDIR $PYSETUP_PATH
RUN poetry install
WORKDIR /app
COPY . .
EXPOSE 5000 5672
CMD [ "python", "src/manage.py", "runserver", "0.0.0.0:5000"]
# 'unit-tests' stage runs our unit tests with unittest and coverage.
FROM development:latest AS unit-tests
ENV CI=True
RUN coverage run --omit='src/manage.py,src/config/*,*/.venv/*,*/*__init__.py,*/tests.py,*/admin.py' src/manage.py test src --tag=ut && \
coverage report
# 'integration-tests' stage runs our integration tests with unittest and coverage.
FROM development:latest AS integration-tests
ENV CI=True
RUN coverage run --omit='src/manage.py,src/config/*,*/.venv/*,*/*__init__.py,*/tests.py,*/admin.py' src/manage.py test src --tag=it && \
coverage report
# 'production' stage uses the clean 'python-base' stage and copies
# in only our runtime deps that were installed in the 'builder-base'
FROM python-base as production
COPY --from=builder-base:latest $VENV_PATH $VENV_PATH
RUN chmod +x /opt/pysetup/.venv/bin/activate
COPY ./src /app
WORKDIR /app
CMD ["gunicorn", "-b", ":5000", "--log-level", "info", "config.wsgi:application", "-t", "600", "-w", "4"]
In particular these lines...
...
# Copying poetry and venv into image
COPY --from=builder-base:latest $POETRY_HOME $POETRY_HOME
COPY --from=builder-base:latest $PYSETUP_PATH $PYSETUP_PATH
...
Again, changing it to the following fixes local dev, but breaks layer caching in the pipelines:
...
# Copying poetry and venv into image
COPY --from=builder-base $POETRY_HOME $POETRY_HOME
COPY --from=builder-base $PYSETUP_PATH $PYSETUP_PATH
...
The fix, I believe, would need to be in the Dockerfile. I'm not quite sure how it can be set up to accommodate both use cases without creating a separate Dockerfile.dev, which I'm trying to avoid. Suggestions?
Case where layer caching does not work:
FROM python:3.9-slim as python-base
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
POETRY_HOME="/opt/poetry" \
POETRY_VIRTUALENVS_IN_PROJECT=true \
POETRY_NO_INTERACTION=1 \
PYSETUP_PATH="/opt/pysetup" \
VENV_PATH="/opt/pysetup/.venv"
ENV PATH="$POETRY_HOME/bin:$VENV_PATH/bin:$PATH"
FROM python-base as builder-base
RUN apt-get update \
&& apt-get install --no-install-recommends -y \
curl \
build-essential
...
FROM python-base as development
# Copying poetry and venv into image
COPY --from=builder-base $POETRY_HOME $POETRY_HOME
COPY --from=builder-base $PYSETUP_PATH $PYSETUP_PATH
...
Here's my oversimplified pipeline.yaml:
# Read in the base variable template
variables:
dockerFilePath: $(Build.SourcesDirectory)
cacheBuster: 1 #change as needed to create a new cache
pool:
vmImage: ubuntu-latest
stages:
- stage: BuildBase
jobs:
- job: BuildBase
steps:
- task: Cache@2
inputs:
key: 'docker | "$(Agent.OS)" | $(cacheBuster)'
path: $(Pipeline.Workspace)/docker
- task: Docker@2
inputs:
command: build
repository: builder-base
dockerfile: $(dockerFilePath)/api/docker/Dockerfile
buildContext: $(dockerFilePath)/api
arguments: |
--target builder-base
env:
DOCKER_BUILDKIT: 1
- bash: |
docker images
mkdir -p $(Pipeline.Workspace)/docker
docker save -o $(Pipeline.Workspace)/docker/api.tar builder-base
- stage: BuildDev
dependsOn:
- BuildBase
jobs:
- job: BuildBase
steps:
- task: Cache@2
displayName: Creating cache...
inputs:
key: 'docker | "$(Agent.OS)" | $(cacheBuster)'
path: $(Pipeline.Workspace)/docker
- script: |
ls -l $(Pipeline.Workspace)/docker/
docker load -i $(Pipeline.Workspace)/docker/api.tar
docker images
- task: Docker@2
inputs:
command: build
repository: development
dockerfile: $(dockerFilePath)/api/docker/Dockerfile
buildContext: $(dockerFilePath)/api
arguments: |
--cache-from builder-base
--target development
env:
DOCKER_BUILDKIT: 1
- Currently, when it gets to docker save it will fail, because a tag wasn't specified in the preceding build task.
- Since no tag is specified in that task, ADO just uses $(Build.BuildId) by default.
- In order for docker save to work it needs the tag, so I'll just use latest for simplicity.
...
- task: Docker@2
inputs:
command: build
repository: builder-base
dockerfile: $(dockerFilePath)/api/docker/Dockerfile
buildContext: $(dockerFilePath)/api
arguments: |
--target builder-base
tags: |
latest
env:
DOCKER_BUILDKIT: 1
- bash: |
docker images
mkdir -p $(Pipeline.Workspace)/docker
docker save -o $(Pipeline.Workspace)/docker/api.tar builder-base:latest
...
Now docker save works, and in the subsequent stage it is loaded:
Starting: CmdLine
==============================================================================
Task : Command line
Description : Run a command line script using Bash on Linux and macOS and cmd.exe on Windows
Version : 2.212.0
Author : Microsoft Corporation
Help : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/command-line
==============================================================================
Generating script.
========================== Starting Command Output ===========================
/usr/bin/bash --noprofile --norc /home/vsts/work/_temp/1efef7d5-af4c-4705-9d59-2d38371a3caa.sh
total 806740
-rw------- 1 vsts docker 826097152 Jun 21 21:42 api.tar
Loaded image: builder-base:latest
REPOSITORY TAG IMAGE ID CREATED SIZE
builder-base latest c5ae5e84153f About a minute ago 803MB
When it gets to building the development image, which depends on builder-base, it sees the cache but ignores it and continues to build the stage anyway.
#18 importing cache manifest from builder-base
#18 sha256:db249d01eb28b6ad07804991ed483e6c4b76b050c42dd6a446b98270f592ab8b
#18 DONE 0.0s
#8 [internal] load build context
#8 sha256:b04f2e0c5898df2da834f79948847ffaa3260437bebc394051e3aa07d9d323a8
#8 transferring context: 870.99kB 0.0s done
#8 DONE 0.1s
#4 [python-base 1/1] FROM docker.io/library/python:3.9-slim
#4 sha256:f876c6f14c8c365d299789228d8a0c38ac92e17ea62116c830f5b7c6bc684e47
#4 DONE 0.1s
#5 [builder-base 1/5] RUN apt-get update && apt-get install --no-install-recommends -y curl build-essential
#5 sha256:9c96647f9ffa22c5b2b8b587ece6818721296b3927cc3757c843a52f122598d3
I've tried the following variations in the development build stage, all with the same results:
...
- task: Docker@2
inputs:
command: build
repository: development
dockerfile: $(dockerFilePath)/api/docker/Dockerfile
buildContext: $(dockerFilePath)/api
arguments: |
--cache-from builder-base:latest
--target development
env:
DOCKER_BUILDKIT: 1
...
...
- task: Docker@2
inputs:
command: build
repository: development
dockerfile: $(dockerFilePath)/api/docker/Dockerfile
buildContext: $(dockerFilePath)/api
arguments: |
--cache-from builder-base
--target development
tags: |
latest
env:
DOCKER_BUILDKIT: 1
...
I even tried docker rmi builder-base:latest to get rid of the tag altogether, but that just removes the image, since there are no other tags.
Again, the only way I've found to get it to load the cached image is adding the tags to the multi-stage Dockerfile, which breaks local dev.
If there is a fix, or if I have something not configured properly, I'd love to know. Otherwise, turning caching off for the time being seems to be the best option.