We are deploying a Python-based web application on a remote server. Recently the disk usage has been increasing rapidly, and I found that the /var/lib/docker/overlay2 folder is using 80GB.
Running docker system df
shows:
TYPE            TOTAL   ACTIVE   SIZE      RECLAIMABLE
Images          23      23       6.272GB   2.774GB (44%)
Containers      23      23       428.1MB   0B (0%)
Local Volumes   15      13       2.238GB   76.51kB (0%)
Build Cache     79      0        6.687GB   6.687GB
Which is definitely not 80GB.
On further investigation I found that one folder appears many times under /var/lib/docker/overlay2/**/diff: /home/app/web, which is a folder inside the web app container.
Running this command du -ch --max-depth=0 /var/lib/docker/165536.165536/overlay2/**/diff/home/app/web
shows
68G total
which I think is quite high.
I am using this Dockerfile for the build:
FROM python:3.11-alpine as builder
# set work directory
WORKDIR /usr/src/app
# install psycopg2 dependencies
RUN apk update \
&& apk add postgresql-dev gcc python3-dev musl-dev
# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
# copy project
COPY poetry.lock .
COPY pyproject.toml .
# install dependencies
RUN pip install --upgrade pip
RUN pip install poetry
RUN poetry export -f requirements.txt --without dev --output requirements.txt
RUN pip install -r requirements.txt
# TODO: this should be enabled in actual prod
# RUN python -m pip_audit -r requirements.txt
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /usr/src/app/wheels -r requirements.txt
RUN poetry version -s > version.txt
FROM python:3.11-alpine
RUN mkdir -p /home/app
# create the app user
RUN addgroup -S app && adduser -S app -G app
# create the appropriate directories
ENV HOME=/home/app
ENV APP_HOME=/home/app/web
RUN mkdir $APP_HOME
WORKDIR $APP_HOME
# install dependencies
RUN apk update && apk add libpq postgresql-client busybox-suid make
RUN rm -rf /var/cache/apk/*
COPY --from=builder /usr/src/app/wheels /wheels
COPY --from=builder /usr/src/app/requirements.txt .
RUN pip install --no-cache /wheels/*
# copy project
COPY . $APP_HOME
# copy version.txt; must override the version.txt from the local checkout
COPY --from=builder /usr/src/app/version.txt .
#make entrypoint executable
RUN sed -i 's/\r$//g' $APP_HOME/entrypoint.prod.sh
RUN chmod +x $APP_HOME/entrypoint.prod.sh
RUN mv ${APP_HOME}/entrypoint.web.prod.sh /
RUN sed -i 's/\r$//g' /entrypoint.web.prod.sh
RUN chmod +x /entrypoint.web.prod.sh
#create backup directory
RUN mkdir $APP_HOME/backup
RUN mkdir -p $APP_HOME/cissky/staticfiles
RUN mkdir -p $APP_HOME/cissky/mediafiles
# make documentation
# TODO: uncomment this in actual production
# RUN cd docs && make clean && make html && cd ..
RUN mkdir docs/html
# run makemigrations during the build instead
RUN python3 cissky/manage.py makemigrations --noinput
# chown all the files to the app user
RUN chown -R app:app $APP_HOME
# change to the app user
USER app
# run entrypoint.prod.sh
ENTRYPOINT ["/home/app/web/entrypoint.prod.sh"]
I'm guessing I'm doing something wrong with the build. Could anyone please give me some pointers?
I know the logs are not the issue. I've also tried docker system prune --all --volumes, but the size was not reduced.
Since I cannot reproduce your problem, this is not a proven solution, just a little collection of hints.
As already noted in the comment above, the RUN chown command is inefficient because it rewrites every file it touches into a new layer.
The RUN apk update step is also inefficient: it writes the apk cache to the filesystem, and the later RUN rm -rf /var/cache/apk/* removes it in a new layer, but this deletion is only logical, not physical, so the cache still takes up space in the earlier layer. The correct solution is apk add --no-cache, without the subsequent RUN rm -rf /var/cache/apk/* (see Alpine Dockerfile advantages of --no-cache vs. rm /var/cache/apk/*).
To find wasted space in images you can use https://github.com/wagoodman/dive; however, the total size of your images is about 6 GB, so your problem is elsewhere.
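As a rough, untested sketch of the two Dockerfile hints above, the start of the final stage could look something like this. The package names and paths are copied from the question's Dockerfile; using COPY --chown in place of the separate RUN chown -R step is my suggestion and assumes the app user only needs to own the copied project files.

```dockerfile
FROM python:3.11-alpine

# create the app user first, so that COPY --chown can refer to it by name
RUN addgroup -S app && adduser -S app -G app

ENV HOME=/home/app
ENV APP_HOME=/home/app/web
# WORKDIR creates the directory if it does not exist
WORKDIR $APP_HOME

# --no-cache fetches the package index on the fly and leaves nothing in
# /var/cache/apk, so no "apk update" or "rm -rf /var/cache/apk/*" is needed
RUN apk add --no-cache libpq postgresql-client busybox-suid make

# set ownership while copying; this avoids a later "RUN chown -R" layer
# that would store a second copy of every file under $APP_HOME
COPY --chown=app:app . $APP_HOME
```

Directories created by later RUN steps as root (backup, staticfiles, mediafiles) would still be owned by root, so those would need a chown limited to just those paths. This only trims the image layers; it is unlikely to explain the 80GB on its own.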
Perhaps docker system prune may help, or the more drastic docker system prune -a, but it will delete a lot of things, so read the documentation before using it! In one case I solved a wasted-space problem with it; note that it reported "Total reclaimed space: 4.332GB" but the real gain was 13 GB.