This week I dived into the fascinating world of Docker multi-stage builds.

On https://bfportal.gg I use a relatively simple stack. Then one day I woke up to an alert on #datadog that almost all of the disk space on the VPS had been used. I thought, damn, a lot of users must have uploaded content, only to find out that the Docker images were taking up all the space. So I did an image prune, figuring the images must be quite big to fill up all that space, and lo and behold, it was 2 gigs for each image 💀 (no wonder my poor VPS was all full).

So I went looking for a way to reduce the size of the image, and the answer was Docker multi-stage builds.
At this point I had this good ol' Dockerfile:
```dockerfile
# Use an official Python runtime based on Debian "buster" as a parent image.
FROM python:3.11-slim-buster as dev

# Add user that will be used in the container.
RUN useradd --create-home wagtail

# Port used by this container to serve HTTP.
EXPOSE 8000

# Set environment variables.
# 1. Force Python stdout and stderr streams to be unbuffered.
# 2. Set PORT variable that is used by Gunicorn. This should match "EXPOSE".
ENV PYTHONUNBUFFERED=1 \
    PORT=8000

# Install system packages required by Wagtail and Django
# (list follows the standard Wagtail template, plus node/npm for tailwind).
RUN apt-get update --yes --quiet && apt-get install --yes --quiet --no-install-recommends \
    build-essential \
    libjpeg62-turbo-dev \
    zlib1g-dev \
    nodejs \
    npm \
 && rm -rf /var/lib/apt/lists/* /usr/share/doc /usr/share/man \
 && apt-get clean

# Upgrade npm and install "n" to manage the node version
# (the exact npm version pin was lost; npm@latest assumed).
RUN npm install npm@latest -g && \
    npm install n -g

# Install the application server.
RUN pip install "gunicorn==20.0.4"

# Install the project requirements.
COPY requirements.txt /
RUN pip install -r /requirements.txt

# Use /app folder as a directory where the source code is stored.
WORKDIR /app

# Set this directory to be owned by the "wagtail" user. This Wagtail project
# uses SQLite, so the folder needs to be owned by the user that
# will be writing to the database file.
RUN chown wagtail:wagtail /app

# Copy the source code of the project into the container.
COPY --chown=wagtail:wagtail . .

# Use user "wagtail" to run the build commands below and the server itself.
USER wagtail

FROM dev as final
RUN python manage.py tailwind install --no-input
RUN python manage.py tailwind build --no-input
RUN python manage.py collectstatic --noinput --clear
```
Ugly, right? 🥲 It's basically one big image with the build tools, dependencies, and source code all baked into every build.
So I started by separating the tools from the dependencies. There were two clear things that I could separate:

- Python virtual environment
- Node binary

So I made a separate image for each.
In this project I now use poetry, so if you want a reliable way to install poetry in a Docker image, here you go. I gave this stage the name `builder` so that I could later take advantage of `COPY --from=`. Next up was Node.

(PS: why is it so hard to install Node in Debian images? Making a separate image is sooo much easier.)
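The post's original code for these stages didn't survive; here is a minimal sketch of what such builder stages can look like. The stage names, the Node version, and the `--only main` dependency group are my assumptions, not necessarily the project's exact setup:

```dockerfile
# --- Python dependency stage, named "builder" for later COPY --from= ---
FROM python:3.11-slim-buster as builder

# Keep poetry in its own prefix, and make it create the virtualenv
# inside the project folder so it can be copied out wholesale.
ENV POETRY_HOME=/opt/poetry \
    POETRY_VIRTUALENVS_IN_PROJECT=true

# Install poetry via its official installer script.
RUN apt-get update && apt-get install --yes --no-install-recommends curl \
 && curl -sSL https://install.python-poetry.org | python3 - \
 && rm -rf /var/lib/apt/lists/*

COPY pyproject.toml poetry.lock ./
# Creates ./.venv containing only the runtime dependencies.
RUN /opt/poetry/bin/poetry install --no-root --only main

# --- Node stage: pulling the official image beats apt-installing node ---
FROM node:18-slim as node-base
```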
Now that we have our base images ready, we can take advantage of multi-stage builds.
The Docker image is then split into two parts:

- `dev` is for local development and contains the npm and tailwind binaries
- `final` is for production, and I remove the `node_modules` folder from it to save space

(Previously we only had `final` :x)
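The actual Dockerfile from the post is longer; a minimal sketch of the dev/final split, reusing the hypothetical `builder` and `node-base` stage names from above (paths are illustrative):

```dockerfile
FROM python:3.11-slim-buster as dev

# Copy only the finished artifacts out of the heavy builder images:
# the ready-made virtualenv and the node binaries.
COPY --from=builder /.venv /app/.venv
COPY --from=node-base /usr/local/bin/node /usr/local/bin/node
COPY --from=node-base /usr/local/lib/node_modules /usr/local/lib/node_modules
ENV PATH="/app/.venv/bin:$PATH"

WORKDIR /app
COPY . .

FROM dev as final
# Build the production assets and delete node_modules in the SAME RUN,
# so those files never get committed to a layer of the final image.
RUN python manage.py tailwind install --no-input \
 && python manage.py tailwind build --no-input \
 && python manage.py collectstatic --noinput --clear \
 && rm -rf node_modules
```

Deleting `node_modules` in a separate, later `RUN` would not save any space, because the earlier layer containing it would still ship with the image; doing it in the same `RUN` is what makes the removal count.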
The important lines are the `COPY --from=` ones, where we take full advantage of the earlier stages by copying only the things we need 😁.
Then we finally move on to the `final` stage.
Now, by using `docker buildx build --target=final`, one day and a few hours of head-scratching later, I was able to bring the size of the image down from 2 GB to just 600 MB 😮.
To my understanding, we went from one fat do-everything image to lean, single-purpose stages. Now all my CI/CD pipelines for bfportal.gg are super fast (it was around 7 mins before, now it's < 3 mins), all thanks to Docker ❤️
- Make use of the python-alpine image for an even smaller final image size
- But Python⇒Speed recommends against it
- Find a better way to write the file so that we have fewer layers and more caching
- Must use `buildx` for faster build times and better caching
- Copy only the final compiled tools from a base image
- MUST REMOVE `node_modules` IN THE END
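As a usage example, here is the kind of `buildx` invocation the notes above imply; the tag name and cache directory are my own illustration, not from the post:

```shell
# Build only the production stage, persisting a local layer cache
# between runs for faster rebuilds.
docker buildx build \
  --target=final \
  --cache-from=type=local,src=/tmp/.buildx-cache \
  --cache-to=type=local,dest=/tmp/.buildx-cache \
  -t bfportal:latest .
```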
All in all, I find Docker to be an awesome technology :) Have a good day!

PS: Let me know if you have any ideas on how to make it better :) You can find me on Discord or Twitter (gala_vs).