What is Poetry

Poetry is a modern Python packaging and dependency management tool, serving as a replacement for pip.

If you’re familiar with Rust and its package manager, Cargo, you will likely find Poetry easy to understand.

The main difference between Poetry and pip lies in the lock file (poetry.lock), which contains all project dependency data (including recursive dependencies) and their hashes. This lock file ensures the reproducibility of your builds (as long as the pinned dependencies remain available). Another significant advantage is speed; Poetry installs dependencies from a lock file much faster than pip.
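To illustrate what the lock file’s hashes buy you, here is a hypothetical sketch (not Poetry’s actual code) of verifying a downloaded artifact against a pinned sha256 hash, the kind of check installers perform before installing anything from the lock file:

```python
import hashlib

def verify_artifact(data: bytes, pinned_sha256: str) -> bool:
    """Return True if the artifact's hash matches the pinned one.

    Hypothetical sketch: Poetry performs the equivalent check against
    the hashes recorded in poetry.lock before installing a package.
    """
    return hashlib.sha256(data).hexdigest() == pinned_sha256

# Simulate a pinned wheel: the artifact must match byte-for-byte.
wheel = b"fake wheel contents"
pinned = hashlib.sha256(wheel).hexdigest()

print(verify_artifact(wheel, pinned))         # the locked artifact passes
print(verify_artifact(wheel + b"!", pinned))  # a tampered artifact is rejected
```

If a mirror or an attacker serves a different artifact under the same version number, the hash check fails and the install aborts, which is what makes lock-file builds reproducible rather than merely version-pinned.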

What Makes a Docker Image Perfect?

In my view, the perfect Docker image should be:

  1. Lightweight
  2. Free of build-time dependencies
  3. Operable without root privileges
  4. Minimal in attack surface (see points 2 and 3)
  5. Incapable of writing to the container’s file system at runtime
  6. Optimized for Docker layer caching to speed up rebuilds

Building a Docker Image with Poetry

Project Structure

I have prepared a simple CLI hello-world project that uses Poetry as the package manager. The project has the following structure:

.
├── hello_world
│   ├── __init__.py
│   └── main.py
├── .dockerignore
├── .gitignore
├── Dockerfile
├── LICENSE
├── poetry.lock
└── pyproject.toml
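The article doesn’t show main.py itself; judging by the output later in the post, a minimal stand-in could look like this (the real project depends on click for its CLI, which I skip here to keep the sketch free of third-party packages):

```python
# hello_world/main.py -- hypothetical stand-in for the demo CLI.
# The real project uses click; a plain print keeps this sketch
# runnable without any third-party dependencies.

def greeting() -> str:
    return "Hello, World!"

def main() -> None:
    print(greeting())

if __name__ == "__main__":
    main()
```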

The pyproject.toml file is straightforward and was fully generated by the poetry init command:

[tool.poetry]
name = "poetry-docker-demo"
version = "0.1.0"
description = "Poetry Docker demo for codemageddon.me blog"
authors = ["Codemageddon <[email protected]>"]
license = "MIT"
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.12"
click = "^8.1.7"

[tool.poetry.group.dev.dependencies]
ruff = "^0.5.1"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

A Simple Approach

To build a ready-to-work image, we need to install all the project dependencies, copy the project sources, and define the entrypoint. Let’s start with a basic solution:

FROM python:3.12-bookworm
RUN pip install poetry
WORKDIR /usr/src/app
COPY . .
RUN poetry install --no-root
ENTRYPOINT ["poetry", "run", "python", "hello_world/main.py"]

Here’s what we do:

  1. Install Poetry into the image (we need it to install requirements).
  2. Define the working directory as /usr/src/app.
  3. Copy the entire project directory into the working directory.
  4. Install all the project dependencies using poetry install --no-root (the --no-root flag tells Poetry not to install the project itself as a library into the environment).
  5. Set the entrypoint to poetry run python hello_world/main.py (since we already have Poetry in the image, why not use it to run the app?!).
Build the image:

$ docker build . -t poetry-docker-demo:simple

Check if it works:

$ docker run -it --rm poetry-docker-demo:simple
Hello, World!

It seems okay.

Now, let’s check the image size:

$ docker images poetry-docker-demo:simple

REPOSITORY           TAG       IMAGE ID       CREATED          SIZE
poetry-docker-demo   simple    15d9632229ea   28 minutes ago   1.15GB

That’s a pretty large image for a simple application, isn’t it? So, we need to delve deeper.

Use a Slim Base Image

First, let’s check the base image size.

$ docker images python:3.12-bookworm

REPOSITORY   TAG             IMAGE ID       CREATED      SIZE
python       3.12-bookworm   fac1b1854f3f   4 days ago   1.02GB

The base image is quite large. Let’s try the slim version.

FROM python:3.12-slim-bookworm
RUN pip install poetry
WORKDIR /usr/src/app
COPY . .
RUN poetry install --no-root
ENTRYPOINT ["poetry", "run", "python", "hello_world/main.py"]

We’ve only changed the base image from python:3.12-bookworm to python:3.12-slim-bookworm.

$ docker build . -t poetry-docker-demo:simple-slim
$ docker images poetry-docker-demo:simple-slim
REPOSITORY           TAG           IMAGE ID       CREATED          SIZE
poetry-docker-demo   simple-slim   f2589245e1ea   27 minutes ago   289MB

Much better, isn’t it?

But there are still some issues to address.

Exclude Build-Time Dependencies

First, we have poetry and ruff installed in the image, which are not required at runtime:

Globally installed packages:
$ docker run -it --rm --entrypoint pip poetry-docker-demo:simple-slim freeze
build==1.2.1
CacheControl==0.14.0
certifi==2024.7.4
cffi==1.16.0
charset-normalizer==3.3.2
cleo==2.1.0
crashtest==0.4.1
cryptography==42.0.8
distlib==0.3.8
dulwich==0.21.7
fastjsonschema==2.20.0
filelock==3.15.4
idna==3.7
installer==0.7.0
jaraco.classes==3.4.0
jeepney==0.8.0
keyring==24.3.1
more-itertools==10.3.0
msgpack==1.0.8
packaging==24.1
pexpect==4.9.0
pkginfo==1.11.1
platformdirs==4.2.2
poetry==1.8.3
poetry-core==1.9.0
poetry-plugin-export==1.8.0
ptyprocess==0.7.0
pycparser==2.22
pyproject_hooks==1.1.0
rapidfuzz==3.9.4
requests==2.32.3
requests-toolbelt==1.0.0
SecretStorage==3.3.3
setuptools==70.3.0
shellingham==1.5.4
tomlkit==0.13.0
trove-classifiers==2024.7.2
urllib3==2.2.2
virtualenv==20.26.3
wheel==0.43.0
Project dependencies:
$ docker run -it --rm --entrypoint poetry poetry-docker-demo:simple-slim run python -m pip freeze
click==8.1.7
ruff==0.5.1

The second issue is that the slim image doesn’t contain any development packages, so we cannot install any native dependencies that don’t have a pre-built wheel for the required platform. Let’s use a multi-stage Docker build to address these issues:

FROM python:3.12-bookworm AS build
RUN pip install poetry
WORKDIR /usr/src/app
ENV POETRY_VIRTUALENVS_CREATE=true \
    POETRY_VIRTUALENVS_IN_PROJECT=true
COPY . .
RUN poetry install --no-root --only main

FROM python:3.12-slim-bookworm AS runtime
ENV PATH="/usr/src/app/.venv/bin:${PATH}"
COPY --from=build /usr/src/app /usr/src/app
WORKDIR /usr/src/app
ENTRYPOINT ["python", "hello_world/main.py"]

What we do here:

  1. Use separate images for build and runtime.
  2. The build image is made from the full (not slim) image, so it contains lots of development tools out of the box.
  3. In the build image, we install dependencies via Poetry in a virtual environment:
    • POETRY_VIRTUALENVS_CREATE=true instructs Poetry to create a virtual environment for the project and install the dependencies into it instead of the global environment.
    • POETRY_VIRTUALENVS_IN_PROJECT=true instructs Poetry to create the virtual environment directory inside the project root directory (in our case, it will be /usr/src/app/.venv).
    • --only main instructs Poetry to install only the main dependencies (no extras, no dev dependencies, etc.).
  4. The runtime image is based on the slim version of the image, making it lightweight.
  5. We extend the PATH environment variable so that the Python binary (as well as dependencies) will be found in our virtual environment.
  6. Copy the directory with the sources and virtual environment from the build container to the runtime.
  7. Since we don’t have Poetry in the runtime image anymore, we need to run the application directly with python hello_world/main.py and instruct the OS to use our virtual environment instead of the global environment (see point 5).
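The PATH trick from point 5 is plain executable lookup: the first directory on PATH containing a matching executable wins, so a python binary in the venv’s bin directory shadows the system one. A small stdlib demo of that mechanism (the temporary directory stands in for /usr/src/app/.venv/bin):

```python
import os
import shutil
import tempfile

def resolve_on_path(cmd, extra_dir):
    """Resolve cmd the way the shell does, with extra_dir prepended to PATH.

    Mirrors the Dockerfile line ENV PATH="/usr/src/app/.venv/bin:${PATH}":
    the first directory containing an executable match wins.
    """
    search = extra_dir + os.pathsep + os.environ.get("PATH", "")
    return shutil.which(cmd, path=search)

# Demo: a fake "python" in a temporary "venv bin" shadows the system one.
with tempfile.TemporaryDirectory() as venv_bin:
    fake = os.path.join(venv_bin, "python")
    with open(fake, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(fake, 0o755)  # must be executable to be found
    print(resolve_on_path("python", venv_bin) == fake)  # True on Linux
```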

Let’s check the image size:

$ docker images poetry-docker-demo:multi-stage
REPOSITORY           TAG           IMAGE ID       CREATED          SIZE
poetry-docker-demo   multi-stage   1daa2842567b   20 minutes ago   161MB

We’ve saved more than 120MB!

Check the dependencies installed in the image:

Globally installed packages:
$ docker run -it --rm --entrypoint /usr/local/bin/python poetry-docker-demo:multi-stage -m pip freeze
setuptools==70.3.0
wheel==0.43.0
Project dependencies:
$ docker run -it --rm --entrypoint pip poetry-docker-demo:multi-stage freeze
click==8.1.7

We have only click, which is defined as a main dependency, with no poetry or other dev dependencies.

Run as Non-Root User

And what about security? If someone were to exploit our application (the demo app itself isn’t exploitable, but bear with me for the sake of the example), they would at least have root privileges inside the container, and potentially on the host system as well. To enhance security, the application will run under a separate, non-root user.

FROM python:3.12-bookworm AS build
RUN pip install poetry
WORKDIR /usr/src/app
ENV POETRY_VIRTUALENVS_CREATE=true \
    POETRY_VIRTUALENVS_IN_PROJECT=true
COPY . .
RUN poetry install --no-root --only main

FROM python:3.12-slim-bookworm AS runtime
ENV PATH="/usr/src/app/.venv/bin:${PATH}"
COPY --from=build /usr/src/app /usr/src/app
RUN useradd -U -M -d /nonexistent app
USER app
WORKDIR /usr/src/app
ENTRYPOINT ["python", "hello_world/main.py"]

The only change here is that we added the user app and use it to run the application.
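As a belt-and-braces measure on top of the USER directive, an application can verify at startup that it isn’t running as root. This is my own hypothetical addition, not part of the demo project; on Linux, root is always effective UID 0:

```python
import os

def running_as_root() -> bool:
    """True when the effective UID is 0 (root on Unix-like systems)."""
    return os.geteuid() == 0

# With the Dockerfile's `USER app` directive in place, this prints False
# inside the container; without it, the default user is root (True).
print("running as root:", running_as_root())
```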

Maximize Docker Cache Utilization

We have significantly improved our Docker image’s size and security. Now, let’s consider build time. Docker has excellent layer-caching functionality, which spares us from rebuilding the entire image every time we run the docker build command: it only re-runs the commands that come after the first changed layer. Currently, we force Docker to rebuild almost everything on each code change. Ideally, we don’t want to reinstall dependencies or recreate the user with every build. It’s a good idea to copy the app sources in the final stages and use only the poetry.lock and pyproject.toml files to install dependencies. These files change far less frequently than the source code, so most builds will do much less work.

FROM python:3.12-bookworm AS build
RUN pip install poetry==1.8.3
WORKDIR /usr/src/app
ENV POETRY_VIRTUALENVS_CREATE=true \
    POETRY_VIRTUALENVS_IN_PROJECT=true \
    POETRY_CACHE_DIR=/tmp/poetry_cache
COPY pyproject.toml poetry.lock ./
RUN --mount=type=cache,target=$POETRY_CACHE_DIR poetry install --no-root --only main

FROM python:3.12-slim-bookworm AS runtime
ENV PATH="/usr/src/app/.venv/bin:${PATH}"
COPY --from=build /usr/src/app/.venv /usr/src/app/.venv
RUN useradd -U -M -d /nonexistent app
USER app
WORKDIR /usr/src/app
COPY hello_world ./hello_world
ENTRYPOINT ["python", "hello_world/main.py"]

What we do here:

  1. Copy only the pyproject.toml and poetry.lock into the build image.
  2. Pin the poetry version to avoid implicit updates. This prevents the cache from being invalidated and protects us against possible compatibility breaks in future poetry versions.
  3. One more hack is used here: with POETRY_CACHE_DIR=/tmp/poetry_cache, we instruct Poetry to store the package cache in the specified directory. Then, we instruct docker to use the directory as a cache (RUN --mount=type=cache,target=$POETRY_CACHE_DIR ...), so on subsequent builds, Poetry will not download all dependencies again, just the new and updated ones (which are not in the cache yet).
  4. Copy the sources separately into the runtime image (COPY hello_world ./hello_world). The copying must be as close to the end as possible.

Build Time Comparison

Methodology

For each approach I fully cleaned up the Docker cache with

$ docker rmi -f $(docker images -qa)
$ docker image prune -af
$ docker system prune -af  --volumes

After this, I measured the build time with a “cold” cache, then with a “warm” cache (see the main branch of the example project). Next, I made a minor change to the source code (just a comment in the main.py file) and measured the build time after the change. Then I added a main dependency (the pyyaml package, via poetry add "pyyaml==6.0.1"). The final measurements were made after adding a dev dependency (the mypy package, via poetry add -G dev mypy).

For each approach, I ran the scenario 10 times.
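The numbers in the results table are reported as the median with the (min - max) range over those 10 runs; with the standard library, that summary reduces to something like the following (the sample times here are made up, not the article’s raw data):

```python
import statistics

def summarize(samples):
    """Format timing samples the way the results table does:
    median followed by the (min - max) range."""
    return f"{statistics.median(samples)} ({min(samples)} - {max(samples)})"

# Hypothetical build times in seconds for one scenario.
runs = [1.62, 1.58, 1.60, 1.59, 1.61, 1.63, 1.58, 1.60, 1.59, 1.62]
print(summarize(runs))  # -> 1.6 (1.58 - 1.63)
```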

Results

The table below presents the performance and size comparisons for different Docker image-building approaches discussed in this guide. Each approach was tested under various conditions to assess its impact on build time and image size:

  • Cold Cache: Measures the build time when no Docker cache is available. This simulates the first-time build scenario where all layers are built from scratch.

  • Warm Cache - Without Changes: Shows the build time when the Docker cache is fully utilized, with no changes made to the source code or dependencies. This scenario demonstrates the ideal rebuild time when nothing has changed in the project.

  • Warm Cache - Sources Changed: Reflects the build time when minor changes are made to the source code, such as a comment update. This measures how efficiently Docker can reuse cached layers when only the application code changes.

  • Warm Cache - Main Requirements Changed: Represents the build time when a main dependency is updated or added. This scenario evaluates how efficiently the image rebuilds when package dependencies are modified.

  • Warm Cache - Dev Requirements Changed: Indicates the build time when a development dependency is added or updated. This tests the efficiency of Docker builds when development dependencies change, which should ideally not affect the final production image.

  • Image Size: Shows the final image size for each approach, highlighting how the different optimization techniques impact the image footprint.

  • Cells with green highlights indicate the best performance or smallest image size, suggesting optimal scenarios.

  • Cells with red highlights indicate suboptimal performance or larger image sizes, pointing to areas for potential improvement.

| Approach | Cold cache (seconds) | Warm cache: without changes (seconds) | Warm cache: sources changed (seconds) | Warm cache: main requirements changed (seconds) | Warm cache: dev requirements changed (seconds) | Image size |
| --- | --- | --- | --- | --- | --- | --- |
| 01-simple | 20.61 (19.467 - 22.49) | 0.4535 (0.42 - 0.76) | 1.6415 (1.573 - 1.961) | 1.628 (1.577 - 1.696) | 1.817 (1.763 - 2.284) | 1.15GB |
| 02-simple-slim | 10.413 (9.971 - 11.82) | 0.437 (0.414 - 0.675) | 1.62 (1.584 - 2.463) | 1.623 (1.588 - 1.795) | 1.836 (1.742 - 1.952) | 289MB |
| 03-multi-stage | 21.066 (19.574 - 22.684) | 0.665 (0.632 - 0.805) | 1.268 (1.226 - 1.319) | 1.311 (1.285 - 1.398) | 1.353 (1.331 - 1.406) | 161MB |
| 04-no-root | 21.171 (19.413 - 22.053) | 0.667 (0.601 - 0.748) | 1.391 (1.36 - 1.563) | 1.446 (1.41 - 1.788) | 1.460 (1.431 - 1.487) | 161MB |
| 05-final | 20.587 (19.468 - 21.775) | 0.688 (0.656 - 0.826) | 0.458 (0.441 - 0.541) | 1.448 (1.406 - 1.536) | 1.319 (1.277 - 1.383) | 161MB |
| 06-final-slim-build | 10.243 (9.934 - 10.68) | 0.643 (0.435 - 0.828) | 0.459 (0.424 - 0.482) | 1.437 (1.422 - 1.491) | 1.307 (1.282 - 1.669) | 161MB |

You can find the detailed measurements in the CSV file.

Ways for Extra Optimizations

  1. First of all, you are free to use the slim image for the build stage as well (see the 06-final-slim-build approach in the table above). However, this only works if you don’t need to build any native dependencies during the build stage, so in general I’d recommend using the full image for builds: thanks to Docker’s caching mechanism, it’s really cheap.
  2. A second, even more hardcore option is to use an Alpine-based image, but it comes with some potential issues, so do it at your own risk.
  3. Keep dev dependencies in a separate group (tool.poetry.group.dev.dependencies by default).
  4. Remove dependencies (from the tool.poetry.dependencies section) that are no longer used in your project.
  5. Use a pre-built build image that already contains Poetry installed and configured.
  6. Last but not least, control the build context size. Keep your .dockerignore file up to date so that build/test/run and other artifacts that aren’t required in the image don’t end up in the build context. Also, a misconfigured .dockerignore file may lead to security problems, for example.

Conclusion

In this guide, we’ve explored a range of strategies to optimize Docker images for Poetry-based Python projects. By implementing techniques such as using slim base images, multi-stage builds, excluding unnecessary build-time dependencies, running as a non-root user, and maximizing Docker cache utilization, we have significantly reduced image size, enhanced security, and improved build times.

These optimizations not only result in lighter and faster Docker images but also contribute to a more secure and maintainable environment, which is especially important in production settings where performance and security are critical.

Adopting these best practices ensures that your Docker images are efficient, secure, and resilient against common pitfalls. These techniques can also be adapted for other modern package management tools like pdm or uv, making them versatile solutions for various development workflows.

As a next step, consider experimenting with further optimizations, such as using Alpine-based images or pre-built base images that include pre-configured tools. Continuously refining your Docker images will keep your projects efficient, secure, and ready for any production challenges.