Understanding the features of the official Python Docker image

The official Python Docker image is quite popular, and I have recommended one of its variants as a base image myself. But many programmers don't understand exactly how it works, which can lead to confusion and various problems. In this post I'll cover how the image is built, why it is useful, how to use it correctly, and what its limitations are. Specifically, I'll walk through the python:3.8-slim-buster variant, as defined by its Dockerfile on Aug 19, 2020, and dwell on the most important details along the way.



Reading the Dockerfile



▍Base image



Let's start with the base image:



FROM debian:buster-slim


It turns out that the base image of python:3.8-slim-buster is Debian GNU/Linux 10, the current stable release of Debian, also known as Buster (Debian releases are named after characters from Toy Story). Buster is, if anyone is interested, Andy's dog.



So, at the heart of the image we are interested in is a Linux distribution that is stable and receives periodic bug fixes. The slim variant has fewer packages installed than the regular variant. For example, it contains no compilers.



▍Environment Variables



Now let's take a look at the environment variables. The first one ensures that /usr/local/bin comes as early as possible in $PATH.



# ensure local python is preferred over distribution python
ENV PATH /usr/local/bin:$PATH


The image installs Python into /usr/local, so putting /usr/local/bin first in $PATH ensures that the executables installed there are the ones used by default.
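
To illustrate why the order of $PATH entries matters, here is a minimal sketch that builds two throwaway directories, each containing a fake python3, and asks shutil.which which one wins. The helper names (make_fake_executable, resolve) and the temporary directory layout are assumptions made for this example, and it assumes a POSIX system.

```python
import os
import shutil
import stat
import tempfile


def make_fake_executable(directory, name):
    """Create a tiny executable file called `name` in `directory` (hypothetical helper)."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve(name, directories):
    """Return the first match for `name`, searching `directories` in the given order."""
    return shutil.which(name, path=os.pathsep.join(directories))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Stand-ins for /usr/local/bin and /usr/bin inside a scratch directory.
        local_bin = os.path.join(root, "usr", "local", "bin")
        system_bin = os.path.join(root, "usr", "bin")
        os.makedirs(local_bin)
        os.makedirs(system_bin)
        make_fake_executable(local_bin, "python3")
        make_fake_executable(system_bin, "python3")
        # The directory listed first is searched first, so its python3 is found.
        print(resolve("python3", [local_bin, system_bin]))
```

Swapping the two directories in the list makes the other python3 win, which is exactly what the ENV PATH line prevents.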



Next, let's take a look at the language settings:



# http://bugs.python.org/issue19846
# > At the moment, setting "LANG=C" on a Linux system *fundamentally breaks Python 3*, and that's not OK.
ENV LANG C.UTF-8


As far as I know, modern Python 3 defaults to UTF-8 even without this setting, so I'm not sure this line is still needed in the Dockerfile.
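
One way to check this for yourself is to ask the interpreter which encodings it is actually using. The following is a small sketch; encoding_report is a name invented for this example, and the values it prints depend on the environment it runs in.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings the current Python process is using."""
    return {
        "default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        "locale": locale.getpreferredencoding(False),
        "stdout": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    for name, value in encoding_report().items():
        print(f"{name}: {value}")
```

Running it in a container with and without LANG set shows whether the variable makes a difference for a given Python version.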



There is also an environment variable that contains information about the current Python version:



ENV PYTHON_VERSION 3.8.5


The Dockerfile also sets an environment variable holding a GPG key, which is used to verify the downloaded Python source code.
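
PYTHON_VERSION is used later in the Dockerfile to build the download URL, where the shell expansion ${PYTHON_VERSION%%[a-z]*} cuts the version off at its first lowercase letter, so that a pre-release lands in the directory of its final release. A rough Python equivalent, with function names invented for this sketch and 3.9.0rc1 used only as an example of a pre-release version string:

```python
import re


def release_directory(version):
    """Mimic ${PYTHON_VERSION%%[a-z]*}: drop everything from the first lowercase letter on."""
    return re.split(r"[a-z]", version, maxsplit=1)[0]


def source_url(version):
    """Build the tarball URL the same way the Dockerfile's wget call does."""
    directory = release_directory(version)
    return f"https://www.python.org/ftp/python/{directory}/Python-{version}.tar.xz"


if __name__ == "__main__":
    print(source_url("3.8.5"))     # the directory and the version are the same
    print(source_url("3.9.0rc1"))  # the directory is 3.9.0, the file keeps rc1
```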



▍Runtime dependencies



Python needs some additional packages to work:



RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    netbase \
  && rm -rf /var/lib/apt/lists/*


The first package, ca-certificates, contains a list of standard CA certificates, similar to what a browser uses to validate https:// addresses. This allows Python, wget, and other tools to validate the certificates presented by servers.
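
To see where Python looks for these certificates, you can query the ssl module. This is only a sketch: ca_locations is a name made up for the example, and the paths reported depend on how OpenSSL was built on the system.

```python
import ssl


def ca_locations():
    """Report where the ssl module expects to find CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                  # resolved CA bundle, or None if missing
        "capath": paths.capath,                  # resolved CA directory, or None if missing
        "openssl_cafile": paths.openssl_cafile,  # location compiled into OpenSSL
        "openssl_capath": paths.openssl_capath,  # location compiled into OpenSSL
    }


if __name__ == "__main__":
    for name, value in ca_locations().items():
        print(f"{name}: {value}")
```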



The second package, netbase, installs several files in /etc that map certain names to ports and protocols. For example, /etc/services maps service names such as https to port numbers, in this case 443/tcp.
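
Python exposes this mapping through socket.getservbyname. The sketch below wraps it in a helper (service_port is a name chosen for this example) that returns None instead of raising when the service database has no entry, which is what can happen on a system without /etc/services.

```python
import socket


def service_port(name, protocol="tcp"):
    """Look up a service name in the system's service database; None if unknown."""
    try:
        return socket.getservbyname(name, protocol)
    except OSError:
        # Raised when no matching entry exists, for example if /etc/services is absent.
        return None


if __name__ == "__main__":
    print(service_port("https"))
```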



▍Install Python



Next, a compiler toolchain is installed, the Python source is downloaded and compiled, and then the Debian packages that are no longer needed are uninstalled:



RUN set -ex \
  \
  && savedAptMark="$(apt-mark showmanual)" \
  && apt-get update && apt-get install -y --no-install-recommends \
    dpkg-dev \
    gcc \
    libbluetooth-dev \
    libbz2-dev \
    libc6-dev \
    libexpat1-dev \
    libffi-dev \
    libgdbm-dev \
    liblzma-dev \
    libncursesw5-dev \
    libreadline-dev \
    libsqlite3-dev \
    libssl-dev \
    make \
    tk-dev \
    uuid-dev \
    wget \
    xz-utils \
    zlib1g-dev \
# as of Stretch, "gpg" is no longer included by default
    $(command -v gpg > /dev/null || echo 'gnupg dirmngr') \
  \
  && wget -O python.tar.xz "https://www.python.org/ftp/python/${PYTHON_VERSION%%[a-z]*}/Python-$PYTHON_VERSION.tar.xz" \
  && wget -O python.tar.xz.asc "https://www.python.org/ftp/python/${PYTHON_VERSION%%[a-z]*}/Python-$PYTHON_VERSION.tar.xz.asc" \
  && export GNUPGHOME="$(mktemp -d)" \
  && gpg --batch --keyserver ha.pool.sks-keyservers.net --recv-keys "$GPG_KEY" \
  && gpg --batch --verify python.tar.xz.asc python.tar.xz \
  && { command -v gpgconf > /dev/null && gpgconf --kill all || :; } \
  && rm -rf "$GNUPGHOME" python.tar.xz.asc \
  && mkdir -p /usr/src/python \
  && tar -xJC /usr/src/python --strip-components=1 -f python.tar.xz \
  && rm python.tar.xz \
  \
  && cd /usr/src/python \
  && gnuArch="$(dpkg-architecture --query DEB_BUILD_GNU_TYPE)" \
  && ./configure \
    --build="$gnuArch" \
    --enable-loadable-sqlite-extensions \
    --enable-optimizations \
    --enable-option-checking=fatal \
    --enable-shared \
    --with-system-expat \
    --with-system-ffi \
    --without-ensurepip \
  && make -j "$(nproc)" \
    LDFLAGS="-Wl,--strip-all" \
  && make install \
  && rm -rf /usr/src/python \
  \
  && find /usr/local -depth \
    \( \
      \( -type d -a \( -name test -o -name tests -o -name idle_test \) \) \
      -o \( -type f -a \( -name '*.pyc' -o -name '*.pyo' -o -name '*.a' \) \) \
      -o \( -type f -a -name 'wininst-*.exe' \) \
    \) -exec rm -rf '{}' + \
  \
  && ldconfig \
  \
  && apt-mark auto '.*' > /dev/null \
  && apt-mark manual $savedAptMark \
  && find /usr/local -type f -executable -not \( -name '*tkinter*' \) -exec ldd '{}' ';' \
    | awk '/=>/ { print $(NF-1) }' \
    | sort -u \
    | xargs -r dpkg-query --search \
    | cut -d: -f1 \
    | sort -u \
    | xargs -r apt-mark manual \
  && apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false \
  && rm -rf /var/lib/apt/lists/* \
  \
  && python3 --version


There is a lot going on here, but the most important thing is this:



  1. Python is installed in /usr/local.
  2. All .pyc files are removed.
  3. Packages that were needed only to compile Python, gcc in particular, are removed once they are no longer needed.
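
Before purging, the RUN command runs ldd over the installed executables and uses awk to pick out the shared libraries they link against, so that the packages owning those libraries are kept. The awk step can be sketched in Python as follows. The function name and the sample ldd output, including its addresses, are made up for illustration.

```python
def linked_library_paths(ldd_output):
    """Mimic awk '/=>/ { print $(NF-1) }' followed by sort -u."""
    paths = set()
    for line in ldd_output.splitlines():
        fields = line.split()
        if "=>" in line and len(fields) >= 2:
            paths.add(fields[-2])  # second-to-last field is the resolved path
    return sorted(paths)


# Hypothetical ldd output; real output differs per binary and system.
SAMPLE_LDD_OUTPUT = (
    "\tlinux-vdso.so.1 (0x00007ffd0a5f2000)\n"
    "\tlibpython3.8.so.1.0 => /usr/local/lib/libpython3.8.so.1.0 (0x00007f1c2a000000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c29c00000)\n"
    "\t/lib64/ld-linux-x86-64.so.2 (0x00007f1c2a600000)\n"
)

if __name__ == "__main__":
    for path in linked_library_paths(SAMPLE_LDD_OUTPUT):
        print(path)
```

In the Dockerfile the resulting paths are passed to dpkg-query --search to find the owning packages, which are then marked as manually installed so that the purge leaves them in place.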


Because all of this happens in a single RUN command, the compiler is not saved in any of the image's layers, which helps keep the image compact.
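
The reasoning can be made concrete with a toy model: an image's size is roughly the sum of what each layer wrote, and deleting a file in a later layer does not shrink the earlier one. The data structure and the sizes below are invented for illustration and are not measurements of the real image.

```python
def image_size(layers):
    """Sum the bytes written by every layer; deletions in later layers subtract nothing."""
    return sum(sum(layer["added"].values()) for layer in layers)


# Hypothetical sizes in MB, chosen only to illustrate the effect.
SEPARATE_RUN_COMMANDS = [
    {"added": {"gcc": 100, "python": 30}, "deleted": set()},  # RUN install and compile
    {"added": {}, "deleted": {"gcc"}},                        # RUN purge in a later layer
]
SINGLE_RUN_COMMAND = [
    # gcc was installed and purged before the layer was committed,
    # so only the files that remain at the end are stored.
    {"added": {"python": 30}, "deleted": set()},
]

if __name__ == "__main__":
    print(image_size(SEPARATE_RUN_COMMANDS))  # 130
    print(image_size(SINGLE_RUN_COMMAND))     # 30
```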



You may notice that libbluetooth-dev is among the packages needed to compile Python. This seemed unusual to me, so I decided to look into it. As it turns out, Python can create Bluetooth sockets, but only if it was compiled with this library available.
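
A quick way to check whether a given interpreter was built with Bluetooth support is to look for the AF_BLUETOOTH constant, which the socket module only defines in that case. The helper name below is an assumption of this sketch.

```python
import socket


def has_bluetooth_sockets():
    """True if this interpreter's socket module was built with Bluetooth support."""
    return hasattr(socket, "AF_BLUETOOTH")


if __name__ == "__main__":
    print(has_bluetooth_sockets())
```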



▍Symbolic link setup



The next step creates the symbolic link /usr/local/bin/python pointing to /usr/local/bin/python3, so that Python can be invoked under either name:



# make some useful symlinks that are expected to exist
RUN cd /usr/local/bin \
  && ln -s idle3 idle \
  && ln -s pydoc3 pydoc \
  && ln -s python3 python \
  && ln -s python3-config python-config
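
The effect of these ln -s calls can be reproduced in a scratch directory. This is a sketch assuming a POSIX system; create_aliases and the LINKS table are names introduced for the example, and the targets are empty placeholder files rather than real executables.

```python
import os
import tempfile

# alias -> target, mirroring the ln -s calls in the Dockerfile
LINKS = {
    "idle": "idle3",
    "pydoc": "pydoc3",
    "python": "python3",
    "python-config": "python3-config",
}


def create_aliases(bin_dir, links=LINKS):
    """Create relative symlinks inside bin_dir, like `cd bin_dir && ln -s target alias`."""
    for alias, target in links.items():
        os.symlink(target, os.path.join(bin_dir, alias))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as bin_dir:
        for target in LINKS.values():
            open(os.path.join(bin_dir, target), "w").close()  # placeholder targets
        create_aliases(bin_dir)
        for alias in LINKS:
            print(alias, "->", os.readlink(os.path.join(bin_dir, alias)))
```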


▍Install pip



The pip package manager has its own release schedule, separate from Python's. For example, this Dockerfile installs Python 3.8.5, released in July 2020. pip 20.2.2 was released in August, after that Python release, but the Dockerfile is designed so that the newer pip is the one that gets installed:



# if this is called "PIP_VERSION", pip explodes with "ValueError: invalid truth value '<VERSION>'"
ENV PYTHON_PIP_VERSION 20.2.2
# https://github.com/pypa/get-pip
ENV PYTHON_GET_PIP_URL https://github.com/pypa/get-pip/raw/5578af97f8b2b466f4cdbebe18a3ba2d48ad1434/get-pip.py
ENV PYTHON_GET_PIP_SHA256 d4d62a0850fe0c2e6325b2cc20d818c580563de5a2038f917e3cb0e25280b4d1

RUN set -ex; \
  \
  savedAptMark="$(apt-mark showmanual)"; \
  apt-get update; \
  apt-get install -y --no-install-recommends wget; \
  \
  wget -O get-pip.py "$PYTHON_GET_PIP_URL"; \
  echo "$PYTHON_GET_PIP_SHA256 *get-pip.py" | sha256sum --check --strict -; \
  \
  apt-mark auto '.*' > /dev/null; \
  [ -z "$savedAptMark" ] || apt-mark manual $savedAptMark; \
  apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false; \
  rm -rf /var/lib/apt/lists/*; \
  \
  python get-pip.py \
    --disable-pip-version-check \
    --no-cache-dir \
    "pip==$PYTHON_PIP_VERSION" \
  ; \
  pip --version; \
  \
  find /usr/local -depth \
    \( \
      \( -type d -a \( -name test -o -name tests -o -name idle_test \) \) \
      -o \
      \( -type f -a \( -name '*.pyc' -o -name '*.pyo' \) \) \
    \) -exec rm -rf '{}' +; \
  rm -f get-pip.py


After completing these operations, as before, all .pyc files are deleted.
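
The find expression used in this step matches directories named test, tests, or idle_test, plus .pyc and .pyo files. Below is a sketch of the same selection in Python that only lists matches rather than deleting them. The function name is invented, and unlike find it does not descend into a matched directory, since rm -rf would remove its contents anyway.

```python
import os

TEST_DIR_NAMES = {"test", "tests", "idle_test"}
BYTECODE_SUFFIXES = (".pyc", ".pyo")


def find_removable(root):
    """List the paths under root that the cleanup step would delete."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in list(dirnames):
            if name in TEST_DIR_NAMES:
                matches.append(os.path.join(dirpath, name))
                dirnames.remove(name)  # the whole directory goes, so skip its contents
        for name in filenames:
            if name.endswith(BYTECODE_SUFFIXES):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    import sys

    # Inspect the running interpreter's own installation prefix.
    for path in find_removable(sys.prefix)[:10]:
        print(path)
```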



▍Image entry point



Finally, the Dockerfile specifies the image's default command:



CMD ["python3"]


Because the Dockerfile uses CMD rather than ENTRYPOINT, running the image without arguments starts python by default:



$ docker run -it python:3.8-slim-buster
Python 3.8.5 (default, Aug  4 2020, 16:24:08)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>


But, if necessary, you can specify other executable files when starting the image:



$ docker run -it python:3.8-slim-buster bash
root@280c9b73e8f9:/#
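
The difference between CMD and ENTRYPOINT can be summarized with a simplified model of how Docker assembles the command for exec-form instructions: arguments given after the image name replace CMD, while ENTRYPOINT is always kept. The function below is a sketch of that rule only. It ignores the --entrypoint flag and shell-form instructions, and its name is made up.

```python
def container_command(entrypoint, cmd, run_args):
    """Simplified model: ENTRYPOINT stays, run arguments replace CMD."""
    return list(entrypoint) + list(run_args if run_args else cmd)


if __name__ == "__main__":
    # This image: no ENTRYPOINT, CMD ["python3"].
    print(container_command([], ["python3"], []))        # ['python3']
    print(container_command([], ["python3"], ["bash"]))  # ['bash']
    # Had it used ENTRYPOINT ["python3"], bash would become an argument to python3.
    print(container_command(["python3"], [], ["bash"]))  # ['python3', 'bash']
```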


Outcome



Here's what we learned by reading the Dockerfile of the official slim-buster Python image.



▍The image includes Python



While this may seem obvious, it is worth paying attention to exactly how Python is included in the image. Namely, this is done by installing it in /usr/local.



A common mistake among users of this image is to install Python a second time, from Debian's packages:



FROM python:3.8-slim-buster

# This is not needed, Python is already installed:
RUN apt-get update && apt-get install python3-dev


When this RUN command executes, Python is installed again, but in /usr rather than /usr/local, and usually in a different version from the one in /usr/local. The programmer who wrote this Dockerfile probably doesn't need two different versions of Python in the same image, and having both is a common source of confusion.



And if you really need the Debian version of Python, it is better to use debian:buster-slim as the base image instead.
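
If you suspect an image has ended up with two interpreters, a small script can list every python3 on $PATH in search order and show which installation the running interpreter belongs to. This is a sketch; interpreters_on_path is a name introduced for the example.

```python
import os
import sys


def interpreters_on_path(name="python3", path=None):
    """Return every executable called `name` in the PATH directories, in search order."""
    if path is None:
        path = os.environ.get("PATH", "")
    found = []
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if directory and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            if candidate not in found:
                found.append(candidate)
    return found


if __name__ == "__main__":
    print("running:", sys.executable, "with prefix", sys.prefix)
    for candidate in interpreters_on_path():
        print("on PATH:", candidate)
```

More than one entry in the output means that which Python you get depends on the order of $PATH.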



▍The image includes the latest version of pip



For example, the most recent Python 3.5 release was in November 2019, but the Docker image python:3.5-slim-buster includes a pip that came out in August 2020. This is (usually) good, as it means we get the latest bug fixes and performance improvements. It also means that we can benefit from support for newer wheel variants.
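
To confirm which pip a given image actually ships, you can read its version from the installed package metadata. The sketch below assumes Python 3.8 or later, where importlib.metadata is available, and uses a helper name made up for the example.

```python
from importlib import metadata  # available from Python 3.8


def installed_version(distribution="pip"):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version("pip"))
```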



▍All .pyc files are removed from the image



If you want to speed up startup a little, you can compile the standard library's source code to .pyc files yourself, using the compileall module.
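
As a minimal sketch of what compileall does, the following compiles every .py file in a throwaway directory and lists the resulting bytecode files. The precompile helper and the example module are invented for illustration.

```python
import compileall
import os
import tempfile


def precompile(directory):
    """Compile every .py file under `directory`; True if all of them compiled."""
    return bool(compileall.compile_dir(directory, quiet=1))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        with open(os.path.join(directory, "example.py"), "w") as handle:
            handle.write("VALUE = 42\n")
        print(precompile(directory))
        print(os.listdir(os.path.join(directory, "__pycache__")))
```

In an image you would point it at the standard library instead, whose location is available as sysconfig.get_paths()["stdlib"], or run python -m compileall from a RUN command.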



▍Image does not install Debian security updates



Although the base images debian:buster-slim and python are updated frequently, there is a gap between the release of a Debian security update and its inclusion in the images. You therefore need to install security updates for the base Linux distribution yourself.



What Docker images do you use to execute Python code?





