Matt's Musings

What?

Base system (BusyBox, shared libraries, and toolchain): 86.08 MB
Runtime system (BusyBox and shared libraries, no toolchain): 1.965 MB
Static runtime system (just BusyBox, no shared libraries or toolchain): 345.8 kB
nginx (with OpenSSL): 3.034 MB
PostgreSQL: 15.82 MB

What, exactly?

These are minimal images for Docker, based on musl libc and BusyBox. They don't use a conventional GNU/Linux distribution or even a package manager. Instead, they are built entirely from source. Their top-level source trees bring in components through Git submodules. Their build processes owe a lot to Linux From Scratch, Dragora (the upcoming musl-based version), and Sabotage Linux.

Why?

Why bother building images this way? After all, Solomon Hykes has said that Docker isn't intended to replace existing tools, such as package managers and their accompanying repositories, but to complement them. Sure, the typical base images have a hundred megabytes or more of stuff that most containers don't need at run time. But they don't waste much memory at run time, and disk space is cheap, especially on the scale of mere hundreds of megabytes. Better to just accept the waste, which isn't really waste, since it helps us developers get stuff done, right? Most Docker users don't seem to care how much unneeded stuff is in their images, so why should I?

It might be good enough to answer that I'm doing this because I want to. After all, I'm doing it in my spare time; nobody is paying me to do it. Still, it's reasonable to ask why, of all the things I could do in my spare time, I would bother with this. So I'll explain my reasons.

First, part of the philosophy of Docker, as I understand it, is that the whole userland software stack for an application -- everything above the kernel -- should be the developer's responsibility. As Mr. Hykes said in the previously referenced talk, "the system is part of the application. Your choice of distribution, your choice of system libraries, all of those choices, even if you didn't make them consciously -- maybe you just used the system that was lying around there -- that is affecting the behavior of the application, and if you swap it out, things will change." So if all of that, including things like libc and OpenSSL, will be my responsibility as a developer, I want to minimize that responsibility in any way I can.

In light of that, let's take a look at some of the bloat in the widely used Docker images that are based on popular GNU/Linux distributions. Here are some packages that come with the debian:wheezy image whether we need them or not:

ext2 filesystem utilities: We don't mount or fsck inside a container.
SysV init and init scripts: We don't run init inside a Docker container.
Pluggable Authentication Modules: Plain /etc/passwd is enough inside a typical container.
Berkeley DB: This is used by a PAM module we don't need.
SELinux libraries and utilities: This might be useful on the host system, but it's rarely if ever used inside a container.
libusb: Inside a container we're not working directly with hardware, USB or otherwise.
ncurses, terminfo, and readline: These are nice to have in interactive sessions, but not needed in a production container.
bash: This is about 10 times as much as we usually need in a shell.
ping and other network utilities
OpenSSL: The only thing in the base system that needs this is ping6.

Granted, a lot of this isn't even loaded into memory when running a typical Debian-based Docker image. But it's there, on disk at least. As another example, let's look at some of the libraries referenced by the main postgres process in the Orchard PostgreSQL image:

libxml2: This is used by some XML-related features that most PostgreSQL users will never need.
PAM: All connections to a PostgreSQL database inside a container are over the network, so this is never needed.
Kerberos, GSSAPI, and LDAP: Most of us aren't using a Docker container like this in a legacy corporate network.
SQLite: It's amusing to me that this build of PostgreSQL depends on SQLite; in fact, this dependency is courtesy of the main Kerberos library.

None of the above is necessary for a working build of PostgreSQL, as my image demonstrates.
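If you want to run the same kind of check on a binary in your own image, here is a minimal sketch in Python. It assumes the `ldd` utility is available in the image and that its output uses the usual `name => path (address)` layout. The function names are made up for this illustration, and this is not necessarily how the list above was produced.

```python
import re
import subprocess

# Matches ldd-style lines such as "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)".
# Lines without "=>" (for example the dynamic loader itself) are skipped.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)")


def parse_ldd(output):
    """Return the sorted, de-duplicated library names found in ldd-style output."""
    names = set()
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


def shared_libraries(binary_path):
    """Run ldd on a binary and return the libraries it references.

    Assumes ldd is on the PATH; raises CalledProcessError if ldd fails,
    which it typically does for statically linked binaries.
    """
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=True
    )
    return parse_ldd(result.stdout)
```

Calling `shared_libraries()` with the path to the postgres binary in a given image (the path varies from image to image) gives a list you can scan for dependencies such as PAM or Kerberos.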

Does any of this matter? Maybe not; I might be obsessing over these gratuitous dependencies for no good reason. Still, I felt that this one-size-fits-all approach to packaging was leading to some undesirable waste, and decided to try to eliminate that waste by choosing different tradeoffs.

In particular, I believe glibc has a lot of cruft that isn't necessary in many environments, including most Docker containers. For example:

Name Service Switch: Did you even know this exists? A GNU/Linux system uses it whenever you log in or do a DNS lookup.
iconv implementation: This implementation of conversion between character encodings is overkill for most modern applications. It adds several megabytes of shared libraries to the base system, and requires a locale definition file just to handle UTF-8 correctly.
ONC RPC: Most modern applications, especially the kind that one typically runs inside Docker containers, don't use NFS or other ONC RPC-based services.

And there's probably more that I don't know about or have forgotten. In short, musl libc is much lighter than glibc, yet musl has everything that most modern applications need. musl also emphasizes correctness and robustness more than glibc historically has. So I want to use musl.

I also think the GNU core utilities are overkill for most Docker containers. The coreutils package in Debian Wheezy is about 13 MB. The cp utility alone is 128 kB. By contrast, my whole BusyBox build is 332 kB, statically linked. My BusyBox configuration is spartan (no editor, pager, wget, or command-line editing in the shell), but it's enough for building and running containers.

My final motivation for this project is that, in my opinion, downloading packages from the Internet, as most Dockerfiles do, doesn't lead to very trustworthy builds, because these builds aren't deterministic. To me, a trustworthy build should be a pure function of the input source tree and the base image. Ideally, networking should be disabled inside the build-time containers. To achieve this, each top-level source tree needs to incorporate all of its dependencies directly. As far as I know, Git submodules are the best way to do this.
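One way to test the "pure function" property is to build twice from the same inputs and compare fingerprints of the two output trees. The sketch below is only an illustration in Python, not part of my build scripts. The `tree_digest` name is hypothetical, and the function deliberately ignores permissions, ownership, timestamps, and symlink targets, all of which a stricter comparison would need to cover.

```python
import hashlib
import os


def tree_digest(root):
    """Return a SHA-256 hex digest over the relative paths and contents under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        # Sort in place so os.walk visits subdirectories in a stable order.
        dirnames.sort()
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as handle:
                data = handle.read()
            # Feed the path, the content length, and the content, separated by
            # NUL bytes, so differently split inputs cannot produce the same stream.
            digest.update(rel.encode("utf-8") + b"\0")
            digest.update(str(len(data)).encode("ascii") + b"\0")
            digest.update(data)
    return digest.hexdigest()
```

If two builds from the same source tree and base image give different digests, something outside those inputs, such as a download or an embedded timestamp, has leaked into the build.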

How?

You can learn everything there is to know about the build processes by browsing the Git repositories:

Feel free to ask me questions if anything is unclear.

What needs work?

Some scripts are more or less duplicated between the repositories. I should factor them out into a common repository which all of the others can pull in as another submodule.

The biggest problem is that due to limitations in the Docker build system, the build process for each image can't be done entirely inside one Dockerfile. This means that these images can't be trusted builds. I have some ideas about how this might be rectified, which I've previously raised on the docker-dev group. I plan to contribute a solution.

Finally, I haven't yet built images for any applications this way, only bits of infrastructure.

Conclusion

I doubt that many Docker users will use my images, or join me in building their own images this way, but I think it was a worthwhile experiment, for my own education if nothing else. I hope someone finds this work useful.

Discuss on Hacker News