2017 is shaping up to be a fantastic year for production containers.
Whether they’re being deployed to Kubernetes, Marathon/Mesos, Docker Swarm, or a hand-rolled environment, lots of companies are moving to Docker in some form or another, often with continuous deployment in mind. When deploying efficiently is the goal, lightweight containers are an important factor, and a few small workflow optimisations can make a big difference to the size of your Docker images.
When working on Dockerised application projects, I like to keep the following in mind:
One of the beautiful things about Docker is that you don’t need a full operating system inside of containers to run your applications. Instead, Docker relies on the host server's kernel and simply loads the userland of a base image defined in the first line of the project's Dockerfile. What this means in practice is that you have access to most of your favourite distribution’s features (think package managers and directory structures), without having to load a full Linux distribution.
It’s quite common for application developers to start their images with the Linux distribution they are most comfortable with (or already run in their non-Dockerised production environment), such as Debian, which is 130MB on Docker Hub. Another common choice is a public image for the language the project is built on, such as Python or Golang, both of which are 675MB images on Docker Hub. These public base images are usually built with flexibility in mind, and the extra features and packages they include can quickly bloat the image.
A quick win here is to check for slim versions of your favourite base image instead of defaulting to the “latest” tag. For example, Python has a version built from Alpine Linux rather than Debian. Alpine is a minimalist Linux distribution based on BusyBox, which comes in at a tiny 4MB!
One thing to note here is that tiny Linux distributions such as Alpine don't necessarily ship with Bash installed, so you may need to make a few tweaks to your Wercker YML to get them to play nice with Wercker: http://devcenter.wercker.com/docs/faq/alpine-faq
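As a rough illustration of the slim-tag idea, here is a minimal Dockerfile sketch. It assumes the Alpine-based `python:2.7-alpine` tag of the official Python image, and `app.py` is a hypothetical application file used only for illustration; swap in whichever language version and entry point your project uses.

```dockerfile
# Assumption: the Alpine-based variant of the official Python image,
# used here instead of the default Debian-based "latest" tag.
FROM python:2.7-alpine

# Alpine provides BusyBox's /bin/sh rather than Bash, so RUN steps
# should stick to plain POSIX shell syntax.
RUN python --version

# Hypothetical application file, for illustration only.
COPY app.py /app/app.py
CMD ["python", "/app/app.py"]
```

Building the same application once from the default tag and once from the Alpine tag, then comparing the results with `docker images`, is a quick way to see how much the base image contributes to the final size.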
Hand roll your base image
Starting with a small public base image is a great first crack at keeping your containers slim. However, at the cost of a little more thought and work, you could build a custom base image for your project that includes only what your application needs.
Your production project will probably break down into something like:
- Low level helpers: Packages and libraries required to install your application's language. Maybe you need wget to download it, tar to untar, or GCC to compile it. These things will very rarely change.
- Application's Language: Let's say you're running Go 1.8 or Python 2.7. Patch revisions might be released fairly often, but we're probably talking months, so this also rarely changes.
- Application language helpers: Package managers like pip, composer, or npm. Again, these will probably change in a timeframe of months.
- Application Dependencies: Packages and libraries in your application's language required by your application.
- Your application: In a continuous deployment environment this could be changing multiple times per day.
Since the first three rarely change, they are great candidates to bake into your base image, which you could base on one of the tiny distributions such as Alpine or BusyBox and rebuild using your CI/CD tool. With Wercker you'd simply need to bump the versions required in your wercker.yml file, and push the change to your preferred SCM platform. Wercker could then build you a new base image, tag it, and push it to your registry of choice.
The fourth item, Application Dependencies, requires a judgment call depending on your application. If you are in early-stage development and these dependencies change frequently, you may prefer not to bake them into your application's base image, and instead build them as part of your application's separate CI/CD flow.
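For readers hand rolling the base image with a Dockerfile, the first three items in the breakdown above might be baked in along these lines. This is only a sketch under stated assumptions: the `alpine:3.19` tag and the `python3` and `py3-pip` package names are illustrative choices, and package names can differ between Alpine releases.

```dockerfile
# Assumption: a small, pinned Alpine release as the starting point.
FROM alpine:3.19

# Low-level helpers (CA certificates), the application's language (Python)
# and its package manager (pip), installed in a single layer.
# --no-cache stops apk from keeping its package index inside the image.
RUN apk add --no-cache ca-certificates python3 py3-pip

# Application images built on top of this base start from /app.
WORKDIR /app
```

Because nothing in this image depends on application code, it only needs rebuilding when one of the pinned versions is bumped.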
If you're building your projects inside of Wercker, you don't need to worry about crafting Dockerfiles. Your projects will be configured and built using Wercker YML files. However, if you’re hand rolling a base image and using a Dockerfile outside of Wercker, it's good practice to structure the Dockerfile to take the best advantage of Docker's layer cache. Docker checks each instruction in the Dockerfile in order for changes and reuses the cached layer when nothing has changed. Once a change is detected, that layer and all subsequent layers are rebuilt, so it's best to place the instructions that are least likely to change higher up in your Dockerfile.
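A minimal sketch of that ordering for a Python application follows. The base tag, `requirements.txt`, and `app.py` are assumptions chosen for illustration rather than anything prescribed by Docker or Wercker.

```dockerfile
# Least likely to change: the base image.
FROM python:2.7-alpine
WORKDIR /app

# Dependencies change less often than code, so copy only the manifest
# first. This layer and the install below stay cached until
# requirements.txt itself changes.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Most likely to change: the application code goes last, so a code-only
# commit rebuilds just the layers from here down.
COPY . .
CMD ["python", "app.py"]
```

If the `COPY . .` line came before the `pip install` step, every code change would invalidate the cache for the dependency layer and force a full reinstall on each build.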
The above tips should point you in the right direction for creating slimmer, more efficient Docker containers for your continuously delivered applications. For some further reading, I recommend checking out the official Docker documentation on Dockerfile best practices, and looking into the open source project Docker Slim, which offers some interesting approaches for minifying Docker containers.