Getting Started with Docker – Part 1

If you’re new to Docker, you’re probably wondering: what is it, exactly?

“Docker is the company driving the container movement and the only container platform provider to address every application across the hybrid cloud.”

OK, it’s the name of a company. The next question is, what are containers? Rather than give you another quote, I’ll give you my own definition…

What are containers?

Containers are like portable executables. You know, the applications that come as a standalone .exe file and don’t require installation? I draw this comparison for the following reasons:

  1. They are self-contained (pun intended!) units. That is to say, they come with everything they need in order to operate. You don’t need to worry about installing dependencies the way you do with non-containerised applications, and they act exactly like a single executable from a user’s perspective.
  2. They’re disposable. When a portable application is misbehaving and/or you want to upgrade it, you simply delete it and start using a fresh copy. You can do exactly the same thing with containers.
  3. When done correctly, containers only have a single concern (formally referred to as a single process). Just as a portable version of Firefox is only concerned with providing the user with a web browser, an nginx container’s only concern may be to provide a web server.

What is Docker Engine?

OK, so you know how I said containers don’t require dependencies? Well, technically they’ve got one – Docker Engine. Docker Engine is used to build images and run containers, and it’s the tool you’ll primarily use when getting started with Docker.

What do you mean, Self-Contained?

When those who aren’t well versed with Linux hear that containers are self-contained, they think it’s too good to be true. That’s because they’ve suffered through the pain of trying to install tools which aren’t available through their package managers. They’ve Googled for days, asked for advice on forums and made all sorts of file and system modifications. In the end they either give up, or they manage to get it working but have no idea how they did it.

If they were lucky enough to get it working, their joy soon turns to sorrow when they land in dependency hell while trying to get the next tool up and running.

For those poor souls who are all too familiar with this situation, let me repeat – containers are completely self-contained. Each container runs with its own, independent filesystem, complete with its own libraries and packages (though all containers on a host share that host’s kernel). As a result, each container can install any version of any package it likes without fear of breaking applications in other containers.

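For example, let’s look at how we can run Python in a container. The command takes the form docker run -it --rm --name <container-name> python:3.6.3-alpine3.6, where <container-name> is a name of your choosing.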

There we have it. With a single command we were able to stand up a container that was built to run Python. How cool is that?

What does that command do?

Let’s dissect the command we used to spin up our container:

  • docker run: Creates and starts a new container from the specified image
  • -it: Interactive & pseudo-TTY – Used to give us CLI access on the container
  • --rm: Deletes the container as soon as it is stopped
  • --name: Allows us to name our container
  • python: Specifies the Docker Hub repository we want to use
  • 3.6.3-alpine3.6: Specifies the tag of the image we want to use
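To see how the pieces listed above fit together, here is a minimal Python sketch that assembles the command from its parts. The function name docker_run_command and the container name my-python are made up for illustration (the name used in the original command isn’t shown here). The flags themselves are the ones dissected above.

```python
import shlex


def docker_run_command(repository, tag, name):
    """Build the argument list for the `docker run` command dissected above."""
    return [
        "docker", "run",        # create and start a container from an image
        "-it",                  # interactive session with a pseudo-TTY
        "--rm",                 # remove the container once it stops
        "--name", name,         # hypothetical name, chosen by the user
        f"{repository}:{tag}",  # image reference: repository, colon, tag
    ]


command = docker_run_command("python", "3.6.3-alpine3.6", "my-python")
print(shlex.join(command))
# docker run -it --rm --name my-python python:3.6.3-alpine3.6
```

Keeping the repository and tag as separate arguments makes it obvious that python:3.6.3-alpine3.6 is really two pieces of information joined by a colon.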

Note: For a full list of options, please see the Docker documentation.

Next, let’s have a look at the output that was produced as a result of the above command:

When Docker says Unable to find image, it’s telling us that the image is not on the machine yet. The subsequent 5 lines represent the downloading of the 5 layers which make up the python:3.6.3-alpine3.6 image.

Note that if we re-run the same command, we get dropped directly into the Python interpreter because we now have a copy of the image locally.
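The difference between the first and second run can be pictured with a toy model. This is only a sketch of the idea, not how Docker Engine is actually implemented: the run_container function, the registry dictionary and the layer names are all invented for the example.

```python
def run_container(image, local_images, registry):
    """Return the steps taken to start a container.

    Layers are only downloaded when the image is not available locally.
    """
    steps = []
    if image not in local_images:
        steps.append("image not found locally")
        for layer in registry[image]:
            steps.append(f"download {layer}")
        local_images.add(image)
    steps.append(f"start container from {image}")
    return steps


IMAGE = "python:3.6.3-alpine3.6"
# Placeholder layer names; real layers are identified by content digests.
registry = {IMAGE: ["layer-1", "layer-2", "layer-3", "layer-4", "layer-5"]}
local_images = set()

first_run = run_container(IMAGE, local_images, registry)
second_run = run_container(IMAGE, local_images, registry)

print(len(first_run))  # 7: one "not found" step, five downloads, one start
print(second_run)      # ['start container from python:3.6.3-alpine3.6']
```

The second call skips straight to starting the container, which is why re-running the command feels so much faster.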

Wait, what’s this about layers?

Docker’s use of layers is very well documented on their website, as well as by countless bloggers, so I’ll try to make this section short, but sweet.

Dockerfiles are used to tell Docker what it needs to do in order to create an image: for example, set environment variables, download a package, extract a tarball, etc. These instructions are what determine the layers that make up the resulting image.
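To make that a little more concrete, the sketch below parses a small Dockerfile and picks out the instructions that add layers. The Dockerfile is hypothetical and written only for this example, and the sketch assumes (as Docker’s best-practices documentation describes for recent versions) that only RUN, COPY and ADD add filesystem layers of their own, while other instructions just record metadata.

```python
# Hypothetical Dockerfile, written only for this example.
DOCKERFILE = """\
# Start from a small base image
FROM alpine:3.6
ENV LANG=C.UTF-8
RUN apk add --no-cache python3
COPY app.py /app/app.py
CMD ["python3", "/app/app.py"]
"""

# Assumption: only these instructions add filesystem layers of their own.
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}


def parse_instructions(dockerfile_text):
    """Return (instruction, arguments) pairs, skipping blanks and comments.

    Simplification: backslash line continuations are not handled.
    """
    pairs = []
    for line in dockerfile_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        instruction, _, arguments = line.partition(" ")
        pairs.append((instruction.upper(), arguments))
    return pairs


instructions = parse_instructions(DOCKERFILE)
new_layers = [name for name, _ in instructions if name in LAYER_INSTRUCTIONS]

print(len(instructions))  # 5
print(new_layers)         # ['RUN', 'COPY']
```

The FROM line adds no layer of its own here, but it does bring along every layer of the base image it names.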

Working backwards, to see the total size of the Python image (the sum of all layers), we can list it with the docker images command.

To see the layers which make up the image, we can use the docker history command (for example, docker history python:3.6.3-alpine3.6).

And finally, if we’re interested in seeing exactly what was done to create the final image, we can check the Dockerfile(s) themselves. For example, the python:3.6.3-alpine3.6 Dockerfile can be found here. Note, though, that as per the very first line in that file (FROM alpine:3.6), our Python image builds on top of the Alpine 3.6 image. Therefore, we’d need to look at its Dockerfile as well.

It’s worth noting that the 5 layers we see with sizes attributed to them in the last output above are the same 5 layers we saw being downloaded on the initial run of the container.
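The relationship between the layers and the total image size is simple arithmetic, sketched below. The entries and sizes are made up (round numbers in MB, chosen so the sum is easy to check) and are not real figures for the Python image. The function names are also just for illustration.

```python
# Made-up history entries as (instruction, size in MB). These are NOT real
# figures for any image; round numbers keep the arithmetic obvious.
history = [
    ("CMD", 0),
    ("RUN", 10),
    ("ENV", 0),
    ("RUN", 20),
    ("ENV", 0),
    ("RUN", 30),
    ("RUN", 40),
    ("CMD", 0),
    ("ADD", 50),
]


def sized_layers(entries):
    """Keep only the entries that contribute data to the image."""
    return [(instruction, size) for instruction, size in entries if size > 0]


def total_size(entries):
    """The image size is the sum of the sizes of its layers."""
    return sum(size for _, size in entries)


layers = sized_layers(history)
print(len(layers))          # 5
print(total_size(history))  # 150
```

Entries with a size of zero only record metadata, so they add nothing to the total and nothing needs to be downloaded for them.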

On a side note, as per the Docker documentation – The missing lines in the docker history output indicate that those layers were built on another system and are not available locally. This can be ignored.

More on layers & Dockerfiles

For more, very useful information on layers and to get the most out of the image cache, have a read of these articles:

Summary

Wow, this post went a lot deeper than I had originally anticipated! Let’s quickly run through what was covered.

  • Containers are self-contained units
  • Each container should do one job, and one job only
  • Images are made up of layers
  • All layers of an image are read only. When a container is started, a readable and writable layer is added on top, which is referred to as the container layer
  • The number of layers and what each of them does is determined by the contents of the Dockerfile that is used to create them
  • Because all layers (bar the top one) are read only, they can be shared between containers, resulting in great efficiencies
  • Dockerfiles should be written in a way which optimises the cache
  • The --no-cache flag should be used with docker build when fresh data is required
  • Docker Engine is used to turn Dockerfile instructions into images. It is also used to run containers
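The point about shared read-only layers with a writable layer on top can be mimicked with Python’s ChainMap. This is only an analogy under simplifying assumptions, not how Docker’s storage drivers work: the file names, contents and the start_container helper are invented for the example.

```python
from collections import ChainMap
from types import MappingProxyType

# Read-only image layers. File names and contents are made up.
base_layer = MappingProxyType({"/etc/os-release": "alpine 3.6"})
python_layer = MappingProxyType({"/usr/local/bin/python": "python 3.6.3"})


def start_container():
    """Stack a fresh writable layer on top of the shared read-only layers."""
    writable_layer = {}
    # ChainMap looks keys up left to right and sends all writes to the first map.
    return ChainMap(writable_layer, python_layer, base_layer)


container_a = start_container()
container_b = start_container()

# Writing in one container only touches that container's top layer.
container_a["/tmp/notes.txt"] = "only in container A"

print("/tmp/notes.txt" in container_a)  # True
print("/tmp/notes.txt" in container_b)  # False
print(container_b["/etc/os-release"])   # alpine 3.6
print(container_a.maps[1] is container_b.maps[1])  # True: the layer is shared
```

Both containers read from the very same lower layers, so starting a second container costs one empty dictionary rather than a second copy of the image.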

Update

Part 2 of this series has been published.

As always, if you have any questions or have a topic that you would like me to discuss, please feel free to post a comment at the bottom of this blog entry, e-mail me at will@oznetnerd.com, or drop me a message on Twitter (@OzNetNerd).

Note: This website is my personal blog. The opinions expressed in this blog are my own and not those of my employer.
