Containers from first principles

Computing is three things: calculation, movement, and storage. The rest is commentary.

What are containers? I was once told they were “just” processes. It took me a long time to get beyond that “just” to really understand them. Processes sit in the middle of a set of abstractions in computer science. Containers are built on that abstraction. What I’d like to do here is line up the set of abstractions that support containers, starting from the first principles of computer science.

Computation is simple math: addition and the operations built from it, like subtraction and multiplication, and simple binary tricks like left shift, which is effectively a form of multiplication.
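As a rough illustration of how the other operations reduce to addition and bit tricks, here is a minimal Python sketch. The function names and the 8-bit width are my own choices for the example, not anything a particular CPU mandates.

```python
def shift_equals_multiply(x: int, n: int) -> bool:
    """Shifting left by n bits gives the same result as multiplying by 2**n."""
    return (x << n) == x * (2 ** n)


def subtract_via_addition(a: int, b: int, bits: int = 8) -> int:
    """Compute a - b on fixed-width values using a bit flip and addition.

    Flipping every bit of b and adding 1 yields its two's complement,
    which behaves like -b once the result is truncated to `bits` bits.
    """
    mask = (1 << bits) - 1
    negated_b = ((~b) & mask) + 1
    return (a + negated_b) & mask


def multiply_via_addition(a: int, b: int) -> int:
    """Multiply a by a non-negative b by adding a to a running total b times."""
    total = 0
    for _ in range(b):
        total += a
    return total
```

Real hardware uses much faster circuits for multiplication, but the principle that it can be built from addition is the same.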

A CPU takes a value out of memory, performs math on it, and stores it back in memory. Sometimes that math requires two values from memory. This process is repeated endlessly as long as your computer is on.
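That cycle can be sketched as a toy machine. This is an illustration only: the instruction format (operation, two source addresses, one destination address) is something I made up for the example, and a real CPU works with registers and binary encodings rather than Python tuples.

```python
def run(program, memory):
    """Execute each instruction: load two values, do the math, store the result."""
    for op, src_a, src_b, dest in program:
        x = memory[src_a]          # take a value out of memory
        y = memory[src_b]          # sometimes the math needs a second value
        if op == "add":
            result = x + y
        elif op == "mul":
            result = x * y
        else:
            raise ValueError(f"unknown operation: {op}")
        memory[dest] = result      # the address selects where the value is stored
    return memory


# Hypothetical program: memory[2] = memory[0] + memory[1], then memory[3] = memory[2] squared.
example_program = [("add", 0, 1, 2), ("mul", 2, 2, 3)]
example_memory = run(example_program, [5, 7, 0, 0])
```

Note that each instruction also shows movement in miniature: one value (the address) picks the location, and another value is what gets transferred.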

Storage is the ability to set a value somewhere and come back later to see that it has the same value. If maintaining that value requires electricity, we call it volatile memory. If it can survive a power outage, we call it persistent storage.

The movement of information from one location to another involves a change of voltage across wires. Usually, one value is used to select the destination, and another value is transferred.

That is it. Those are the basics in a nutshell. All other abstractions in computer science are built from these three pieces.

One little quibble: there is a huge bit I am skipping over, namely interactions with the outside world: input from sensors, and various parts of the output story as well. I’ll just acknowledge those now, but I’m not going to go into them in too much depth.

If computation could only transform existing data, it would be useless. Devices that interact with the outside world make computation capable of reacting to and inducing change in the real world. Want to draw a picture on the screen? You need to store the right values in the right places so the graphics hardware can tell the monitor what value to set on a pixel on the screen. Want to tell the computer what to do? Press a button on a keyboard that changes a voltage in a circuit. As soon as possible, these changes become just more computation.

When computers were just getting started, back in the 1940s or so, there was very little abstraction. The output from a computer was produced by technology not much different from a typewriter. Instead of human fingers, a solenoid can depress a key. Want to type an ‘A’ character? Send a pulse to the right solenoid. Input came from switches, not much different from sending the wakeup message to your light bulb.

In the intervening years, the complexity of how we generate that output has exploded. The keyboard that I use to type this article contains within it a full computer that talks to the outside world via the USB protocol. The monitor I read from as I type contains a different computer that talks HDMI. In between is a laptop capable of those protocols, TCP/IP, Bluetooth, 802.11 and more. These systems are possible because the protocols, the pre-agreed meaning of a sequence of signals, are well understood and implemented by the manufacturers. The basic ideas of compute, movement, and storage are used at all layers of implementing these protocols.

Early computers were given one task at a time. Most of those tasks were mathematical in nature, such as computing tables of logarithms or settings for cannons to fire at targets at different ranges and elevations. These tasks, which were both time consuming and error prone when performed by humans, were sufficient to keep these computers occupied. A small amount of time was required to load in a new program, or to read out the results of the old one. As the value of the computer time grew, additional mechanisms allowed the computer operators to queue up batches of jobs, so that the machine could immediately start the next once the previous one finished.

When a human manually loads and unloads a computer, there is not much call for a naming structure for resources. However, once the loading and storage of resources is automated, a second process needs a way to find a resource produced by a previous one. While there are many ways this has been implemented, one of the most common and easiest to understand is a directed acyclic graph (DAG) with a single root node. All resources live as nodes within this graph. We call these nodes directories, unless they are end nodes, in which case we call them files.

To find a resource, you start at the root and navigate from node to node until you find the resource desired. You recognize this as the filesystem structure of your computer. We can produce a name for a resource by building a path from the root of the tree to the resource itself. For example, you might find an executable file at the node /usr/bin/xz. This path traverses from the root node (/) through the usr node to the bin node, and finally identifies the xz end node.
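The walk from the root to a named resource can be sketched with nested dictionaries standing in for directories. This is a toy model under my own assumptions (the `root` contents and the `resolve` helper are hypothetical), not how any real filesystem stores its nodes.

```python
# Directories are dicts; end nodes (files) are any non-dict value.
root = {
    "usr": {"bin": {"xz": "<executable contents>"}},
    "etc": {},
}


def resolve(tree, path):
    """Walk from the root node to the node named by an absolute path."""
    if not path.startswith("/"):
        raise ValueError("only absolute paths are supported in this sketch")
    node = tree
    for name in path.split("/"):
        if not name:
            continue  # skip the empty pieces produced by leading or doubled slashes
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(path)
        node = node[name]
    return node
```

Calling `resolve(root, "/usr/bin/xz")` visits the root, then `usr`, then `bin`, and returns the `xz` end node.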

Further complexity emerged as computers became more powerful and we expected them to do more complex tasks. As the amount of space a computer required decreased, engineers put more compute power into that space. Where before there had been one processing unit, there might now be two, four, or more. These processors were linked together, and programmers came up with ways of splitting up mathematical tasks to use multiple processors at once.

Computers were also monitoring sensors in the outside world. When a sensor hit a threshold, it would send a signal to the computer. The computer needed to be able to pause what it was doing in order to be able to handle that sensor. We call these signals “interrupts” because they interrupt the flow of execution of a program. When a processor takes an interrupt, it stores the values it was working on, and switches to another program that contains the steps to handle the interrupt.
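The save, handle, and restore sequence can be simulated in a few lines. This is only a sketch: the single `acc` register, the step-indexed `interrupts` table, and the handler body are all invented for the illustration, and real hardware does this with dedicated circuitry and an interrupt vector table.

```python
def handle_interrupt(registers, sensor, log):
    """A handler that uses the register for its own work, clobbering its value."""
    registers["acc"] = len(sensor)
    log.append(f"handled {sensor}")


def run_with_interrupts(steps, interrupts):
    """Count up to `steps`, servicing any interrupt scheduled for a given step."""
    registers = {"acc": 0}
    log = []
    for step in range(steps):
        if step in interrupts:
            saved = dict(registers)        # store the values being worked on
            handle_interrupt(registers, interrupts[step], log)
            registers = saved              # restore them and resume the program
        registers["acc"] += 1
    return registers["acc"], log
```

Because the registers are saved and restored around the handler, the main program's count comes out the same whether or not an interrupt arrived.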

Taken together, the ability to have multiple processors and the ability for a single processor to handle an interrupt provide the basis for sharing a computer. The simplest sharing is that each person gets their own processor. But sometimes one person needs many processors, and another person only one. Being able to transfer the workloads between processors becomes important. But even if a single person is using the computers, she might want them to perform different tasks, and make sure that the results are not accidentally mixed. The ability to time-share a computer means that the resources of a computer need to be divisible. What are those resources? Essentially, the same three we started with: computation, movement, storage. If the processor is working on one task, it should not see instructions from the other tasks. If the processor is storing a new value in memory, it should not overwrite a value from another task.

The simplest sharing abstraction for a single processor is called “threading.” A thread is a series of instructions. If there are two threads running, it might be that one thread runs on one processor, and a second runs on a different one, or both might run on a single processor, with an interrupt telling the processor when it is time to switch between the threads. The two threads share all of the resources that are used for the larger abstractions. They see the same volatile memory. They share the same abstractions that we use to manage movement and storage: network devices and names, file systems and devices, and so on. The only thing that distinguishes the two threads is the current instruction each one is processing.
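A small Python sketch shows two threads seeing the same memory. The `shared` dictionary and the `worker` function are names made up for the example, and the lock is my addition: since both threads can write to the same location, something has to keep their updates from colliding.

```python
import threading

shared = {"count": 0}      # one piece of memory, visible to every thread
lock = threading.Lock()


def worker(iterations):
    """Each thread runs its own series of instructions against the shared memory."""
    for _ in range(iterations):
        with lock:         # keep the two threads from overwriting each other's update
            shared["count"] += 1


def run_two_threads(iterations=1000):
    shared["count"] = 0
    threads = [threading.Thread(target=worker, args=(iterations,)) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared["count"]
```

Both threads increment the same counter, so the final value reflects the work of both.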

If we wish to keep the two threads from overwriting each other’s values, we can put the next abstraction between them: virtual memory. When we have virtual memory, we have a subsystem that translates from the destination locations in the process to the physical locations in memory. That way, if each of our now-separated threads wants to store and retrieve a value from memory location 4822, they will each see their own values, and not the values of the other thread. When we add memory separation to threads, we have created the abstraction called a “process.”
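The translation step can be modeled with a per-process lookup table. This is a deliberately simplified sketch: the `PhysicalMemory` and `Process` classes are hypothetical, and real systems translate whole pages of addresses in hardware rather than one address at a time in a dictionary.

```python
class PhysicalMemory:
    """The one real memory that every process ultimately shares."""

    def __init__(self):
        self.cells = {}
        self.next_free = 0

    def allocate(self):
        address = self.next_free
        self.next_free += 1
        return address


class Process:
    """Owns a private table mapping its virtual addresses to physical ones."""

    def __init__(self, physical):
        self.physical = physical
        self.table = {}

    def _translate(self, virtual_address):
        if virtual_address not in self.table:
            self.table[virtual_address] = self.physical.allocate()
        return self.table[virtual_address]

    def store(self, virtual_address, value):
        self.physical.cells[self._translate(virtual_address)] = value

    def load(self, virtual_address):
        return self.physical.cells.get(self._translate(virtual_address), 0)


ram = PhysicalMemory()
first, second = Process(ram), Process(ram)
first.store(4822, "first's value")
second.store(4822, "second's value")
```

Both processes used location 4822, yet `first.load(4822)` and `second.load(4822)` each return their own value, because the two tables point at different physical cells.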

If the only abstraction we add to the thread is virtual memory, then the two processes still share all of the other resources in the system. This means that if one process opens and writes to a file named /ready, the other process can then open and read that file. If we wish to prevent that, we need a permission system in place that says what a given process can or cannot do with outside resources. This access control system is used for all resources in the DAG/Filesystem.
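One common form such a permission system takes is the classic Unix owner/group/other mode bits. The sketch below assumes that model (the article does not commit to a specific one), ignores the superuser and access control lists, and uses a hypothetical `may_read` helper with made-up user and group numbers.

```python
import stat


def may_read(mode, file_uid, file_gid, uid, gids):
    """Decide read access the way traditional Unix mode bits do.

    Exactly one class applies: owner if the user ids match, otherwise
    group if the file's group is among the process's groups, otherwise other.
    """
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# A file with mode 0o640: owner may read and write, group may read, others get nothing.
owner_can_read = may_read(0o640, file_uid=1000, file_gid=100, uid=1000, gids={100})
stranger_can_read = may_read(0o640, file_uid=1000, file_gid=100, uid=3000, gids={200})
```

With a check like this in front of /ready, the second process can read the file only if the mode bits grant access to the user it runs as.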

What if we want to make sure that two processes cannot interact even inside the filesystem? We give them two distinct filesystems. Just as virtual memory puts a layer of abstraction between the process and the physical memory, we put an abstraction between a process and its filesystem. There might be other resources that are sharable and not put into the filesystem, such as the identification number system for processes, called process IDs, or PIDs. The network devices have separate means of identification as well. We call each of these things “namespaces.” When we give two processes distinct namespaces, and ensure that they cannot interact with processes that have a different set of namespaces, we call these processes “containers.”
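On Linux, the kernel exposes the namespaces a process belongs to as symbolic links under /proc/self/ns, so you can inspect them from an ordinary program. The sketch below assumes a Linux system with /proc mounted; the `current_namespaces` helper is my own name, and it returns an empty dictionary anywhere that directory is missing.

```python
import os


def current_namespaces(proc_ns_dir="/proc/self/ns"):
    """Map each namespace type (pid, net, mnt, ...) to its kernel identifier."""
    namespaces = {}
    try:
        names = os.listdir(proc_ns_dir)
    except OSError:
        return namespaces  # not Linux, or /proc is unavailable
    for name in sorted(names):
        try:
            namespaces[name] = os.readlink(os.path.join(proc_ns_dir, name))
        except OSError:
            continue  # skip entries we are not permitted to read
    return namespaces


if __name__ == "__main__":
    for kind, identifier in current_namespaces().items():
        print(kind, identifier)
```

Two processes that print the same identifier for a given type share that namespace. A process inside a container typically shows different identifiers from a process on the host.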

This whole article was to build up to the understanding of containers from the simplest possible abstractions. From computation, movement, and storage, we can see how greater degrees of complexity and abstraction grow up until we have the container abstraction.

Adam Young's Web Log

