A Non-authoritative history of Preemptive Multitasking in the personal computing world.

Back when machines only had one or two CPUs (still the case for embedded devices), the OS kernel was responsible for making sure that the machine could process more than one instruction “path” at a time. I started coding back on the Commodore 64, and there it was easy to lock up the machine: just run a program that does nothing. I’d have to look back at the Old Programmer’s Guide, but I am pretty sure that a program had to voluntarily give up the CPU if you wanted any form of multitasking.

The alternative is called “preemptive multitasking” where the hardware provides a mechanism that can call a controller function to switch tasks. The task running on the CPU is paused, the state is saved, and the controller function decides what to do next.
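The pause, save, and decide sequence can be sketched as a toy simulation. This is not how a real kernel is written: there is no hardware timer here, and Task, run_round_robin, and the quantum parameter are names I made up for illustration. Each pass through the loop stands in for one timer interrupt handing control to the scheduler.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical task: 'remaining' is the work left, 'progress' is the saved
// state that lets the task resume where it was interrupted.
struct Task {
    std::string name;
    int remaining;
    int progress;
};

// Simulated preemptive round-robin. After at most 'quantum' units of work a
// pretend timer interrupt fires, the running task is paused with its state
// kept in the Task, and the scheduler picks the next task in the queue.
// Returns the order in which tasks were given the CPU.
std::vector<std::string> run_round_robin(std::deque<Task> ready, int quantum) {
    assert(quantum > 0);
    std::vector<std::string> trace;
    while (!ready.empty()) {
        Task current = ready.front();  // restore the next task
        ready.pop_front();
        trace.push_back(current.name);

        int slice = current.remaining < quantum ? current.remaining : quantum;
        current.progress += slice;     // the task runs until the "interrupt"
        current.remaining -= slice;

        if (current.remaining > 0) {
            ready.push_back(current);  // preempted: state saved, requeued
        }
    }
    return trace;
}
```

Note that a task never has to cooperate: the loop takes the CPU away whether the task is finished or not, which is the whole difference from the voluntary scheme.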

Debugging Techniques

Always look at the top error message in the spew that comes out from the compiler. This is usually where the problem is triggered. Junk after that is often spurious, especially if the first problem is a syntax error.

Use a revision control system like Git or Mercurial, and make a checkpoint of your code whenever you have made a significant addition. This way, you don’t get into the situation where you had it working, you broke it, and you can’t get it running again. Git has a little bit of a learning curve, but it rocks. If you are comfortable posting the code where other people can see it, you can push your changes to GitHub, and then if you have a catastrophic machine failure, you don’t lose your work.

The binary search is your friend. If you know that the error is somewhere between lines 10 and 20, comment out lines 15-20 and see if you still have the problem. If so, comment out lines 13 and 14, and so on. A binary search means that you can search 1000 lines of code in 10 comparisons, and save yourself a metric shit-ton of time. A metric shit-ton is an industry term.
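The halving strategy can be written down as a function. This is a minimal sketch, assuming there is a single culprit and that the problem stays visible once the bad line is enabled. The names find_first_bad_line and fails_with_prefix are hypothetical; in real life the "callback" is you, commenting out code and recompiling.

```cpp
#include <cassert>
#include <functional>

// Find the first line in [first, last] whose inclusion triggers the failure.
// 'fails_with_prefix(n)' is a hypothetical callback: it should return true
// if the problem shows up when lines first..n are enabled and the rest are
// commented out. 'probes', if given, counts how many checks were needed.
int find_first_bad_line(int first, int last,
                        const std::function<bool(int)>& fails_with_prefix,
                        int* probes = nullptr) {
    assert(first <= last);
    int lo = first;
    int hi = last;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (probes) ++*probes;
        if (fails_with_prefix(mid)) {
            hi = mid;      // culprit is at mid or earlier
        } else {
            lo = mid + 1;  // culprit is after mid
        }
    }
    return lo;
}
```

Each probe halves the suspect range, which is where the 10 comparisons for 1000 lines comes from.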

Step through your code in a debugger, line by line, to make sure you know what it is actually doing, not what you think it is doing. Very illuminating. If you don’t have access to a debugger, make liberal use of trace statements. In gcc I often use:

#define TRACE() printf("%s:%s:%d\n", __FILE__, __FUNCTION__, __LINE__)

There is something comparable in most languages.

Always work from success. When I start a new C program, I start from

#include <stdio.h>
int main(void) { printf("Hello, World.\n"); return 0; }

And compile and run between each minor change.

Don’t be afraid to ask for help. A second set of eyes will often quickly see what you missed. Even more so, when you explain the code to someone else, you often solve the problem yourself. This is known as “verbal debugging.”

Entangled Dependencies

Our LDAP Client has a wrapper for creating a persistent search.  In order to execute a persistent search, a polling mechanism has to call the client code with notifications of changes from the LDAP server.  This means a threading library.  The call looks something like this:


Here the call to GetApp is completely superfluous to the logic: we should not care where the ThreadPool comes from. Instead, the LDAP client should either take a thread pool pointer in its constructor, or a thread pool should be passed as a parameter to CreatePersistentSearch. I prefer to resolve all dependencies like this at object construction time.

GetApp throws an exception if it is called prior to a call to AppImpl::Init. Init reads an XML based config file. So our dependencies now include both the config file and the XML parser on top of the App object. The LogObject and LogName are also initialized in the App.

What we are seeing here is how dependencies get entangled implicitly.  In order to reuse the LDAP client, or to create a Unit Test for it, I have to initialize an App object, which is far beyond the scope of LDAP.

Continuing on looking at the ThreadPool, I see that the AppImpl actually creates the ThreadPoolWin32 object by passing the config file view to it, and the config file view is used to fetch values for the state of the thread pool value by value.  Example:

_config->Get("TaskMax", &_maxWorkerThreads);
_minIOThreads = 1;
_config->Get("IoMin", &_minIOThreads);
_maxIOThreads = 2 * _maxWorkerThreads + 1;
_config->Get("IoMax", &_maxIOThreads);

The binding of the values to the object should be external to the constructor, as it is just one initialization scheme. What if we want to reuse this object and read the values in from a database, or from a different config file format?
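One way to pull the binding out is to have the pool take a plain settings struct and let a separate function fill it in. This is only a sketch: ThreadPoolSettings, KeyValueSource, and settings_from are names I am inventing here, the std::map stands in for whatever source you have (it is not the real config view API), and the default for the worker thread count is passed in because the original code does not show one.

```cpp
#include <cassert>
#include <map>
#include <string>

// Plain settings for the thread pool. The pool's constructor would take
// this struct and never learn where the numbers came from.
struct ThreadPoolSettings {
    int maxWorkerThreads;
    int minIOThreads;
    int maxIOThreads;
};

// Hypothetical stand-in for any key/value source: config file view,
// database row, or test fixture.
using KeyValueSource = std::map<std::string, int>;

// One initialization scheme among many: start from defaults, then let the
// source override whatever keys it provides.
ThreadPoolSettings settings_from(const KeyValueSource& source,
                                 int defaultMaxWorkers) {
    assert(defaultMaxWorkers > 0);
    auto get = [&source](const char* key, int fallback) {
        auto it = source.find(key);
        return it == source.end() ? fallback : it->second;
    };
    ThreadPoolSettings s;
    s.maxWorkerThreads = get("TaskMax", defaultMaxWorkers);
    s.minIOThreads = get("IoMin", 1);
    s.maxIOThreads = get("IoMax", 2 * s.maxWorkerThreads + 1);
    return s;
}
```

A database-backed loader would be a second function returning the same struct, and the thread pool would not change at all.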

The LDAP Client should have a field that is the ThreadPool base class.  For the Unit test, we could mock this out. Of course, maybe the persistent search itself should be its own class.
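Roughly what I have in mind, as a sketch. The real ThreadPool interface is not shown in this post, so the Submit method, the MockThreadPool class, and the callback signature below are all assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical abstract base; the real interface may look different.
class ThreadPool {
public:
    virtual ~ThreadPool() = default;
    virtual void Submit(std::function<void()> task) = 0;
};

// Test double: records tasks instead of running them on threads.
class MockThreadPool : public ThreadPool {
public:
    void Submit(std::function<void()> task) override {
        submitted.push_back(std::move(task));
    }
    std::vector<std::function<void()>> submitted;
};

// The client receives its pool at construction and never calls GetApp().
class LdapClient {
public:
    explicit LdapClient(ThreadPool& pool) : _pool(pool) {}

    // Placeholder body: a real persistent search would poll the server and
    // invoke onChange when the server reports a change.
    void CreatePersistentSearch(std::function<void()> onChange) {
        assert(onChange != nullptr);
        _pool.Submit(std::move(onChange));
    }

private:
    ThreadPool& _pool;
};
```

With this shape, a unit test builds a MockThreadPool, hands it to LdapClient, and never touches the App object, the config file, or the XML parser.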

Parsing and Inversion of Control

Parsers are essential to the interface layers of an application. A message based system requires a socket listener that will convert the contents of the stream from the network into a usable collection of objects. In an inversion of control container, these request scoped objects should be registered instances in the container in order to build/fetch more complex objects, potentially of longer lived scope. Parsed request scope objects should be of classes that enforce invariants, but often will be simple strings in language specific form; char * or std::string being the most common for C++.

Take the example of a userid passed in as part of the request. There really is no format that this string conforms to other than, perhaps, some constraints of the system. To create an immutable UserID class may really just force casts to char * or std::string when the UserID is to be used.

There are many objects, specifically request scoped objects, that need to be created based on the runtime values provided by the system. In a pure Inversion of Control (IofC) environment, the parser would create each, and then add them to the container. This may require a large amount of coding in a manner specific to the IofC container. One goal of IofC is to minimize the amount of work that is specific to the container.

Many APIs handle this by creating a map. In the Java HTTP Servlet API, the request object has two string to string maps, one for parameters and one for cookies. This may be a good middle ground between two APIs. A better approach is what Struts does using the Apache Jakarta PropertyUtils API. The Action field of a form specifies a URL that in turn starts with a Java object of type Action. The action is configured in an XML file (Ugh!) that specifies the subclass of ActionForm. The request parameters are bound to the properties of ActionForm using the Java Bean coding convention. Thus a parameter user.homePhone.areaCode=415 would force the reflection equivalent of:


One problem with this implementation is that any exception thrown at any point of this binding would force a halt to the entire setting of parameters. The net effect is lots of Struts specific coding. Personally, I am not a big fan of getter/setter dependency injection, as it tends to violate the precept that classes enforce invariants. It doesn’t have to; it is just that code written that way tends to be of the nature where an object can have an invalid state until the setter is called. However, the setter approach does work well with builders, if the set method can be called multiple times in the role of “BuildPart.”

When discussing the marshaling layer we can often think in terms of combining interpreters with builders. The interpreter is responsible for converting from the marshaled format to object format for simple data objects. Each of these objects is added to a larger complex object. In terms of the Builder pattern, the interpreter plays the role of the Director. Only once the entirety of the message is parsed will the director call GetResult. Exceptions thrown during the parsing of the message are collected up so that they can be reported in bulk.

One common objection to this style of programming is the cost of multiple exceptions thrown during the parsing stage. Reducing the number of exceptions thrown should be a performance tune performed after the system is operational. The general structure is to split the parse of a data object into a three stage process. First, create a builder that takes the object in string form. Second, ask the builder if the marshaled form is valid. Third, fetch the object from the builder. If stage two returns false, add an error to the collection of errors and short circuit the building process. Note the higher degree of complexity over performing the parsing in the constructor. The constructor has to validate the format it is given from the builder, or has to know about the builder object, creating a bidirectional dependency. The logic in calling the builder and the return code has to be coded by hand or has to fit into some larger framework.
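The three stages can be sketched for a single small data object. Everything here is illustrative: AreaCode, AreaCodeBuilder, and parse_area_code are hypothetical names, and the rule that an area code is exactly three digits is an assumption I am making for the example. The friend declaration is the "has to know about the builder object" option, so the bidirectional dependency is visible in the code.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Hypothetical value class: it can only be obtained in a valid state,
// because only the builder may construct it.
class AreaCode {
public:
    const std::string& digits() const { return _digits; }
private:
    friend class AreaCodeBuilder;
    explicit AreaCode(std::string digits) : _digits(std::move(digits)) {}
    std::string _digits;
};

class AreaCodeBuilder {
public:
    // Stage one: hold the marshaled (string) form.
    explicit AreaCodeBuilder(std::string raw) : _raw(std::move(raw)) {}

    // Stage two: report validity without throwing.
    bool IsValid() const {
        if (_raw.size() != 3) return false;
        for (unsigned char c : _raw) {
            if (!std::isdigit(c)) return false;
        }
        return true;
    }

    // Stage three: hand over the object. Only call after IsValid().
    AreaCode GetResult() const {
        assert(IsValid());
        return AreaCode(_raw);
    }
private:
    std::string _raw;
};

// The hand-coded calling logic: on failure, collect an error and short
// circuit instead of throwing.
bool parse_area_code(const std::string& raw, std::vector<std::string>& errors,
                     std::vector<AreaCode>& out) {
    AreaCodeBuilder builder(raw);
    if (!builder.IsValid()) {
        errors.push_back("bad area code: " + raw);
        return false;
    }
    out.push_back(builder.GetResult());
    return true;
}
```

The caller keeps going after a failure, so by the end of the message the errors vector holds everything that went wrong, ready to be reported in bulk.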

The degenerate case also seems to be the most prevalent: confirm the format of the data objects, but then transfer them around as strings. The problem with this approach is that each layer of the application will be written not trusting the other layers, and the objects will be re-parsed, potentially following different rules. From the increase in code size, complexity, and potential for error, we can infer that we should avoid this approach.

Since the builder can be created with minimal dependencies, and most of these can be defined statically, it should be a request scoped component defined in the IofC container. We have to decide at what point in the parsing process we switch from a general purpose parser to a specific class dedicated to the data coming off the socket. Ideally, the message format provides a way to specify the message type and version early in the stream. This can be the key used to fetch the specific parser. The stream coming off the socket should conform to one of the standard types of the language. In C++ this means something out of std::. The parser can depend on the builder. After the parsing process, one last call into the IofC container kicks off the business processing, that which takes the parsed message and does something with it. At this point we can use something as simple as a functor: Everything is in the IofC layer.
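Using the message type and version as a lookup key might look something like the following. This is a sketch under heavy assumptions: the wire format (a header of "TYPE VERSION" followed by the body), the ParserRegistry class, and the choice to return a plain string are all invented for the example, and a real IofC container would own the registry rather than the caller.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical key: message type plus version, read early in the stream.
using MessageKey = std::pair<std::string, int>;

// A specific parser consumes the rest of a std:: stream.
using Parser = std::function<std::string(std::istream&)>;

class ParserRegistry {
public:
    void Register(const MessageKey& key, Parser parser) {
        assert(parser != nullptr);
        _parsers[key] = std::move(parser);
    }

    // Read the header with general purpose code, then hand the rest of the
    // stream to the specific parser. Returns false for unknown messages.
    bool Dispatch(std::istream& in, std::string& result) const {
        MessageKey key;
        if (!(in >> key.first >> key.second)) return false;
        auto it = _parsers.find(key);
        if (it == _parsers.end()) return false;
        result = it->second(in);
        return true;
    }

private:
    std::map<MessageKey, Parser> _parsers;
};
```

The general purpose code only knows how to read the key. Everything after that belongs to the parser that was registered for it.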

There is a lot here subject to interpretation. The stream may or may not be encoded according to a public protocol, a language specific feature, XML, UUEncoded, encrypted, compressed, and so on. A pipeline can handle the transformation. Converting from stream to objects may use this pipeline, or be a portion of the pipeline, converting from a character or binary based stream to a stream of objects. The issue here is always how to get the objects available to the container without too much container specific logic. If the value comes in key value pairs, the key needs to provide enough information to determine which instance of which class to create.

Since your inversion of control layer should be minimally invasive, you don’t want to have each call in the building of the input validation require knowledge of this layer. Something about the network protocol is going to be unique enough to pull in the processor for your code. This is going to be the URL, the message type, or something unique to your protocol. This should be enough to select the interpreter and the builder objects, connect them, and let the parser become the pump from there on out. The builder should be registered in the Inversion of Control container. When something needs the end product of the build process, it should get this from the container, which will delegate to the builder. This should strike the right balance between type safety and flexibility.

Computers and Me

The defining question of geek culture before the .com boom was, ‘What computer did you program on first?’ Before Microsoft became ubiquitous, there was a period when many different systems, all incompatible, became available within the price range of the average family. Brian Graber worked on his Dad’s IBM PC, Cristin Herlihy had an Apple II, the O’Neils had an Atari computer (they had the game console, too). My cousins from New York lent us a Commodore VIC 20 with a two volume set on teach yourself BASIC. My cousin Christopher came to visit for a week, and ended up staying for the summer. I read out loud out of the books and he typed. By the end of the summer, we were able to program our own text based adventure game.

Even more impressive, we could perform such amazing feats as turning the background and foreground color to black, making text entry difficult. This minor bit of wizardry was performed by using the arcane command poke. The format was poke memory address, value. It allowed you to program at an incredibly simple level. Note I said simple, not easy. You could set any memory address on the machine to any value you wanted. Once you knew where the memory location was that controlled the text color, or the background, you could produce magic.

The VIC 20 returned to New York at the end of the Summer, but the Holidays brought along a Commodore 64 and a subscription to Compute!’s Gazette. A month or two later, I talked my mom into subscribing for the Disk that accompanied the magazine. Now, you may accuse me of being lazy, but most of the programs they released were nothing more than a long string of poke instructions to be typed in. They even released a checksum program, to make sure that the numbers added up to the expected values, but I never got the Canyon Crawler program to run correctly. The Gazette, in addition to a word processing program and a slew of video games, published two tools that were very instructive. One was a font editor, and the other a sprite graphics editor. With these simple tools, you could make video games that were arcade quality (1985 arcade quality, that is). My first video game was a spy game, where you had to parachute down between two roving searchlights. If either touched you, you fell to your doom. Programming this required using the other most arcane of instructions, peek. Peek told you the value of a memory location. Armed with the peek command and the address of the joystick port, I could move the parachute left and right, while it drifted ever downward.

In retrospect I should have stayed with the Parachute idea. On the next screen you might have had to parachute onto a moving boat, or a bouncing trampoline, or perhaps avoid a flock of geese. However, I wanted to make a game that scrolled. I had a vague idea that maybe I could reset the CPU to look at any memory location for its character map, and coupled with a really cool font set you could wander through a maze of buildings looking to steal secret codes. What I didn’t know was that this type of machine was based on memory mapped IO. Certain fixed memory locations were actually just links for other processors, or input and output devices. There was no way to change where the CPU looked for the character map, as it was the result of the underlying electronics.

I was frustrated by the limitations of BASIC. I wanted to know what all those peeks and pokes were doing. Once I started reading about assembly language programming, I realized that the coders at Compute were distributing, not source code as they would for a program written in BASIC, but a sort of executable. The C64 only knew how to load and run BASIC programs. These long listings of pokes were actually copying instructions into memory. Not just color codes for the background or bitmaps for sprites, but instructions like, ‘Load the value from this memory location into the X register.’ I had no idea what a register was, but still, this was pretty cool. The only problem was that I never found an Assembler for the Commodore, so my hacking was limited to converting instructions into numeric codes, and loading them in by hand: my learning was mostly theoretical.

I mentioned Cristin Herlihy had an Apple II. This became significant during my senior year of high school when I took a structured programming course in Pascal. I spent long hours over at the Herlihys’ debugging programs to do simple text based operations. The cool thing about Pascal over both BASIC and Assembly was, get this, you didn’t need line numbers. GOTO, the standby command for BASIC programming, was forbidden by our teacher. I had learned subroutines and looping before, but you got to call everything by a friendly name like ‘do_something’ as opposed to calling with the cryptic GOSUB 65000. Also, we had floating point numbers. But where were the graphics? I never learned that, as it wasn’t on the AP exam. Programming became more practical, but more removed from the reality of the underlying hardware. It must have been a good course, though: I managed to get a 5 out of 5 on the Advanced Placement test.

After toying with the idea of going into music (I was a fairly serious Jazz Saxophone player in high school), I ended up going to the opposite extreme: The United States Military Academy at West Point, or, as I tend to call it, Uncle Sam’s school for delinquent children. The 5 on the AP test got me out of the first two levels of Computer Science, and into Data Structures and Algorithms. Now instead of working with floats and strings, we were working with linked lists, arrays, stacks, and heaps. We learned how to sort and search, but more importantly, we learned how to analyze algorithms. I took the standard set of courses: Language Theory, numerical analysis, discrete mathematics, operating systems, software engineering, and so forth. Having opted out of the first two classes opened up more electives at the latter part of the program. I got to take compilers, graphics, artificial intelligence, and databases. I was well armed to enter the workforce as a programmer.

Except that I entered the Army as an Infantry Officer. For the next several years my interaction with a computer was primarily via Calendar Creator and Microsoft Office. One time, I needed to copy a file from one computer to another, and it was too big to fit on a single floppy, so I wrote a short Pascal program that cut the file in half, byte by byte, and another that put it back together. I eventually got an America Online Account, as I hadn’t had email since graduation. Information systems at the lowest levels of the Army were still based on the time honored tradition of filling out a form and putting it in the inbox. The primitive systems worked, to a point. I learned what it really meant to be an end user. Using the applications at our disposal, we built better systems, planning training and tracking soldiers’ administrative needs in home built systems. We did unspeakable things with Excel spreadsheets and PowerPoint presentations. Division Headquarters had a scanner, and I showed our operations officer how we could scan in the maps and draw operational graphics on them electronically.

My first job out of the Army was at Walker Interactive Systems, a company that built accounting software that ran on IBM mainframes. The group I worked in built applications that ran on Windows machines that ran the transactions on the mainframe. My team supported the infrastructure that made communication between the two worlds possible. The mainframe stored its letters using a mapping called EBCDIC, the Windows machines used ASCII. Even more confusing was the way the two systems stored numbers. Back on the Commodore 64, I only had to worry about a single byte of data. But systems had grown so that a number was stored across four bytes. For historical reasons, the Intel processors under Windows stored the least significant part of the number in the first byte, and the most significant part of the number in the last byte. IBM’s mainframes stored it the other way around. To avoid having to deal with these problems in the buffers we were sending, the architects had decided that all numbers would be sent in their string representations. While we might send a positive or negative sign, we never sent around decimal points. A certain field was just defined as 10 digits long, with the decimal point assumed to be between the eighth and ninth digits. Dates had four different formats: Julian, Year Month Day, Day Month Year, and that barbaric American format Month Day Year. The system was designed so that we would package up a large amount of data, write it into a buffer, and send it across the network to the mainframe. The Mainframe would plug and chug and send back the data in another buffer. This type of transaction mapped to another technology that was just making inroads: the Hypertext Transfer Protocol, the underlying workhorse of the World Wide Web.

One thing about developing code is that sometimes you are so busy you don’t know how you are going to get things done, while at other times you are just waiting for someone else to finish, or just waiting. During a long period of downtime, I got hooked on web comics. One of them, Userfriendly.org, touted the virtues of Open Source software and the Operating System built around the Linux Kernel. Intrigued, I found an old Pentium 100 and purchased a Copy of Red Hat 6. While the knowing out there might scoff at me paying for free software, it proved to be a great investment. This was my entry into the world of Free software. When I had booted that Commodore 64, instructions that had been burnt into read only memory would execute, making it impossible to tell the computer to do other things. With Linux, I had access to this same type of code, but now with the ability to look through it and change it. I learned how to compile my own Linux Kernel. Because the Ethernet Card that came with the machine was not supported by Red Hat, I had to get code from the source and compile it in myself.

In this case, the source was a guy named Don Becker, who worked for NASA. His project was making a Supercomputer by linking together lots of little computers. In a nod to his Nordic ancestry, he named it after one of the heroes of Germanic legend: Beowulf. Because his Beowulf was built more like Frankenstein’s monster, sewn together from many different pieces of available hardware, he needed to be able to use all the various types of hardware he found. The Linux Kernel allowed him that flexibility. The price for the use of Linux was that, if he distributed the executable, he had to distribute the source code as well. Don became the Guru of Ethernet device drivers for Linux. This is what is known, in business speak, as Win-Win. Linux and its community won because it got good drivers. Don won because he was able to build his supercomputers and spin them off into a company that specialized in Beowulf clusters. More on that in a bit.

Just before leaving Walker, I looked into rewriting the Client side of our code using a language that was really getting popular: Java. Java was yet another step away from the hardware. As a language, it was not designed to be compiled to the instruction set executed by the CPU of the machine it ran on. Instead, it was converted to a very simple set of instructions that were interpreted at runtime into the CPU’s instruction set. This final step is what made Java so portable. Now your code, once compiled, could run on any machine that had a Java Virtual Machine installed. There were limitations, of course. It ran slower than code compiled for a specific CPU. The graphical user interface layer, called Swing, was especially slow. So it never really caught on for client applications (although right now I am using Open Office Writer, a Swing based word processor to type this). It was, however, a perfect fit for business logic processing, especially web site development.

So I, along with the rest of the San Francisco Bay Area, learned to develop websites. The first was Tavolo, the second incarnation of what was originally Digital Chef. Tavolo was a specialty food and cookware website developed by the Culinary Institute of America, or as they like to be called, the CIA. We wrote their new website using a product called Dynamo, from the Art Technology Group. Dynamo was an application server. It was a program designed to run other programs, and many of them at once. Dynamo had components for personalizing a website based on the person who used it, and a significant amount of support for ECommerce. Many of the solutions to these problems ATG put into Dynamo were parallel to, but different from, the solutions that eventually became the standards put out by Sun for Java enterprise computing. Since the marketing people at Sun decided that Java needed a second version this became Java 2 Enterprise Edition, or J2EE. Maybe they thought it sounded better than JEE.

As these standards got better and better developed, various people started implementing them. Some were companies trying to sell their implementations. But many people who were doing Java programming released their code under various open source licenses. The most popular, the Tomcat Web Server, was developed under the auspices of the Apache organization, the same folk who made the Apache web server. JBoss (renamed from EJBoss due to Copyright Issues with Sun) was the transaction server and database wrapper. These performed the same job as Dynamo, but were free. Additional packages existed for various stages of website development, database access, document generation and more. I now had open source code for an operating system, and for all the software I needed to build Enterprise Software. As the dot-com bubble burst, I headed to a small company that needed a website built. Using this stack of open source software, we brought up the website in a few weeks, and grew it over the course of the following year. All of my follow on projects have used this mix of Java and open source software.

The secret to Java’s success is also one of its shortcomings. Java comes from a long line of programming languages that try to make it hard for the programmer to do the Wrong Thing. In particular, Java allows you to use memory without having to clean up after yourself. Once an object is no longer referenced anywhere in the system, it is eligible for garbage collection. While there are numerous other features that make Java a good language in which to work, this is the one that most contributes to productivity. The drawback is that sometimes you need to know exactly where memory comes from, how long it can be used, and when it can be reclaimed. In Java, memory is difficult, if not impossible, to access directly. Probably most telling is the fact that Java is not programmed in Java; it is programmed in C and C++, because something as critical as the Virtual Machine that Java runs on has to be fast, or all programs are slow. Where Java takes the position that programs should check for and report errors to speed development, C requires a much more dedicated quality assurance process to make sure the programs don’t have an unacceptable number of bugs. Not that you can’t write fast code in Java, and not that you can’t quickly write bug free code in C; it is just that each language makes it easier to do its own thing.

So I made the effort to break out of the very successful track I was in, take a cut in pay, and get into Linux Kernel development. In a sense, this was a return to my roots, being able to go right to the hardware. I spent quite a long while looking, when opportunity found me. A recruiter called me from Penguin Computing. Penguin is a hardware company; they sell Linux servers. Cool. About a year ago, they bought Scyld. Scyld was the company spun off from NASA’s Beowulf project, led by Don Becker. I told you there would be more later. The geek value was immense. I was hooked and convinced them to hire me.

Why was I drawn to computer science? I like patterns. I like being able to hear the chords of “Always look on the Bright Side of Life” and realizing they are the same as “I got Rhythm” just with the Chorus and verses reversed. I like trying to tell which of my nephew’s personality traits came from his mother and which came from his father. When it comes to programming, I like taking a solution, and extracting the generic part so I can extend it to solve a new problem. Design Patterns work for me. I’ve been interested in many portions of computer science, and enjoy learning the commonalities between tuples flowing through portions of a query, packets flowing through a network, and events flowing through a graphical interface.

The one topic in my course on Artificial Intelligence that really piqued my interest was neural networks. After several decades of trying to do it the hard way, scientists decided to try to build a processing model based on the brains of living organisms. Animal brains do two things really well. First, they process a huge amount of information in parallel. Second, they adapt. Traditional neural networks (funny to be calling such a young science traditional) are based on matrix algebra as a simplification of the model. One vector is the input set, multiplied times a matrix gives you an interim result, and then multiplied by a second matrix gives you an output set. The matrices represent the connections of neurons in the brain. At the start of the 1970s, scientists were convinced that Neural Networks were the big thing that was going to get us Artificial Intelligence. But traditional neural networks learn poorly and do little that can be called parallel processing. After a brief time in the sun, they were relegated to short chapters in books on AI. They are still used, but people no longer expect them to perform miracles.

If you believe that upstart Darwin, real intelligence is the result of millions (or some greater illion) years of evolution. Expecting a cheap imitation to learn to perform a difficult pattern analysis with a short amount of training is either a case of hubris or extreme optimism. If I had to guess, I would say both. Around us are a vast (albeit dwindling) variety of animals that all have wonderful examples of neural networks. We are lucky in that we have such great models to work from; we should learn from them. I would like to use a neural network model as a starting point for a processor that learns and moves like a living creature. Recent work with hardware based neural networks has performed superbly at voice recognition. The focus on the timing between the neurons, an aspect not accounted for in the simplified model, was a key differentiator. The animal brain is superb at cycles such as the motion of the legs while running. Once the basic cycle is learned, the system will be taught to adjust for rough terrain, different speeds, and quick changes of direction. If the behavior of a single muscle is analyzed, we see it has a pattern of contracting and releasing timed with the activity it is performing. The brain controls all the muscles in parallel, as well as absorbing input from the various senses. This cycle can be seen as a continuously adapting system built out of: 1) a desired process (running), 2) the state of the muscles and other organ systems, 3) a prerecorded expectation of the flow of the process, and 4) the inputs to the senses. In order for a cycle to progress, some aspect of the output must be fed back in as input. Additionally, a portion of the system must remain aloof and compare the actual end result with the desired end result, using that to tune the behavior of the system.

The best result will come from an interdisciplinary approach: the system should be engineered as a mix of software and hardware, traditional engineering techniques and genetic algorithms, using everything learned from biology, especially animal physiology. The latest advancements in materials science will be needed for making motive systems that get maximal energy for minimal weight. Currently, we can program a robot that can walk. I want to develop a robot that can learn to run.

And to run it will take great advances in Operating Systems. An animal receives and processes a vast amount of information from all its senses at the same time. Layers upon layers of transformations turn this information into action. Future events are predicted in space-time with a high degree of accuracy and an even higher degree of fault tolerance. Some of this is reflected in the way that current robotic systems work, but we have much to learn. We need to develop systems where parallelism moves from being a difficult concept to handle to the primary tool of development.


Godel, Escher, Bach is one of those books I had heard about forever.  I am not sure if I would have had the tenacity to complete it if I hadn’t already had a grounding in Computer Science.  Having had courses in Computer Theory made the book go from a treatise on computer science, to a narrative that tied together many ideas into a more cohesive whole.

Continue reading

My Ideal Technology Setup for work

“Since I’m dreaming, I’d like a pony” –Susie, in Calvin and Hobbes.

“I’m not just the President of the Hair Club for Men, I’m also a client.” –President of the Hair Club for Men

Not only do I write software, I use it. A whole bunch. I am a Linux guy, and whenever I end up in a situation where I have to work around a proprietary solution that just doesn’t make sense for what I am trying to do, it adds a point or two to my diastolic. So here is my dream setup:

Continue reading

Dependency Injection in C++

One goal of object oriented programming is to encapsulate functionality within a class.  If one class requires another in order to collaborate on a larger process, the programmer must deal with the wiring up of the two classes.  Much has been written about the correct way to do this in various object oriented languages.  The term “Inversion of Control” and the related concept of “Dependency Injection” have become part of the common language of software development, at least in the Java world, due to projects like Pico/Nano container, the Spring framework, and earlier efforts inside J2EE and ATG Dynamo.  These frameworks make use of the introspection mechanisms in Java to create instances of classes based on demand criteria.
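To make the “wiring up” concrete, here is a minimal sketch of plain constructor injection in C++. It is only an illustration of the general idea, not necessarily the approach the full post develops, and every name in it (MessageSource, FixedMessageSource, Greeter, makeGreeter) is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface: the dependency that Greeter collaborates with.
class MessageSource {
public:
    virtual ~MessageSource() = default;
    virtual std::string message() const = 0;
};

// One concrete implementation; a test could substitute a different one.
class FixedMessageSource : public MessageSource {
public:
    explicit FixedMessageSource(std::string text) : text_(std::move(text)) {}
    std::string message() const override { return text_; }

private:
    std::string text_;
};

// Greeter never constructs its collaborator. It is handed one through the
// constructor, so it depends only on the MessageSource interface.
class Greeter {
public:
    explicit Greeter(std::shared_ptr<MessageSource> source)
        : source_(std::move(source)) {
        assert(source_ != nullptr);  // a Greeter without a source is a wiring bug
    }

    std::string greet(const std::string& name) const {
        return source_->message() + ", " + name;
    }

private:
    std::shared_ptr<MessageSource> source_;
};

// The wiring lives in one place. This hand-written factory plays the role
// that a container plays in the Java frameworks mentioned above.
Greeter makeGreeter() {
    return Greeter(std::make_shared<FixedMessageSource>("Hello"));
}
```

Calling makeGreeter().greet("world") yields the greeting built from the injected source. Since C++ lacks Java-style introspection, the wiring here is written out by hand rather than discovered at run time.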

Updated: links and using the code formatter. Changed Class from Single Letter to full names to be clearer and to not conflict with HTML tags.

Continue reading

Small Scale High Performance Computing

At the top end of computing there are the supercomputers.  At the bottom end there are embedded devices.  In between, there is a wide array of types of computer systems.  Personal computers, workstations and servers are all really just a sliding scale of the same general set of technologies.  These systems are, more and more, the building blocks of the technologies higher up on the scale. Enterprise computing typically involves high-availability and high Input/Output (I/O) based systems.  Scientific and technical computing is similar, but high availability is not as important as performance.  Three of the variables that factor into system design are parallelization, running time and (disk) storage requirements. If a job is small enough that it can run on a single machine in a reasonable amount of time, it is usually best to leave it to do so.  Any speedup you would get in parallelizing the job and distributing the workload is offset by the serial portion of the job (Amdahl's law), the added overhead of parallelization, and the fact that you could run a different job on the other machine.  If your task is parallelizable, but is very storage intensive, you need a high speed disk interconnect.  Nowadays that means Fibre Channel.
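The limit that Amdahl's law places on parallel speedup can be sketched in a few lines. This is the textbook form of the law only; it ignores the parallelization overhead and the opportunity cost of the other machine mentioned above. The function name amdahlSpeedup and its parameters are made up for illustration.

```cpp
#include <cassert>

// Amdahl's law: the best-case speedup of a job on n machines when a
// fraction p of its running time can be parallelized and the remaining
// (1 - p) must run serially.
double amdahlSpeedup(double p, int n) {
    assert(p >= 0.0 && p <= 1.0);  // p is a fraction of the total work
    assert(n >= 1);                // at least one machine
    return 1.0 / ((1.0 - p) + p / n);
}
```

For a job that is half serial, amdahlSpeedup(0.5, 2) comes to about 1.33, so a second machine buys only a third more throughput. The same machine running a separate job would have delivered a full extra job's worth of work.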

Only if a job takes so long that it makes sense to parallelize, and does not require significant access to storage, does it make sense to go to a traditional Beowulf cluster.  Although InfiniBand does handle the interconnect for both network and storage access, the file systems themselves do not yet handle access by large clusters.

This is the point for which we need a new term:  storage bound, single system jobs that should be run on their own machine. Examples of this abound throughout science, engineering, enterprise, and government.  Potential terms for this are: Small Scale HPC, Single System HPC, Storage Bound HPC, but none of them really roll off the tongue.