Working around an NFS hang

While Sun might have wanted us to believe “The Network is The Computer”, the truth is that we often need the network only for access to things at brief points in time, and can get away with doing our real work on our local machines. One of the systems at work went down today and is still unavailable. This machine exported several directories which I and some of my co-workers have NFS mounted. When it failed, basic utilities on my machine stopped functioning. I tried several simple commands, like clear, ls, and which.

The problem was that my PATH environment variable listed a directory on the remote machine before the local directories. This was required by our build process so that things like make would resolve to the correct version, consistent across all machines. When you type a command at the command line, the shell resolves it by trying each directory in the PATH variable in order. In this case, the very first directory was not only failing, but hanging.

One trick that helped to confirm the problem was using echo * on a directory to see what files were there. Since echo is built in to the shell, running it does not trigger any PATH lookup. To view /usr/bin, you execute echo /usr/bin/*.

To work around the problem, run export PATH=/bin:/usr/bin:$PATH. With that, basic utilities are once again resolved on the local machine. Once the NFS server comes back up, exit your shell and start a new one, or export your original PATH again.
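The workaround can be sketched as a pair of shell functions, assuming a Bourne-style shell such as bash. The names use_local_first, restore_path, and OLD_PATH are made up for this sketch and are not part of any standard tool.

```shell
# Hypothetical helpers for this sketch; the names are illustrative only.

# Save the current search path, then put the local system directories
# ahead of everything else so command lookup never reaches the hung
# NFS directory for the basic utilities.
use_local_first() {
    OLD_PATH=$PATH
    PATH=/bin:/usr/bin:$PATH
    export PATH
}

# Put the saved search path back once the NFS server has returned.
restore_path() {
    PATH=$OLD_PATH
    export PATH
}
```

Call use_local_first once while the server is down and restore_path when it comes back. Calling use_local_first a second time would overwrite the saved value with the already modified path.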

Small Scale High Performance Computing

At the top end of computing there are the supercomputers. At the bottom end there are embedded devices. In between, there is a wide array of types of computer systems. Personal computers, workstations, and servers are all really just a sliding scale of the same general set of technologies. These systems are, more and more, the building blocks of the technologies higher up on the scale. Enterprise computing typically involves high-availability and high Input/Output (I/O) based systems. Scientific and technical computing is similar, but high availability is not as important as performance.

Three of the variables that factor into system design are parallelization, running time, and (disk) storage requirements. If a job is small enough that it can run on a single machine in a reasonable amount of time, it is usually best to leave it to do so. Any speedup you would get from parallelizing the job and distributing the workload is offset by the serial portion of the job (Amdahl's law), the added overhead of parallelization, and the fact that you could run a different job on the other machine. If your task is parallelizable, but is very storage intensive, you need a high speed disk interconnect. Nowadays that means Fibre Channel.
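As a rough illustration of the Amdahl's law point above: if a fraction p of a job can be parallelized and it is spread over n machines, the best-case speedup is given below. The symbols p and n and the sample numbers are mine, chosen for illustration, and the formula ignores the parallelization overhead mentioned above.

```latex
S(n) = \frac{1}{(1 - p) + \frac{p}{n}}
\qquad\text{e.g. } p = 0.9,\ n = 2:\quad
S(2) = \frac{1}{0.1 + 0.45} = \frac{1}{0.55} \approx 1.82
```

So even with 90% of the work parallelizable, a second machine buys less than a 2x speedup before any overhead is counted.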

Only if a job takes so long that it makes sense to parallelize, and does not require significant access to storage, does it make sense to go to a traditional Beowulf cluster. Although InfiniBand does handle the interconnect for both network and storage access, the file systems themselves do not yet handle access by large clusters.

This is the point for which we need a new term: storage bound, single system jobs that should be run on their own machine. Examples of this abound throughout science, engineering, enterprise, and government. Potential terms for this are Small Scale HPC, Single System HPC, and Storage Bound HPC, but none of them really roll off the tongue.

Soldier Design Competition at MIT

Last night, USMA and MIT went head to head in a design competition. The details are here:

It was cool to be with the Cadets and MIT students in such a creative environment. The designs were smart, focused, low cost, and viable. Not all of them could be deployed as-is, but even those furthest from field ready had something to contribute to solving the problems that soldiers face in the field. While there was not a lot of cross talk between competitors, I think the real value of a competition like this would be the cross breeding of ideas.

Two different teams provided solutions for keeping soldiers cool, in order to prevent heat casualties. In both cases, the teams approached the problem by trying to cool off the head. The MIT team made “cool pack” inserts that replaced a portion of the pad in the Kevlar helmet. The packs were activated by punching them, starting an endothermic chemical reaction. The packs in the display room registered 56 degrees, well below the room temperature of 75 degrees or so. The problems with the design were that the packs didn’t last long enough, and the helmet had to be removed in order to replace the pads. The Cadet team created an insert that was composed primarily of lightweight aluminum (There should be another I in that word, dammit!) that acted as a heat conductor. Small cartridges at the back of the helmet, made of sponges, activated the system by evaporation. The problems with this design were the requirement for low ambient humidity (not a problem in Iraq) and the weight of the solution. However, what occurred to me is that you could combine the two solutions, use the cold pack to power the conductor, and get the best of both worlds. I suspect the final design will be somewhere along those lines.

One MIT student had done a stellar job with a wearable solar electricity generator. He used fragile solar cells that converted 20% of the sunlight that struck them, providing 18 Watts of power, just under the 20 Watt target. The innovative part of his research was the attempt to make the panels rugged enough to survive the beating soldiers put on them. Another team of Cadets made a strobe light that was only visible through the latest versions of night vision devices. The idea was that older versions had fallen into the hands of the enemy. The strobe was fragile, and the team named making it more durable as an area for further research. The materials work of the solar panel project would be a great starting point.

Many of the other projects were worthy of note:

  • A firewall that was capable of blocking Skype
  • A two battery UPS system for the radios, also field chargeable
  • A spring and cable based system designed to pull a HMMWV turret gunner back into the vehicle in case it is about to flip
  • A wireless network for a minefield, allowing the friendly forces to turn off the mines to minimize friendly casualties and collateral damage
  • A “Spy Rock”
  • Two different position systems based on things like gyros, accelerometers, and cheap wireless transceivers
  • A radio controlled dirigible with autopilot capable of carrying a 3 pound payload

The projects were judged by a panel with members from industry, academia, and the military. It was especially good to see two Command Sergeants Major on the panel, with a solid understanding of the harsh reality of the life of the soldiers. One was the CSM of the Infantry School at Fort Benning. I can’t think of anyone better equipped to say “Good Idea”, “That is too heavy”, or to ask the question “Is that addressing the right problem?”

Six prizes, each worth several thousand dollars, were donated by several companies. The USMA team won the highest award and the overall trophy. I was really impressed by the creativity and ingenuity of the students, and the quality of the design process they employed.

Echos of Erudition

Mr. Homer, my ninth grade English teacher, once made a point of describing the joy he felt on that day in Spring when you first notice the buds on the trees. I’d long forgotten that description until moving back to Massachusetts.

In California, there are always some trees that have leaves.  The winter months there mean rain and a return to lushness from the brown of Summer.

New England is defined by the transition of colors: orange, gray, white, gray, green.

Biking to work these past few days has required a quicker set of reflexes to avoid the reemergence of the joggers. Many exposed legs and arms iterating above the root-gnarled path along the Charles. They wear t-shirts that don’t quite hide the thin layer of Winter insulation that motivates their activity.

The buds are on the trees, and I only noticed yesterday. Thanks, Mr. Homer.

Faking out PAM Authentication

I am working on a server application that uses Pluggable Authentication Modules (PAM) for authentication support.  This application  must run as root. As part of development, people need to log in to this server.  I don’t want to give out the root password of my development machine to people.  My hack was to create a setup in /etc/pam.d/emo-auth that always allows the login to succeed, provided the account exists.  The emo-auth configuration is what the application looks up to authenticate network connections.

$ cat /chroot/etc/pam.d/emo-auth

account   required   pam_permit.so
auth      required   pam_permit.so
session   required   pam_permit.so

Now people log in as root, and any password will let them in.
Since this is only for development, this solution works fine, and does not require any code changes.

Scripting GDB For a stack trace

I got a stack trace generated from an application like this:

Backtrace[0] 0xee8df698 eip 0x87c3b39
Backtrace[1] 0xee8df6e8 eip 0x87c9dbc
Backtrace[2] 0xee8df838 eip 0x85e3f19

And so on. Here’s how I converted it to something useful:

Copy and paste the trace into emacs.

Mark the top left corner (ctrl-space)

Move to the last line, just after the “eip” and before the address.

Alt-X kill-rectangle.

This is a great way to do editing by columns in emacs.

I added the words “info symbol ” to the beginning of each line. I did this by first cutting a return character, then doing a search and replace, pasting in the cut ‘return’ as the search criteria, and replacing it with the ‘return’ followed by “info symbol ”. I use this hack a lot to modify the start or end of all the lines in a file.

Once done, I ran

gdb --command=~adyoung/bugs/myapp/backtrace.txt ./myapp core
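The Emacs editing steps above could also be sketched as a single sed command. This is not what I did, just an alternative, and it assumes every line of the trace has exactly the format shown earlier. The function name make_gdb_commands and the file name trace.txt are made up for illustration.

```shell
# Hypothetical helper: turn a saved trace into a gdb command file.
# Expects one frame per line in the form shown above, e.g.
#   Backtrace[0] 0xee8df698 eip 0x87c3b39
# For each matching line, keep only the address after "eip" and
# prefix it with "info symbol". Lines that do not match are dropped.
make_gdb_commands() {
    sed -n 's/^Backtrace\[[0-9]*] 0x[0-9a-fA-F]* eip \(0x[0-9a-fA-F]*\)[[:space:]]*$/info symbol \1/p' "$1"
}

# Usage, with trace.txt as a hypothetical file holding the pasted trace:
#   make_gdb_commands trace.txt > backtrace.txt
```

The resulting file can then be passed to gdb with the --command option as shown above.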