On Hierarchy

The book “On Intelligence” is one of the most intriguing I’ve read in a long time. I read it as context for understanding Dileep George’s thesis, which is built around the concept of “Hierarchical Temporal Memory,” or HTM for short. HTM is a mathematical model of a learning machine based on the organization of the neocortex of the mammalian brain. An HTM is a tree, with a complex interface between the nodes. At the bottom of the tree are the sensors: touch, light, sound, smell. At the top is the hippocampus, which seems to have its own rules. The focus in HTM is the nodes between root and leaf.

Each HTM node is responsible for taking its input data and fitting it into a sequence, hence the word “Temporal.” To represent the sequences it uses Markov chains, which are transition graphs with a probability attached to each transition. Basically, a node sees a couple of input patterns and figures out “Oh, hey, we’re seeing this sequence.” The first pattern it gets might turn up in many Markov chains, but by the time it has seen three or four, it has narrowed the candidates down to a pretty small subset, ideally one. Since each sequence carries a probability, the node can say “this Markov chain is the most likely one to have produced these patterns.”
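As a rough illustration of that narrowing-down, here is a minimal Python sketch. The three chains, their names, and the single-letter patterns are all made up for the example, and a real HTM node is considerably more involved than this; the sketch only shows how each new pattern rules out chains that could not have produced the transition.

```python
from typing import Dict, List

# Hypothetical toy chains: chain name -> {pattern -> {next pattern -> probability}}
Chain = Dict[str, Dict[str, float]]
CHAINS: Dict[str, Chain] = {
    "walk": {"a": {"b": 0.9, "c": 0.1}, "b": {"c": 1.0}, "c": {"a": 1.0}},
    "run":  {"a": {"c": 0.8, "b": 0.2}, "c": {"b": 1.0}, "b": {"a": 1.0}},
    "idle": {"a": {"a": 1.0}},
}

def chain_likelihoods(observed: List[str]) -> Dict[str, float]:
    """Probability of the observed transitions under each chain."""
    scores = {}
    for name, transitions in CHAINS.items():
        p = 1.0
        # Multiply the probability of every consecutive pair of patterns.
        for current, following in zip(observed, observed[1:]):
            p *= transitions.get(current, {}).get(following, 0.0)
        scores[name] = p
    return scores

def candidates(observed: List[str]) -> List[str]:
    """Chains that could still have produced what was seen so far."""
    return sorted(n for n, p in chain_likelihoods(observed).items() if p > 0)

def best_chain(observed: List[str]) -> str:
    """The chain with the highest probability of producing the input."""
    scores = chain_likelihoods(observed)
    return max(scores, key=scores.get)

if __name__ == "__main__":
    seen: List[str] = []
    for pattern in ["a", "b", "c"]:
        seen.append(pattern)
        print(seen, "->", candidates(seen))
    print("best:", best_chain(seen))
```

With one pattern seen, all three toy chains are still in play; after the second, “idle” drops out; after the third, only “walk” is left.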

The next step is to fire the info up the tree to the next node. This node treats the outputs of multiple subnodes as its input pattern. If the bottom level of the tree is subsections of the visual area of the eye, the next might cover the entire eye. The third level might integrate the two eyes together. The fourth level might integrate visual information with other senses. Enough layers, and we end up at the root node, the hippocampus, which has an integrated view of all layers below. From the hippocampus, predictions then flow down the hierarchy. From higher to lower, the conversation goes something like: “I think you are here in this sequence. Expect to witness ‘this’ next.”

Confusion, then, is when a higher level makes prediction after prediction, and the lower levels respond back with “Nope, didn’t see that, saw this instead.”
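That back-and-forth can be sketched in a few lines of Python. This is only my own toy reading of it, not anything from the thesis: the function names and the threshold of three misses in a row are assumptions made up for the example.

```python
from typing import Sequence

def longest_miss_streak(predicted: Sequence[str], observed: Sequence[str]) -> int:
    """Longest run of consecutive predictions the lower level rejected."""
    longest = current = 0
    for expected, seen in zip(predicted, observed):
        if expected == seen:
            current = 0          # "Yes, saw that." The streak resets.
        else:
            current += 1         # "Nope, didn't see that, saw this instead."
            longest = max(longest, current)
    return longest

def is_confused(predicted: Sequence[str], observed: Sequence[str],
                threshold: int = 3) -> bool:
    """Hypothetical rule: confusion is `threshold` misses in a row."""
    return longest_miss_streak(predicted, observed) >= threshold

if __name__ == "__main__":
    print(is_confused(["a", "b", "c", "d"], ["a", "x", "y", "z"]))
    print(is_confused(["a", "b", "c"], ["a", "x", "c"]))
```

A single miss followed by a hit is just a surprise; it is the unbroken run of misses that signals the higher level has the wrong sequence in mind.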

Thinking about the hierarchy in the brain leads me to ponder social hierarchies, like corporate entities, the Army, the US Government, and the body of Linux Kernel developers.

When the eyes start tracking an object in flight, the brain orchestrates a complex dance between visual input, muscle activation, and kinesthetic sense to keep the object in the center of the vision area. The animal is actively engaged in “watching the object.” The center area of the eye is getting minute deltas from the object and deducing its direction of travel. The muscles adjust the head and eyes to account for these deltas. Upon success, the object returns to the center of the eye. Upon failure, a different segment of the eye notices the moving item, and an additional round of corrections takes place.

An Army analogy: An infantry company is responsible for security of a busy market. The Company commander gives each of the platoons a sector. Each platoon leader subdivides his sector for his squads. Each squad leader assigns sectors to the fire teams, and the team leader assigns a sector to individual soldiers. Once the sectors are set, each person scans his lane and makes a mental model of the invariants of his sector. Private Snuffy may be looking down a back alley with no one in it. Specialist Kaye might be overwatching a booth with two people working it and a constant flow of customers. In the case of Private Snuffy, any individual appearing in his sector is significant. Specialist Kaye has the harder task of deciding which of the customers pose a risk. Both soldiers report to their team leader, Corporal Punishment, any changes that happen in their sector. For Snuffy, this is usually “No Change,” whereas Kaye might be saying “New Customer, Male, white robe.” Corporal Punishment decides what to report up to his Squad Leader, Sergeant Slaughter, and so forth up the chain of command.

Say Snuffy sees someone pass by his alleyway. He tells Corporal Punishment, who then tells Specialist Kaye “Expect to see a woman in full Burka entering your sector from 9 O’Clock.” If the woman never shows, this means she has stopped in an area out of sight of the fire team. This gets reported up higher to the squad leader, who tells fire team B, led by Corporal Klinger, to look for the woman. Corporal Klinger passes this down to Private Letters, who either confirms or denies that he sees the woman. He might have already identified the woman and told Klinger, in which case Klinger can report this himself. Perhaps that has already happened, and Sergeant Slaughter can send the information straight down to Team A.

Meanwhile, a parallel hierarchy is in action. The CIA is gathering intelligence, the Aviation branch has helicopters flying by, Military Intelligence has drones in the air, sensors on the ground, and so forth. All of this information is fed up to Battalion, Brigade, and possibly higher staffs. It should be obvious from this example that faster local decisions can be made if crosstalk happens at lower levels. These may not be better decisions, just quicker. The more global the knowledge, the more likely the decision is to be accurate; the more local the knowledge, the more likely it is to be precise. A good decision-making strategy has to balance precision with accuracy.

For a new unit, there is going to be a lot of chatter. Everything will be new, and everything will need to be analyzed and classified. As the unit gains familiarity, the level of chatter dies down. Integrating a new unit into an existing framework will speed up the acclimatization process.

Muscles don’t have access to the photoreceptors of the eye. The same needs to be true of the intelligence analysis process of the US Government. Sharing needs to be at the right level: not completely compartmentalized, as it was pre-9/11, but not across the board either, as the recent WikiLeaks publications have demonstrated. Analysts who are both working on Iraq should be able to share info on Iraq. People who are on Iran should share info on Iran. But this should be at least one level up the hierarchy. There is no way any one person could process all of the dispatches from all around the world on every topic. The proper compartmentalization needs to be a fluid, changing, but controlled structure. The people designing the structure are not the same people reading the reports from the field.

The concept of hierarchy is everywhere in software. In a Unix-style operating system, the two main abstractions, processes and files, are organized into trees. The Domain Name System and network address assignments are organized into hierarchies. So it should come as no surprise that much work has been done on algorithms to handle hierarchical structures. The basics of tree traversal (in-, pre-, and post-order) are sequential algorithms, but the real power of hierarchy comes from parallelism. If we look at the examples given thus far, most of them evolved to limit the responsibility of a particular decision maker to a reasonable scope, and then to provide a large number of decision makers. These decision makers are supposed to be able to do their work independently of the other members of the hierarchy at the same level or below them, with the exception of the members directly below them. Thus if we organize about 1000 elements into a binary tree, and say that each element only has dependencies up and down that tree, we’ve just limited each element to at most 3 dependencies: one higher up, and two lower down. The greater the fan-out, the more dependencies. The cost of this reduction in dependency is that there is more overhead in the middle of the tree. For ~1000 elements in a binary tree, there are about as many internal nodes in the tree as there are end nodes. If only end nodes are getting work done, half your nodes go to management.
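The arithmetic is easy to check. Here is a small Python sketch that assumes the elements are packed into a complete tree and numbered level by level, the way an array-backed heap is; that layout and the function names are my own choices for the example, not something the hierarchies above prescribe.

```python
from typing import List, Tuple

def dependencies(i: int, n: int, fan_out: int = 2) -> List[int]:
    """Parent and children of node i in a complete tree of n nodes."""
    deps = []
    if i > 0:
        deps.append((i - 1) // fan_out)            # one dependency upward
    for child in range(fan_out * i + 1, fan_out * i + fan_out + 1):
        if child < n:
            deps.append(child)                     # dependencies downward
    return deps

def summarize(n: int, fan_out: int = 2) -> Tuple[int, int, int]:
    """Return (max dependencies per node, internal nodes, end nodes)."""
    max_deps = max(len(dependencies(i, n, fan_out)) for i in range(n))
    # A node is an end node when its first child would fall outside the tree.
    leaves = sum(1 for i in range(n) if fan_out * i + 1 >= n)
    return max_deps, n - leaves, leaves

if __name__ == "__main__":
    print(summarize(1000))       # binary tree
    print(summarize(1000, 10))   # fan-out of 10
```

For 1000 elements the binary tree gives at most 3 dependencies per node, with 500 internal nodes and 500 end nodes. Raise the fan-out to 10 and each node may have up to 11 dependencies, but only 100 of the 1000 nodes are in the middle.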

One place where discussion of hierarchy in software development shows up repeatedly is in the flow of patches that become part of the Linux Kernel. Linus owns the process, and in doing so has set himself at or near the root of the tree. At times he has been responsible for the “blessed” Linux Kernel, at others, the unstable merge point of lower levels of the hierarchy. Surrounding Linus is a series of “Lieutenants” who take responsibility for various subsystems of the Kernel. This two-layered hierarchy has evolved to solve the problem once stated as “Linus doesn’t scale.” No one person can handle the conceptual model required to validate each and every patch to the core kernel. But worldwide development of Linux can move forward with this two-layered approach. If you look at pieces of the system, you will probably see more hierarchy in some areas as well. HP and IBM certainly have infrastructure for handling their contributions to Linux. Devices fall into hierarchies of common behavior: all file systems are implementations of the Virtual File System abstraction. Thus even within a subsystem, the conceptual models follow a hierarchical approach, limiting the scope of what any particular reviewer has to understand.

More to come on this topic.
