There are several competing technologies to handle hardware management. I say hardware management because, while they also handle software, that is not what they are all about. The three technologies are the Simple Network Management Protocol (SNMP), the Intelligent Platform Management Interface (IPMI), and Web Based Enterprise Management (WBEM). Yes, there are certainly more technologies out there that are related to these and that may fill comparable roles, but these three seem to be the ones that hold the center right now, each with a separate set of strengths, weaknesses, and proponents.
These three technologies each attempt to provide a unified approach to monitoring and controlling hardware. To that end, each attempts to provide a standard object model of the underlying components. SNMP and WBEM both provide a standard file format for specifying the metadata of the components they control and a standard network protocol for remote access. IPMI provides a standard view of components without an interface file format.
Solutions for managing a hardware system have to solve four problems: persistent object references, property set queries and changes, remote method invocation, and asynchronous event monitoring. In order to monitor or change a component in the system, you first need to be able to find that component.
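To make those four problems concrete, here is a rough Python sketch of what a management interface has to cover. None of these names come from SNMP, IPMI, or WBEM; ManagementEndpoint, InMemoryEndpoint, and the "power_cycle" method are hypothetical, and the in-memory class exists only so that the sketch runs.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Iterable, List


class ManagementEndpoint(ABC):
    """Hypothetical interface covering the four problems listed above."""

    @abstractmethod
    def find(self, kind: str) -> List[str]:
        """Persistent object references: locate components of a given kind."""

    @abstractmethod
    def get_properties(self, ref: str, names: Iterable[str]) -> Dict[str, Any]:
        """Property set query: read several properties in one request."""

    @abstractmethod
    def set_properties(self, ref: str, values: Dict[str, Any]) -> None:
        """Property set change: write several properties in one request."""

    @abstractmethod
    def invoke(self, ref: str, method: str, **args: Any) -> Any:
        """Remote method invocation on a component."""

    @abstractmethod
    def subscribe(self, ref: str, callback: Callable[[str, str], None]) -> None:
        """Asynchronous event monitoring: call back when the component changes."""


class InMemoryEndpoint(ManagementEndpoint):
    """Toy implementation backed by dictionaries instead of a network."""

    def __init__(self) -> None:
        self._components: Dict[str, Dict[str, Any]] = {}
        self._listeners: Dict[str, List[Callable[[str, str], None]]] = {}

    def add(self, ref: str, kind: str, **props: Any) -> None:
        self._components[ref] = {"kind": kind, "props": dict(props)}

    def find(self, kind: str) -> List[str]:
        return sorted(r for r, c in self._components.items() if c["kind"] == kind)

    def get_properties(self, ref: str, names: Iterable[str]) -> Dict[str, Any]:
        props = self._components[ref]["props"]
        return {name: props[name] for name in names}

    def set_properties(self, ref: str, values: Dict[str, Any]) -> None:
        self._components[ref]["props"].update(values)
        # Notify every subscriber registered for this component.
        for callback in self._listeners.get(ref, []):
            callback(ref, "properties-changed")

    def invoke(self, ref: str, method: str, **args: Any) -> Any:
        # "power_cycle" is an invented method name used only for illustration.
        if method != "power_cycle":
            raise ValueError(f"unknown method: {method}")
        self.set_properties(ref, {"power": "on"})
        return True

    def subscribe(self, ref: str, callback: Callable[[str, str], None]) -> None:
        self._listeners.setdefault(ref, []).append(callback)
```

The references here are plain strings. The important part is that a reference returned by find stays valid for later property, method, and event calls.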
Of the three, SNMP is by far the oldest and most established. There are numerous versions of SNMP, as it has evolved through the years, and some of the more recent additions are less well accepted and tested. The biggest thing SNMP has in its favor is that it is defined strictly as a wire protocol, which provides the highest degree of interoperability, at least in theory. Of course, HTTP is also defined strictly as a wire protocol, and we have all seen the incompatibility issues between Netscape and IE. With SNMP, however, any given piece of hardware has to work with a wide array of software tools, so people code conservatively. Interoperability is therefore high, at the cost that people code to the lowest common denominator of the spec and use primarily the best tested features. By far the most common use of SNMP I have encountered has been devices sending out status updates. There are various tools for monitoring these updates, consolidating them, and reporting the health of a distributed system. At Penguin we put some effort into supporting Ganglia and Nagios, both of which provide some SNMP support.
I’ve had a love/hate relationship with IPMI for the past couple of years. My earliest exposure to IPMI was power cycling machines that were running the Linux kernel. In theory, all I should have had to do was enable the LAN interface on the machine, and then I could use ipmitool to reboot the machine like this:
/usr/bin/ipmitool -I lan -U root -H 10.1.1.100 -a chassis power cycle
IPMI was implemented on the motherboard of the machine, and it listened on the same network port that was used during normal operations. When the Linux kernel crashed, the port did not respond to IPMI packets. It turned out the network interface was blindly sending all packets to the Linux kernel, regardless of the kernel’s state. The solution was to implement a heartbeat, which required a later version of the Linux kernel than we were capable of supporting at that time. So IPMI was useless to me.
Well, not completely. The other thing that IPMI supports is called serial over LAN. The unfortunate acronym for this is SOL. SOL is a way of connecting to the console of a machine via the network interface. Unlike a telnet session, this session is not managed by any of the network daemons. For us, it also made it possible to view the boot messages of a machine. It was a pain to set up, but it kept us from having to find doubly terminated serial cables and spare laptops in order to view a machine’s status.
Much of my current work is defined by WBEM. I was first exposed to this technology while contracting at Sun Microsystems. We were building configuration tools for online storage arrays. I was on the client side team, but was working on middleware, not the user interface. Just as SNMP allowed you to query the state of something on the network, WBEM had the concept of objects, properties, and requesting the values of a set of properties in bulk across the network. My job was to provide a simple interface to these objects for the business object developers. Layers upon layers. Just like a cake. There was another team of people, working directly for Sun, who were developing the WBEM code on the far side of the wire (called Providers in WBEM speak). WBEM provides the flexibility to set all of the properties, a single property, or any subset in between. The provider developers used this mechanism to set related properties at once. The result was an implicit interface: if you set P1, you must also set P2. This is bogus, error prone, and really just plain wrong. My solution was to fetch all of the properties, cache them, and then set them all each time.
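The fetch, cache, and set-everything workaround can be sketched in a few lines of Python. This is only an illustration of the idea, not the code we wrote at Sun: CachedInstance, fetch_all, and store_all are hypothetical names, and the two callables stand in for whatever WBEM client calls actually read and write an instance.

```python
from typing import Any, Callable, Dict


class CachedInstance:
    """Read every property once, then always write the full set back.

    Because each write carries all of the properties, a provider that
    silently expects P1 and P2 to arrive together always gets both.
    """

    def __init__(
        self,
        fetch_all: Callable[[], Dict[str, Any]],
        store_all: Callable[[Dict[str, Any]], None],
    ) -> None:
        self._store_all = store_all
        # Take a private copy so later changes on the far side do not leak in.
        self._cache: Dict[str, Any] = dict(fetch_all())

    def get(self, name: str) -> Any:
        return self._cache[name]

    def set(self, **changes: Any) -> None:
        unknown = set(changes) - set(self._cache)
        if unknown:
            raise KeyError(f"unknown properties: {sorted(unknown)}")
        self._cache.update(changes)
        # Send the complete property set, not just the changed values.
        self._store_all(dict(self._cache))
```

The trade-off is that every write is as large as the whole instance, and a cached value can overwrite a change made by someone else in the meantime. For our configuration tools that was an acceptable price for not having to know which properties the provider expected to arrive together.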
WBEM requires a broker, a daemon that listens for network requests and provides a process space for the providers. There are two main open source projects that provide this broker. The first is tog-pegasus, which comes installed with Red Hat Enterprise Linux. The second is OpenWBEM, which comes with various versions of SuSE Linux from Novell. However, since WBEM is trying to get into the same space that SNMP currently owns, there has been a demand for a lighter weight version for embedded and small scale deployments. That demand led to a third project, the Small Footprint CIM Broker (SFCB), which is part of SBLIM.