Highlander Syndrome in Package Management

Somewhere between systems work and application development lies the realm of package management. There are two main schools of thought in package management: inclusive and exclusive. If you are inclusive, you want everything inside a package management system, and everything should be inside one package management system. If you are exclusive, you want the system to provide little more than an operational environment, and you will manage your own applications thank-you-very-much.

One problem with the inclusive approach is that, in the attempt to clean up old versions, you often end up with The Highlander Syndrome: there can be only one version of a library or binary installed on your system. The exclusive approach is more focused on the end application. I may need to run a different version of Python than the one provided by the system, and I don’t want to be locked in to using only the version installed system wide. In fact, I may require several different versions, and each of these requires its own approach.
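The difference between the two approaches can be sketched as a difference in how the installed-package table is keyed. This is a toy model, not how any real package manager stores its state; the class and method names are made up for illustration.

```python
from collections import defaultdict


class HighlanderRegistry:
    """Inclusive style: at most one installed version per package name."""

    def __init__(self):
        self._installed = {}  # name -> version

    def install(self, name, version):
        # Installing a new version replaces whatever was there before.
        self._installed[name] = version

    def versions(self, name):
        return [self._installed[name]] if name in self._installed else []


class SideBySideRegistry:
    """Exclusive style: any number of versions of a package may coexist."""

    def __init__(self):
        self._installed = defaultdict(set)  # name -> set of versions

    def install(self, name, version):
        self._installed[name].add(version)

    def versions(self, name):
        # Plain string sort; good enough for a sketch, not a real version compare.
        return sorted(self._installed.get(name, ()))
```

Installing Python 2.7 and then 3.11 into the first registry leaves only 3.11; the second keeps both, at the cost of having to decide which one an application actually gets.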

CPAN, PEAR, and Maven provide language-specific approaches to resolving dependencies at the per-application level. Maven is particularly good at providing multiple versions of an API: it errs so far this way that the same Jar file will often exist multiple times in the Maven repository, but under different paths.
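One way to see this duplication for yourself is to group the Jar files in a repository by a hash of their contents. The following is a minimal sketch; the function name is hypothetical, and it reads each file fully into memory, which is fine for a quick look but not for very large archives.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicate_jars(repo_root):
    """Return groups of .jar paths under repo_root that have identical contents."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        for filename in filenames:
            if not filename.endswith(".jar"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_digest[digest].append(path)
    # Keep only content that shows up at more than one path.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Pointing it at a local Maven repository (by default `~/.m2/repository`) lists every Jar that is stored more than once.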

There should be a middle ground for the end user between all or nothing in package management. As a system administrator, I don’t want users running “just any” software on their system, but as an end user I don’t want to be locked in to a specific version of a binary.

If the role of application maintainer is split from the role of system administrator, then the people who fill those two roles may have reason to use different approaches to package management. Since the app developer can’t be trusted, the sys admin doesn’t provide root access. With no root access, the app developer can’t deploy an RPM/Deb/MSI. The app developer doesn’t want the system administrator updating the packages that the app depends on just because there is a new bugfix/feature pack. So, the app developer doesn’t use the libraries provided by the distribution, but instead provides a limited set. Essentially, the system has two administrators, two sets of policy, and two mechanisms for applying that policy.

Each scripting language has its own package management system, but the compiled languages tend to use the package management system provided by the operating system. Most scripting language programmers prefer to work inside their language of choice, so the Perl system is written in Perl, the Emacs system is written in Lisp, the Python one in Python, and so on. The Wikipedia article goes into depth on the subject, so I’ll refrain from rewriting that here.

A package management system is really a tuple. The variables of that system are:

The binary format of the package
The database used to track the state of the system
The mechanism used to fetch packages
The conventions for file placement

There is some redundancy in this list. A file in the package may also be considered a capability, as is the “good name” of the package. A package may contain empty sets for some of the items in this list. For example, an administrative package may only specify the code to be executed during install, but may not place any files on a file system. At the other extreme, a package may provide a set of files with no executable code to be run during the install process.
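To make the tuple idea concrete, here is a rough data model. The field names are my own labels for the four variables above, and the `Package` class is an assumption about what a package carries (files, capabilities, install-time code), based only on the description in the previous paragraph.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


class PackageSystem(NamedTuple):
    """The four variables of a package management system, as a literal tuple."""
    binary_format: str     # the on-disk format of a package
    state_database: str    # what tracks the state of the system
    fetch_mechanism: str   # how packages are retrieved
    file_conventions: str  # where files are expected to land


@dataclass
class Package:
    """A package whose parts may each be empty."""
    name: str
    files: list = field(default_factory=list)
    capabilities: set = field(default_factory=set)
    install_scripts: list = field(default_factory=list)

    def provides(self):
        # The redundancy noted above: the name and every file are capabilities too.
        return {self.name, *self.files, *self.capabilities}
```

An administrative package is then `Package("admin-task", install_scripts=[...])` with no files, and a pure data package is the reverse: files, but an empty `install_scripts` list.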

Of these items, it is the conventions that really prevent interoperability. This should come as no surprise: it is always easier to write an adapter on top of an explicit interface than an implicit one. The Linux Standard Base helps, as do the standards guidelines posted by Debian, Red Hat, and other distribution providers. However, if you look at the amount of traffic on the mailing lists regarding “file X is in the wrong place for its type,” you can understand why automating a cross-package install is tricky. Meta package management schemes attempt to mitigate the problem, but they can really only deal with things that are in the right place.

Take the placement of 64-bit binaries. For library files, Red Hat has provided a dual system: put 32-bit libraries under /usr/lib and 64-bit libraries under /usr/lib64. Debian puts them all into the same directory, and uses naming to keep them apart. In neither case, however, did they provide a place to make 32-bit and 64-bit binaries co-exist. How much easier would migration have been if we had /usr/bin32 and /usr/bin64, with a symlink from either into /usr/bin?
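The layout I have in mind is simple enough to sketch. The function below is hypothetical and builds the tree under an arbitrary prefix, so it can be tried in a scratch directory rather than against a real /usr; it assumes a platform where unprivileged symlinks work.

```python
import os


def make_dual_bin_layout(prefix, default_bits=64):
    """Create prefix/bin32 and prefix/bin64, with prefix/bin symlinked to one of them."""
    if default_bits not in (32, 64):
        raise ValueError("default_bits must be 32 or 64")
    for bits in (32, 64):
        os.makedirs(os.path.join(prefix, f"bin{bits}"), exist_ok=True)
    link = os.path.join(prefix, "bin")
    if os.path.islink(link):
        os.remove(link)  # repoint an existing symlink
    # A relative target keeps the tree relocatable.
    os.symlink(f"bin{default_bits}", link)
    return link
```

Switching the machine's default from 32-bit to 64-bit binaries then becomes a matter of repointing one symlink, and both sets of binaries stay installed.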

Thus we see a couple of the dimensions of the problem. An application should have a good name: web server, mail client, and so on. A system should support multiple things which provide this capability, a reasonable default, and customizability for more advanced users. The system should provide protection against applications with known security holes, but provide for the possibility of multiple implementations released at different points in time.
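Those requirements (many providers per good name, a default, an override, and a block list for known-bad packages) fit in a small resolver. This is only a sketch of the policy; the class is hypothetical, and "first registered wins" is an arbitrary choice of default.

```python
class CapabilityRegistry:
    """Map a 'good name' such as 'web server' to one of several providers."""

    def __init__(self):
        self._providers = {}   # capability -> package names, in registration order
        self._preferred = {}   # capability -> package chosen by the user or admin
        self._blocked = set()  # packages with known security holes

    def register(self, capability, package):
        self._providers.setdefault(capability, []).append(package)

    def prefer(self, capability, package):
        if package not in self._providers.get(capability, []):
            raise KeyError(f"{package} does not provide {capability}")
        self._preferred[capability] = package

    def block(self, package):
        self._blocked.add(package)

    def resolve(self, capability):
        candidates = [
            package
            for package in self._providers.get(capability, [])
            if package not in self._blocked
        ]
        if not candidates:
            raise LookupError(f"no usable provider for {capability}")
        preferred = self._preferred.get(capability)
        # Honor the override only while it is still allowed; else fall back.
        if preferred in candidates:
            return preferred
        return candidates[0]
```

Note that blocking a package does not uninstall it or its siblings: other versions that provide the same capability remain available, and the resolver simply falls back to the next usable one.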

An interesting take on package management comes from OSGi. It is a language-specific package management approach, specifically for Java. It takes advantage of portions of the Java language to allow the deployment of multiple versions of the same package inside a single process. When I mentioned this to some old-time Linux sys admins, they blanched. OSGi does not specify how to fetch the packages, much like RPM without YUM or DPKG without APT. OSGi packages are installed into the application. As such, they are much more like shared libraries, with specific code sections run on module load and unload. Different OSGi containers provide different sets of rules, but basically the packages must exist inside of a subset of directories in order to be available for activation. I have heard an interesting idea that the JPackage/RPM approach and OSGi should ideally merge in the future. To install a Jar into your OSGi container, you would have to install an RPM.
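The idea of two versions of one package living in a single process is easier to believe once you have seen it. The sketch below is a Python analogy, not OSGi: it loads two versions of a made-up module as separate module objects by keeping them out of the shared module table, which is loosely what OSGi achieves in Java by giving each bundle its own class loader. The helper name and the `greeter` module are invented for the example.

```python
import importlib.util
import os
import tempfile


def load_version(name, version, source):
    """Load source as an independent module object, without touching sys.modules."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, f"{name}.py")
        with open(path, "w") as handle:
            handle.write(source)
        # Give each version a distinct internal name so the two never collide.
        unique_name = f"{name}_{version.replace('.', '_')}"
        spec = importlib.util.spec_from_file_location(unique_name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    return module


v1 = load_version("greeter", "1.0", "def greet():\n    return 'hello from 1.0'\n")
v2 = load_version("greeter", "2.0", "def greet():\n    return 'hello from 2.0'\n")
```

Both `v1.greet()` and `v2.greet()` are callable side by side, each returning its own version's string. The Highlander rule only applies if everything has to share one global namespace.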

One additional issue on the Java/RPM front is Maven. Both Maven and RPM want to run the entire build process from start to finish. Both have the concept of a local database of packages used to resolve dependencies. For long-term peaceful coexistence of Java and RPM, RPM is going to have to treat Maven as a first-class citizen, the way that it does Make. Maven should provide a means to generate a spec file that has the absolute minimum in it to track dependencies, and to kick off an RPM build of the Maven artifacts.
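As a rough idea of what such a generated spec file might contain, here is a sketch that renders one from Maven coordinates. Everything about the mapping is an assumption made for illustration: the function is hypothetical, the `mvn(group:artifact)` naming of dependencies and the `maven` build requirement are placeholders, and the output is deliberately incomplete (no install or files sections), so it is not a buildable spec.

```python
def minimal_spec(group_id, artifact_id, version, dependencies):
    """Render a bare-bones spec that tracks dependencies and hands the build to Maven.

    dependencies is an iterable of (group_id, artifact_id) pairs.
    """
    lines = [
        f"Name:           {artifact_id}",
        f"Version:        {version}",
        "Release:        1",
        f"Summary:        Maven artifact {group_id}:{artifact_id}",
        "License:        FIXME",
        "BuildRequires:  maven",  # assumed package name
    ]
    for dep_group, dep_artifact in sorted(dependencies):
        # Assumed naming scheme for Maven artifacts inside the RPM database.
        lines.append(f"Requires:       mvn({dep_group}:{dep_artifact})")
    lines += [
        "",
        "%description",
        f"Generated from the POM of {group_id}:{artifact_id}.",
        "",
        "%build",
        "mvn package",
        "",
    ]
    return "\n".join(lines)
```

The point of the sketch is the division of labor: RPM records who depends on whom, and the build section does nothing but call Maven.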

Adam Young's Web Log

The Notebook of a Programmer Climber Musician Ex-Soldier Woodworker and a few other things
