Keeping Development Code Current

Embracing change is hard. Accepting criticism on code you worked so hard to prepare for review can be hard on the ego. But when you have additional work underway that depends on submissions still in review, it can also be a challenge to your organizational skills. I’ve recently learned a trick with git that makes this easier in the context of OpenStack development.


Git and SVN for PKI

I’ve been working with the PKI/Dogtag code for a while. Over the past couple of years, I’ve become more and more comfortable with Git. PKI uses SVN as its centralized repository. Since Git’s SVN integration is fairly mature, I’ve been using it to manage my coding. On Monday, I gave a presentation to my team on git svn. I’ve taken the outline from the slides and included it here.

Using git for distributed development

As of today we are in the early stages of development on a new UI approach for FreeIPA. Since the “bits are very fresh,” we want to keep from breaking the existing code base. The policy for the upstream FreeIPA repo is that code must pass a fairly strenuous review process before getting checked in, and our code isn’t there yet. However, there are two of us doing UI development, and we need to share our code back and forth before checking it in to the main repo.

This is a situation where git shines, and it really can redefine your approach to development. I realize I am a latecomer to git. I’m a bit of a Luddite, slow to pick up on new technologies in general, so this should come as no surprise. Here’s our approach.
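The sharing setup described above can be sketched with plain git: a bare repository both developers can reach, used as a scratch remote for unreviewed work. All paths, the remote name "scratch", and the branch name "wip" below are hypothetical examples, not the actual FreeIPA setup.

```shell
# Sketch: two developers sharing work-in-progress through a bare
# "scratch" repo before anything goes near the upstream review queue.
# Paths, the remote name, and the branch name are made-up examples.

# A bare repo somewhere both developers can reach (shared host, NFS, ...):
git init --bare /tmp/ui-scratch.git

# Developer A publishes unreviewed work:
git init /tmp/dev-a
cd /tmp/dev-a
git config user.name "Dev A" && git config user.email a@example.com
git remote add scratch /tmp/ui-scratch.git
echo 'alert("new UI");' > ui.js
git add ui.js
git commit -m "WIP: new UI approach"
git push scratch HEAD:wip

# Developer B picks it up without touching the upstream repo:
git init /tmp/dev-b
cd /tmp/dev-b
git config user.name "Dev B" && git config user.email b@example.com
git remote add scratch /tmp/ui-scratch.git
git pull scratch wip
```

Because the scratch repo is just another remote, nothing about this changes how either developer eventually submits to the official upstream repo.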


Highlander Syndrome in Package Management

Somewhere between systems work and application development lies the realm of package management. There are two main schools of thought in package management: inclusive or exclusive. If you are inclusive, you want everything inside a package management system, and everything should be inside one package management system. If you are exclusive, you want the system to provide little more than an operational environment, and you will manage your own applications, thank-you-very-much.

One problem with the inclusive approach is that, in the attempt to clean up old versions, you often end up with the Highlander Syndrome: there can be only one version of a library or binary installed on your system. The exclusive approach is more focused on the end application. I may need to run a different version of Python than the one provided by the system, and I don’t want to be locked in to the version installed system-wide. In fact, I may require several different versions, and each of these requires its own approach.

CPAN, PEAR, and Maven provide language-specific APIs for resolving dependencies at the per-application level. Maven is particularly good at providing multiple versions of an artifact: it errs so far in this direction that the same jar file will often exist multiple times in the Maven repository, under different paths.

There should be middle ground for the end user between all or nothing in package management. As a system administrator, I don’t want users running “just any” software on their systems, but as an end user I don’t want to be locked in to a specific version of a binary.

If the role of application maintainer is split from the role of system administrator, then the people who fill those two roles may have reason to use different approaches to package management. Since the app developer can’t be trusted, the sysadmin doesn’t provide root access. With no root access, the app developer can’t deploy an RPM/Deb/MSI. The app developer doesn’t want the system administrator updating the packages the app depends on just because there is a new bugfix or feature pack. So the app developer doesn’t use the libraries provided by the distribution, but instead provides a limited set. Essentially, the system has two administrators, two sets of policy, and two mechanisms for applying that policy.

Each scripting language has its own package management system, but the binary languages tend to use the package management system provided by the operating system. Most scripting-language programmers prefer to work inside their language of choice, so the Perl system is written in Perl, the Emacs system in Lisp, the Python one in Python, and so on. The Wikipedia article goes into depth on the subject, so I’ll refrain from rewriting it here.

A package management system is really a tuple. The variables of that system are:

  • The binary format of the package
  • The database used to track the state of the system
  • The mechanism used to fetch packages
  • The conventions for file placement

There is some redundancy in this list. A file in the package may also be considered a capability, as is the “good name” of the package. A package may contain empty sets for some of the items in this list. For example, an administrative package may only specify the code to be executed during install, and not place any files on the file system. At the other extreme, a package may provide a set of files with no executable code to be run during the install process.

Of these items, it is the conventions that really prevent interoperability. This should come as no surprise: it is always easier to write an adapter on top of an explicit interface than an implicit one. The Linux Standard Base helps, as do the packaging guidelines posted by Debian, Red Hat, and other distribution providers. However, if you look at the amount of traffic on the mailing lists regarding “file X is in the wrong place for its type,” you can understand why automating a cross-package install is tricky. Meta package management schemes attempt to mitigate the problem, but they can really only deal with things that are in the right place.

Take the placement of 64-bit binaries. For library files, Red Hat has provided a dual system: 32-bit libraries go under /usr/lib and 64-bit libraries under /usr/lib64. Debian puts them all into the same directory and uses naming to keep them apart. In neither case, however, is there a place for 32- and 64-bit binaries to coexist. How much easier would migration have been if we had /usr/bin32 and /usr/bin64, with a symlink from either into /usr/bin?
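The bin32/bin64 idea above is easy to sketch. This is purely hypothetical, and the /tmp/root prefix just stands in for a real filesystem root:

```shell
# Hypothetical layout for coexisting 32- and 64-bit binaries,
# following the bin32/bin64 suggestion above. /tmp/root is a
# scratch prefix standing in for the real filesystem.
mkdir -p /tmp/root/usr/bin32 /tmp/root/usr/bin64
ln -sfn bin64 /tmp/root/usr/bin    # the default points at the 64-bit tree

# Both builds of the same (made-up) tool can now be installed side by side:
touch /tmp/root/usr/bin32/myapp /tmp/root/usr/bin64/myapp
```

Migration would then be a matter of repointing one symlink rather than renaming directories.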

Thus we see a couple of the dimensions of the problem. An application should have a good name: web server, mail client, and so on. A system should support multiple things that provide a given capability, a reasonable default, and customizability for more advanced users. The system should provide protection against applications with known security holes, but allow for multiple implementations released at different points in time.

An interesting take on package management comes from OSGi. It is a language-specific package management approach, specifically for Java. It takes advantage of portions of the Java language to allow the deployment of multiple versions of the same package inside a single process. When I mentioned this to some old-time Linux sysadmins, they blanched. OSGi does not specify how to fetch packages, much like RPM without yum or dpkg without apt. OSGi packages are installed into the application. As such, they are much more like shared libraries, with specific code sections run on module load and unload. Different OSGi containers provide different sets of rules, but basically the packages must exist inside a subset of directories in order to be available for activation. I have heard an interesting idea that the JPackage/RPM approach and OSGi should ideally merge in the future: to install a jar into your OSGi container, you would have to install an RPM.

One additional issue on the Java/RPM front is Maven. Both Maven and RPM want to run the entire build process from start to finish, and both have the concept of a local database of packages for resolving dependencies. For long-term Java/RPM peaceful coexistence, RPM is going to have to treat Maven as a first-class citizen, the way it does make. Maven should provide a means to generate a spec file that has the absolute minimum in it to track dependencies, and to kick off an RPM build of the Maven artifacts.

CVS to Perforce translations

At my current company I have to use Perforce. I’ve used CVS and Subversion recently, and I tend to think about revision control in terms of those. So here is the start of my CVS-to-Perforce dictionary. This post will be edited as I learn more:

The format is

cvs command : p4 command — comment

  • cvs diff : p4 diff — hey, some things are easy
  • cvs commit : p4 submit — note that the p4 version tends to be done with a changelist
  • cvs commit : p4 resolve — when you submit a file, the server compares the version you checked out to what is in the repo. If changes have come in since you checked out, Perforce will prompt you to merge them by hand. cvs commit does this automatically, and will report any failures in the automatic merge.
  • cvs status : p4 opened — this shows only the files that are opened and already added to the repo. cvs status has a lot more info than this.
  • cvs update : p4 sync
  • cvs blame : p4 filelog — not really a direct translation, but a starting point. filelog is really more like cvs history.

Key Perforce commands that have no CVS analogues:

p4 change — creates a changelist, a subset of the files in the current directory that will be committed together. Thus a p4 submit -c <changelist#> could only be reproduced in CVS by somehow generating a list of the files you wanted to commit together. CVS does not tend to be used that way.

p4 client — Perforce controls everything on a centralized server. It also requires you to explicitly check out a file before editing. You must create a client to get any work done. The p4 client command both creates a client and allows you to name it.
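Put together, a minimal Perforce session might look something like this. This is a sketch, not output from a real server, and the file name and changelist number are made up:

```
$ p4 client              # opens an editor to create and name a client spec
$ p4 edit src/foo.c      # explicitly open a file for editing
$ p4 change              # create a numbered changelist grouping related files
$ p4 submit -c 1234      # submit only the files in that changelist
```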