Debugging Techniques

Always look at the top error message in the spew that comes out from the compiler. This is usually where the problem is triggered. Junk after that is often spurious, especially if the first problem is a syntax error.

Use a revision control system like Git or Mercurial, and make a checkpoint of your code whenever you have made a significant addition. This way, you don’t get into the situation where you had it working, you broke it, and you can’t get it running again. Git has a little bit of a learning curve, but it rocks. If you are comfortable posting the code where other people can see it, you can push your changes to GitHub, and then if you have a catastrophic machine failure, you don’t lose your work.

The binary search is your friend. If you know that the error is somewhere between lines 10 and 20, comment out lines 15-20 and see if you still have the problem. If so, the error is in lines 10-14, so comment out lines 13 and 14, and so on. A binary search means that you can search 1000 lines of code in about 10 comparisons, and save yourself a metric shit-ton of time. A metric shit-ton is an industry term.
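
The same halving idea can be sketched in Java. This is only an illustration under assumptions: the name `firstFailing` and the predicate are made up here, and the predicate stands in for "enable everything up to this line, rebuild, and see whether the problem shows up". It also assumes the problem, once present, stays present for every later line.

```java
import java.util.function.IntPredicate;

/** Sketch of bisection debugging: find the first failing step with few probes. */
final class Bisect {
    /**
     * Assumes every index below the answer passes, every index from the
     * answer upward fails, and index hi is already known to fail.
     * Returns the smallest failing index in [lo, hi].
     */
    static int firstFailing(int lo, int hi, IntPredicate fails) {
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (fails.test(mid)) {
                hi = mid;       // the culprit is at mid or earlier
            } else {
                lo = mid + 1;   // the culprit is after mid
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] probes = {0};
        // Pretend the bug was introduced at line 637 of a 1000-line file.
        int culprit = firstFailing(1, 1000, line -> {
            probes[0]++;
            return line >= 637;
        });
        System.out.println("first failing line: " + culprit + ", probes: " + probes[0]);
    }
}
```

Each probe throws away half of the remaining candidates, which is where the "1000 lines in about 10 comparisons" figure comes from.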

Step through your code in a debugger, line by line, to make sure you know what it is actually doing, not what you think it is doing. Very illuminating. If you don’t have access to a debugger, make liberal use of trace statements. In gcc I often use:

#define TRACE() printf("%s:%s:%d\n", __FILE__, __FUNCTION__, __LINE__)

There is something comparable in most languages.
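
For example, a rough Java counterpart might look like the sketch below. The class name `Trace` and the method `here()` are invented for illustration. It reads the caller's position from a stack trace, so it is far slower than the C macro and is best kept to debugging sessions.

```java
/** Rough Java counterpart of the C TRACE() macro: reports the caller's file, method and line. */
final class Trace {
    /** Returns "File.java:method:line" for the code that called here(). */
    static String here() {
        // Element 0 is here() itself; element 1 is whoever called it.
        StackTraceElement caller = new Throwable().getStackTrace()[1];
        return caller.getFileName() + ":" + caller.getMethodName() + ":" + caller.getLineNumber();
    }

    public static void main(String[] args) {
        // Prints the file, method and line of this call site.
        System.out.println(Trace.here());
    }
}
```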

Always work from success. When I start a new C program, I start from

#include <stdio.h>
int main(){ printf("Hello, World.\n"); return 0; }

And compile and run between each minor change.

Don’t be afraid to ask for help. A second set of eyes will often quickly see what you missed. Even more so, when you explain the code to someone else, you often solve the problem yourself. This is known as “verbal debugging.”

Candlepin: Metaphor for an Entitlement System

The planning meeting was held in Massachusetts. When brainstorming project names, someone mentioned that most New England of activities: Candlepin Bowling. Thus, the project is named Candlepin.

When describing a project, especially something fairly abstract like an entitlement system, you can clarify communication by using a strong metaphor for the system. So, to explain entitlements, I am going to use a bowling alley as my metaphor.

One way to think of an entitlement is this:

An entitlement is a contract that you can hook up your computer system to my content stream.

But for our metaphor, I’m going to say:

An entitlement is kinda like getting a lane at a bowling alley.

To which you say:

Huh?

Think about it.  When you go bowling, you pay money, but you don’t get a good, and you don’t get a service.  What you get is access to a resource for a limited time.  Say a small company wants to do a team building activity:

We’re going bowling!

This company has 18 employees.  So, we go over to Westgate Lanes (a nod to the local candlepin alley of my childhood; indulge me) and we walk to the main desk.  We’ve self-organized into six teams of three people each.  We get our shoes, and our group gets three lanes assigned to us.  We go, and each team pairs up with another team, the two teams select a lane from the three available, and they bowl.  After each game, the teams re-shuffle the match-ups, switch lanes and play another game.  When each team has played against all the other teams, we return our shoes and go home.

Here is how the analogy maps to entitlement management.

The Data Center is the Bowling Alley.

The lanes are the physical machines that the virtual machines will run on.

The company is still the company paying the bills.

The front desk is the assignment system where you buy slices of time on the machines of the data center.

The three lanes that our company is assigned have a communication network, because we all need to coordinate our games.  This is the VPN and VLAN setup that lets you specify that a cluster of machines can all work together.

The pin setter and the ball retrieval and the scoring projector are analogous to the resources required to run the programs.

The score card is the backing store for the database instance that your applications talk to.
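
To make the mapping a bit more concrete, here is a toy model in Java. None of these names come from Candlepin itself; `Entitlement`, `FrontDesk`, `assign`, and `release` are made up for the metaphor. The point is just that the desk hands out access to some lanes for a limited time, and nothing more.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical model: access to some lanes for a limited time. */
final class Entitlement {
    final String customer;
    final List<Integer> lanes;
    final Instant start;
    final Instant end;

    Entitlement(String customer, List<Integer> lanes, Instant start, Instant end) {
        this.customer = customer;
        this.lanes = lanes;
        this.start = start;
        this.end = end;
    }

    /** True from the start time up to, but not including, the end time. */
    boolean isActive(Instant now) {
        return !now.isBefore(start) && now.isBefore(end);
    }
}

/** Hypothetical front desk: the assignment system that hands out lanes. */
final class FrontDesk {
    private final Deque<Integer> freeLanes = new ArrayDeque<>();

    FrontDesk(int laneCount) {
        for (int lane = 1; lane <= laneCount; lane++) {
            freeLanes.add(lane);
        }
    }

    Entitlement assign(String customer, int count, Instant start, Duration length) {
        if (count > freeLanes.size()) {
            throw new IllegalStateException("not enough free lanes");
        }
        List<Integer> lanes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            lanes.add(freeLanes.remove());
        }
        return new Entitlement(customer, lanes, start, start.plus(length));
    }

    /** Returning the shoes: the lanes go back into the pool. */
    void release(Entitlement entitlement) {
        freeLanes.addAll(entitlement.lanes);
    }

    public static void main(String[] args) {
        FrontDesk desk = new FrontDesk(4);
        Entitlement night = desk.assign("Small Company", 3, Instant.now(), Duration.ofHours(2));
        System.out.println(night.customer + " has lanes " + night.lanes + " until " + night.end);
    }
}
```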

We can extend the metaphor to a larger world, too.  Say we have a bowling league that spans multiple towns and multiple bowling alleys.  This league is composed of teams.  The league sets the schedule, and the games are played at the various alleys throughout the district.  At the end of the season, the lead team from our league actually plays against the lead team from another league.

This reflects the hierarchical structure of resource management.  You can see that the bowling alley doesn’t really care about leagues except as a way to generate traffic through the alleys.  From the alley’s perspective, the league is just another customer, paying for lane time.  Perhaps in some cases the league pays for the time; in others, the individual teams do.  Authority to use a specific lane may have to be cleared not only through the clerk at the desk of the alley, but also through the league official who is managing a tournament.  It is just like when my company buys a chunk of virtual machines on a cloud somewhere, and then delegates them for internal usage.

Note that the metaphor works for internal clouds as well.  At the Really Big Company (RBC) campus, they take their bowling so seriously that they have a series of lanes installed into a building on their campus.  Now, the scheduling and resource management have been brought in house, but the rest of the rules still apply.

Guice gets it

I’m currently working on Candlepin, a Java web service application that uses Google Guice for dependency injection. I’ve long suspected that I would like Guice. I’m not a big fan of annotations, but I’ll admit that there are some things that you can’t do in Java using other approaches.  Guice makes appropriate use of annotations to make dependency injection work in a type-safe and intelligent manner.

I had to break a dependency in order to make a class testable.  Here’s the old interface:

@Inject
public JavascriptEnforcer(
     DateSource dateSource,
     RulesCurator rulesCurator,
     ProductServiceAdapter prodAdapter );

The problem with this is that RulesCurator is a database access class, which means I need a DB for my unit test.   What the JavascriptEnforcer specifically needs is the product of the database query.

Here’s a piece of code from the body of the constructor that I also want to remove.

    
    ScriptEngineManager mgr = new ScriptEngineManager();
    jsEngine = mgr.getEngineByName("JavaScript");

This ties us to one creation strategy.  I want to use a different one in my unit test.

My first step was to create a new constructor, and have all of the code I want to extract in the old constructor.  This would probably have been sufficient for the unit tests, although it might risk a class not found for the DB stuff.  Here’s the new constructor interface:

public JavascriptEnforcer(
    DateSource dateSource,
    Reader rulesReader,
    ProductServiceAdapter prodAdapter,
    ScriptEngine jsEngine) ;
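
With a Reader passed in, a unit test can supply the rules from memory instead of from a database. The sketch below shows the idea with a hypothetical stand-in class, `RulesHolder`, rather than the real JavascriptEnforcer. It only assumes that the class reads its rules from whatever Reader it is handed.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a class that needs the rules text, not the database. */
final class RulesHolder {
    private final String rules;

    RulesHolder(Reader rulesReader) {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        try {
            int count;
            while ((count = rulesReader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        this.rules = text.toString();
    }

    String rules() {
        return rules;
    }

    public static void main(String[] args) {
        // In a unit test, the "database query result" is just a string.
        RulesHolder holder = new RulesHolder(new StringReader("var allowed = true;"));
        System.out.println(holder.rules());
    }
}
```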

Here’s where Guice shined.  I need two custom components here, one which fulfills the rulesReader dependency, the other which fulfills the jsEngine dependency.  The jsEngine one is easier, and I’ll show that.  First, create a custom Provider.  A Provider is a factory.  For the jsEngine, that factory just looks like this:

package org.fedoraproject.candlepin.guice;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import com.google.inject.Provider;
public class ScriptEngineProvider implements Provider<ScriptEngine> {
    public ScriptEngine get() {
        return new ScriptEngineManager()
            .getEngineByName("JavaScript");
    }
}

Which we then register with our custom module:

class DefaultConfig extends AbstractModule {
    @Override
    public void configure() {
        ...
        bind(ScriptEngine.class)
            .toProvider(ScriptEngineProvider.class);
    }
}

Now comes the interesting one.  We want a Reader.  But a Reader is a very common class, and we don’t want to force everything that needs a Reader to use the same creation strategy.  Here is where Guice uses annotations.

public class RulesReaderProvider implements Provider<Reader> {
    private RulesCurator rulesCurator;
    @Inject
    public RulesReaderProvider(RulesCurator rulesCurator) {
        super();
        this.rulesCurator = rulesCurator;
    }
    public Reader get() {
        return new StringReader(rulesCurator.getRules().getRules());
    }
}

Note how the Provider itself is a component.  This allows us to use another component, the RulesCurator, without creating a direct dependency between the two classes.  Still, this does not distinguish one reader from another. That happens with another bind call.

public void configure() {
    ...
    bind(ScriptEngine.class)
        .toProvider(ScriptEngineProvider.class);
    bind(Reader.class)
        .annotatedWith(Names.named("RulesReader"))
        .toProvider(RulesReaderProvider.class);
}

Then, the inject Annotation for our Enforcer looks like this:

    
    @Inject
    public JavascriptEnforcer(
        DateSource dateSource, 
        @Named("RulesReader") Reader rulesReader, 
        [...] 
        ScriptEngine jsEngine)

The key here is that the @Named matches between the two components.
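
One way to picture why the names have to match: the binding is looked up by type and name together. The sketch below is a toy registry written only to show that idea. It is not Guice and not how Guice is implemented, and `ToyRegistry`, `bind`, and `get` are made-up names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

/** Toy illustration of named bindings: providers are keyed by (type, name). */
final class ToyRegistry {
    private static final class Key {
        final Class<?> type;
        final String name;

        Key(Class<?> type, String name) {
            this.type = type;
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key that = (Key) other;
            return type.equals(that.type) && Objects.equals(name, that.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(type, name);
        }
    }

    private final Map<Key, Supplier<?>> bindings = new HashMap<>();

    <T> void bind(Class<T> type, String name, Supplier<? extends T> provider) {
        bindings.put(new Key(type, name), provider);
    }

    <T> T get(Class<T> type, String name) {
        Supplier<?> provider = bindings.get(new Key(type, name));
        if (provider == null) {
            throw new IllegalStateException("no binding for " + type.getName() + " named " + name);
        }
        return type.cast(provider.get());
    }

    public static void main(String[] args) {
        ToyRegistry registry = new ToyRegistry();
        // Two providers for the same type, told apart only by name.
        registry.bind(String.class, "RulesReader", () -> "rules from the curator");
        registry.bind(String.class, "Greeting", () -> "hello");
        System.out.println(registry.get(String.class, "RulesReader"));
    }
}
```

If the name at the injection point does not match the name used when binding, the lookup simply finds nothing, which is the mismatch to watch for.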

Could Maven use a single directory for archives?

Maven is too important a part of too many projects for most Java developers to ignore. However, some of the decisions made in building with Maven are suspect, mostly the blind download of binary files from a remote repository. While Maven gets more and more Open Source clean, there are still issues, and the biggest is building Maven itself. Both Debian and Fedora have fairly old versions of Maven, in the range of 2.0.7 as of this writing. Considering that the GA is 2.2.0 and there is work on 3.0, we risk a pretty serious divide in the Open Source Java world if we don’t keep up with Maven, and get a clean way to build it.


My “Two Main Problems With Java” Rant

This is not an anti-Java rant per se.  It is a rant about the two main things missing from the language that force people into code-heavy workarounds.

Java has two flaws that hurt programmers using the language.  The first is that the reflection API does not provide the parameter names for a function.  The second is that Java allows null pointers.  This article explains why these two flaws are the impetus for many of the workarounds that require a lot of coding to do simple things.  This added complexity in turn leads to code that is harder to maintain and less performant.
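
The first point can be checked with a few lines of reflection. A caveat: the `java.lang.reflect.Parameter` API used below only exists from Java 8 onward, and it reports real names only when the class was compiled with `javac -parameters`; otherwise the names are synthesized placeholders. The class `ParamNames` and its `transfer` method are made up for this sketch.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.Arrays;

final class ParamNames {
    // Hypothetical method whose parameter names we would like to recover.
    static void transfer(String account, int amount) {
    }

    /** Returns what reflection reports as the parameter names of a method in this class. */
    static String[] namesOf(String methodName, Class<?>... types) {
        try {
            Method method = ParamNames.class.getDeclaredMethod(methodName, types);
            Parameter[] params = method.getParameters();
            String[] names = new String[params.length];
            for (int i = 0; i < params.length; i++) {
                // Without -parameters at compile time, isNamePresent() is false
                // and getName() returns a placeholder such as "arg0".
                names[i] = params[i].getName()
                        + (params[i].isNamePresent() ? "" : " (synthesized)");
            }
            return names;
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(namesOf("transfer", String.class, int.class)));
    }
}
```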


The overhead of Java

A programming language is a tool. When choosing the right tool for the job, you want to have good information about it. I’ve worked with both C and Java, and have dealt with a lot of misconceptions about both. I’m going to try to generate some data to help guide discussions about the different languages. Consider this, then, as the next installment of my comparison of programming languages that I started in my IPv6 days.


Code Review Checklist

What follows are the results of a brainstorming session on items that should be in a code review checklist.  As you can see, it needs refining and grouping.  Please feel free to add comments with any items you think should be on it, with any organizational approaches, or any criticism.  Right now, I want to focus on being inclusive instead of exclusive, so please don’t recommend removing things:  that will happen later.
