Java Web Applications in Fedora

Fedora and Debian serve as a kind of charm school for many chaotic projects: there they learn to play nice with a lot of other projects. In Fedora, as near as I can tell, there is only one Java-based web application packaged as part of the distribution: Dogtag, the Public Key Infrastructure server. As we look at how PKI should look in the future, the dearth of comparable applications packaged for Fedora gives us the opportunity to define a logical and simple standard packaging scheme. While I am not there yet, this post is the start of my attempts to organize my thoughts on the subject. I’m looking for input.


Finding Java Classes

I’m back on a Java project. Been a while, and I want to capture some of the tricks I’m using.

Right now, I’m just trying to import the project into Eclipse. Seems that the current team members don’t use it. I’m an IDE kind of guy, at least when it comes to Java.

Building the .classpath file can be tricky. However, since I know that I have a good build, and that this project is a good participant in the Fedora build process, I have the advantage of knowing that my packages reside in /usr/share/java. Still, all Eclipse gives me is a set of classes that it can’t find. How to find them?

This project uses CMake.  I could look for all of the Jar files in the CMakeLists.txt files, and I might do that in the future.  However, a trick I’ve developed in the past has come in handy.

 

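# Convert a dotted class name to the path form used inside a Jar
class2path(){
echo "$1" | sed 's!\.!/!g'
}

JDIR=/usr/share/java

# Write one "jar class-file" line for every class in every Jar under $JDIR
make_alljars(){
for JAR in `find "$JDIR" -name \*.jar -type f`
	do for CLASS in `jar -tf "$JAR" | grep '\.class$'`
		do echo "$JAR" "$CLASS"
	done
done > /tmp/alljars.txt
}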

 

First, the make_alljars function creates a map in (value key) order. The value is the Jar file name, and the key is the class name. To find a Jar file that contains a given class (in this example netscape.ldap.LDAPConnection), run:

 

grep `class2path  netscape.ldap.LDAPConnection` /tmp/alljars.txt

And the output is

/usr/share/java/ldapjdk.jar netscape/ldap/LDAPConnection$ResponseControls.class
/usr/share/java/ldapjdk.jar netscape/ldap/LDAPConnection.class


This works really well with Eclipse, in that the error messages include the name of the class. Highlight the class name, paste it into the command line in place of the class I have above, and when you get the Jar file name, highlight it to save it to the clipboard. From the right-click context menu, pick Java Build Path and then Add External Archive, and paste the whole path in.
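The same lookup can also be sketched in Java, skipping the cache file and asking each Jar directly whether it holds the class. This is only a rough alternative to the shell functions above: the class name JarFinder and its methods are made up for illustration, and unlike the grep it matches the exact class only, not its inner classes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of any project mentioned here.
class JarFinder {
    /** Same job as class2path, plus the suffix: a.b.C becomes a/b/C.class */
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Lists every regular (non-symlink) Jar under dir that contains className. */
    static List<Path> find(Path dir, String className) throws IOException {
        String entry = entryName(className);
        List<Path> jars;
        try (Stream<Path> walk = Files.walk(dir)) {
            jars = walk.filter(p -> p.toString().endsWith(".jar"))
                       .filter(p -> Files.isRegularFile(p, LinkOption.NOFOLLOW_LINKS))
                       .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path jar : jars) {
            try (JarFile jf = new JarFile(jar.toFile())) {
                if (jf.getEntry(entry) != null) {
                    hits.add(jar);
                }
            } catch (IOException e) {
                // Skip archives that cannot be opened or are corrupt
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Falls back to the class used in the example above
        String name = args.length > 0 ? args[0] : "netscape.ldap.LDAPConnection";
        for (Path jar : find(Paths.get("/usr/share/java"), name)) {
            System.out.println(jar);
        }
    }
}
```

Opening every Jar on each query is slower than grepping a prebuilt cache, so the cache is still the better fit when there are many lookups to do.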

What Jar File

When working with a new project, I often find I am searching for the Jar files that fulfill a dependency. Sometimes they come from Maven, sometimes from the Fedora RPMs. My approach has been to build a cache from the Jar files in the directories that I care about, containing a map from Jar file name to class name:

#!/bin/bash

CACHE_FILE=/tmp/jarcache

# start with an empty cache file
: > "$CACHE_FILE"

for DIR in /usr/share/java /usr/lib/java
do
    # -type f skips the symlinked versions
    for JAR in `find "$DIR" -name \*.jar -type f`
    do
        for CLASS_FILE in `jar -tf "$JAR" | grep '\.class$'`
        do
            CLASS=`echo "$CLASS_FILE" | sed 's!/!.!g'`
            echo "$JAR" "$CLASS" >> "$CACHE_FILE"
        done
    done
done

Then call it this way:

grep "org.mozilla.jss.ssl" /tmp/jarcache

Removing empty comment blocks

Eclipse can automate a lot of stuff for you. One thing it did for me was automating the serialVersionUID generation for all the serializable classes in my tree.
They look like this:

    /**
     *
     */
     private static final long serialVersionUID = -9031744976450947933L;

However, it put an empty block comment in on top of them, something I didn’t notice until I had mixed in this commit with another. So, I want to remove those empty comment blocks.

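#!/bin/bash

for JAVAFILE in `find . -name \*.java`
do
     # slurp the whole file into the hold space, then delete each
     # three-line empty comment block in a single substitution
     sed -n '1h;1!H;${;g;s! */\*\*\n *\* *\n *\*/ *\n!!g;p;}' \
         < "$JAVAFILE" > "$JAVAFILE.new"
     mv "$JAVAFILE.new" "$JAVAFILE"
done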

Thanks to this article for how to do the multiline search and replace.
http://austinmatzko.com/2008/04/26/sed-multi-line-search-and-replace/
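For comparison, here is a rough Java sketch of the same substitution. It reuses the regular expression from the sed script unchanged and assumes Unix line endings; the class name EmptyCommentStripper is invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical helper that mirrors the sed expression above.
class EmptyCommentStripper {
    // An indented "/**" line, a line holding only "*", and the closing "*/"
    // line, each terminated by a newline.
    private static final Pattern EMPTY_BLOCK =
        Pattern.compile(" */\\*\\*\\n *\\* *\\n *\\*/ *\\n");

    /** Returns source with every empty three-line comment block removed. */
    static String strip(String source) {
        return EMPTY_BLOCK.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        String before =
            "class A {\n" +
            "    /**\n" +
            "     *\n" +
            "     */\n" +
            "    private static final long serialVersionUID = 1L;\n" +
            "}\n";
        System.out.print(strip(before));
    }
}
```

Because a Java String already holds the whole file, there is no need for the hold-space trick that sed requires to match across lines.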

Binary Java Runner

I’ve been recently thinking about the scripts that are used to run Java processes. Firing a script interpreter and running a slew of processes before running your program seems really inefficient. Additionally, programs meant to run on a Fedora system can make a few assumptions that general Java programs can’t.

Jar files should be in /usr/share/java or /usr/lib/java. We don’t want people putting files all over the place. Most of the scripts are doing classpath-related operations.

Here is my sample code:

class Hello{
    public static void main(String[] args){
        for (int i =0 ; i < args.length; i++){
            System.out.println ("Hello "+args[i]);
        }
    }
}

Which I can compile into a Jar file pretty easily. I'll call the jar hello.jar. The launcher is a small C program:

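#include <stdio.h>
#include <unistd.h>

int main()
{

/* the argument vector handed to execve must end with NULL */
char *const argv[] = {"/usr/bin/java","-cp","hello.jar","Hello",NULL};
char *newenviron[] = { NULL };

execve(argv[0], argv,newenviron);
/* only reached if execve fails */
perror("execve");
return 1;
}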

This serves as a starting point.

But, really, do we want to have to ship a Jar file and a separate executable? For command line utilities, the basic classes they require should be statically bound into the application. We can convert the Jar file into data and link it into the application. ld derives the symbol names (such as _binary_hello_jar_start) from the input path as given, so run it from the directory that holds hello.jar:

ld -r -b binary -o hello.o hello.jar
cc     runjava.c hello.o   -o runjava

Now is where things get interesting.

First, we need a way to load this embedded Jar file so that its classes can be found. As a proof of concept, the C program can read the data section and spit it out to a jar file.


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* symbols that ld generates for the embedded hello.jar */
extern const char _binary_hello_jar_start[];
extern const char _binary_hello_jar_end[];

const char * temp_file_template = "hello.jar.XXXXXX";
char * temp_file = NULL;

void build_jar(){
  temp_file = strdup(temp_file_template);
  if (temp_file == NULL){
    perror("strdup");
    return;
  }

  /* mkstemp fills in the XXXXXX and returns an open descriptor,
     so there is no need to open the file a second time */
  int fd = mkstemp(temp_file);
  if (fd < 0){
    perror("mkstemp");
    return;
  }
  size_t len = (size_t)(_binary_hello_jar_end - _binary_hello_jar_start);
  ssize_t sz = write(fd, _binary_hello_jar_start, len);

  if (sz < 0){
    perror("write");
  }

  close(fd);
}



This produces the same output as if we had run it from the initial example.

But, this is wasteful. This example will litter a copy of the jar file into your directory. You can't delete it when you are done; exec does not return to your code.

So the next step is, instead of running using exec, run using JNI.

#include <jni.h>
#include <stdlib.h>
#include <string.h>

/*run the executable using an embedded JVM*/
void run_embedded(){
  JNIEnv          *env;
  JavaVM          *jvm;
  JavaVMInitArgs  vm_args;
  JavaVMOption    options[2];
  int             option_count;
  jint            res;
  jclass          main_class;
  jmethodID       main_method;
  int i;

  const char * classpath_prefix = "-Djava.class.path=";
  int classpath_option_length =
    strlen(classpath_prefix) + strlen(temp_file) + 1;

  char * classpath_option =
    malloc(classpath_option_length);

  memset(classpath_option, 0, classpath_option_length);

  strcat(classpath_option, classpath_prefix);
  strcat(classpath_option, temp_file);

  option_count=1; //set to 2 for jni debugging
  options[0].optionString = classpath_option;

  options[1].optionString = "-verbose:jni";

  /* Specifies the JNI version used */
  vm_args.version  = JNI_VERSION_1_6;
  vm_args.options  = options;
  vm_args.nOptions = option_count;
  /* JNI won't complain about unrecognized options */
  vm_args.ignoreUnrecognized = JNI_TRUE;
  res = JNI_CreateJavaVM(&jvm,(void **)&env,&vm_args);
  free(classpath_option);

  if (res < 0 ){
    exit(res);
  }
  const char * classname =   "Hello";
  main_class   = (*env)->FindClass(env, classname);
  if (check_error(main_class,env)) {
      return;
    }

  main_method = (*env)->GetStaticMethodID(env,main_class,"main",
                               "([Ljava/lang/String;)V");

  if (check_error(main_method,env)) {
    return;
  }

  char *message[5]= { "first", "second", "third", "fourth", "fifth"};

  jstring blank = (*env)->NewStringUTF(env, "");

  jobjectArray arg_array = (jobjectArray)(*env)->NewObjectArray
    (env,5,
     (*env)->FindClass(env,"java/lang/String"),
     blank);
  if (check_error(arg_array,env)) {
    return;
  }

  for(i=0;i<5;i++) {
    jstring str = (*env)->NewStringUTF(env, message[i]);
    (*env)->SetObjectArrayElement
      ( env,arg_array,i,str);
  }

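  /* main is declared void, so use the void call and look for a
     pending exception instead of a return value */
  (*env)->CallStaticVoidMethod(env, main_class, main_method, arg_array);

  if ((*env)->ExceptionCheck(env)) {
    (*env)->ExceptionDescribe(env);
  }
}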

Note that we have now linked against a version of the JVM, and that commits us to a given JDK. This is just a proof of concept; this whole copy-the-Jar approach is just a stepping stone.

What we really want is a classloader which plays by the following rules:

  • lets the rt.jar classloader safely load the basic java, javax, and other JRE classes, as it would for a normal executable
  • makes sure that all the classes for our application get loaded from our specified jar file
  • reads our specified jar file directly out of the ELF section of our binary executable

It appears to me that the standard classloading approach is fine for our needs with one exception: we can't treat an ELF file as if it were a Jar file. If we could do that, we could throw argv[0] on the front of the -Djava.class.path command line switch and be off and running. This would have the added benefit of creating a classloader extension that could be used from other Java programs as well. I'm currently thinking it should be of the form file+elf://path-to-file/linked_jarfile_name, as that format would allow us to link in multiple jar files.

However, to be really efficient, we don't want to have to pay the price for the OS file operations (open, read) again, since the jar file is already loaded into memory. We should be able to create a protocol which tells the classloader: use this InputStream to load the class, and the JNI code can then pass the input stream.
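To make the InputStream idea a little more concrete, here is a minimal sketch of a classloader that reads a whole jar from a stream and defines classes from the bytes it holds in memory. It is an assumption-laden illustration, not the design described above: the class name StreamJarClassLoader and the hasEntry helper are invented, and the part where JNI code hands over a stream backed by the ELF section is left out entirely. Any InputStream stands in for it here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;

// Hypothetical loader: reads every entry of a jar from a stream up front.
class StreamJarClassLoader extends ClassLoader {
    // entry name (for example "Hello.class") mapped to its raw bytes
    private final Map<String, byte[]> entries = new HashMap<>();

    StreamJarClassLoader(InputStream jarData, ClassLoader parent) throws IOException {
        super(parent);
        try (JarInputStream jar = new JarInputStream(jarData)) {
            JarEntry entry;
            while ((entry = jar.getNextJarEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = jar.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                entries.put(entry.getName(), out.toByteArray());
            }
        }
    }

    // Called only after the parent loader has failed to find the class,
    // so java.* and javax.* classes still come from the JRE.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = entries.get(name.replace('.', '/') + ".class");
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, bytes, 0, bytes.length);
    }

    /** Illustrative helper for checking what was read from the stream. */
    boolean hasEntry(String entryName) {
        return entries.containsKey(entryName);
    }
}
```

The default parent-first delegation in ClassLoader covers the first rule in the list. Note that it does not strictly guarantee the second: if the parent can also see a class with the same name, the parent's copy wins.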

Guice gets it

I’m currently working on Candlepin, a Java web service application that uses Google Guice for dependency injection. I’ve long suspected that I would like Guice. I’m not a big fan of annotations, but I’ll admit that there are some things that you can’t do in Java using other approaches. Guice makes appropriate use of annotations to make dependency injection work in a type-safe and intelligent manner.

I had to break a dependency in order to make a class testable.  Here’s the old interface:

@Inject
public JavascriptEnforcer(
     DateSource dateSource,
     RulesCurator rulesCurator,
     ProductServiceAdapter prodAdapter );

The problem with this is that RulesCurator is a database access class, which means I need a DB for my unit test.   What the JavascriptEnforcer specifically needs is the product of the database query.

Here’s a piece of code from the body of the constructor that I also want to remove.

    
    ScriptEngineManager mgr = new ScriptEngineManager();
    jsEngine = mgr.getEngineByName("JavaScript");

This ties us to one creation strategy. I want to use a different one in my unit test.

My first step was to create a new constructor, and have all of the code I want to extract in the old constructor.  This would probably have been sufficient for the unit tests, although it might risk a class not found for the DB stuff.  Here’s the new constructor interface:

public JavascriptEnforcer(
    DateSource dateSource,
    Reader rulesReader,
    ProductServiceAdapter prodAdapter,
    ScriptEngine jsEngine) ;
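The payoff of taking a Reader instead of a RulesCurator is that a unit test can hand over the rules text directly. The sketch below shows just that idea with a made-up stand-in class, RulesConsumer. It is not the Candlepin JavascriptEnforcer, and it leaves out the other three constructor arguments.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for a class that only needs the rules text,
// so it accepts a Reader rather than a database-backed curator.
class RulesConsumer {
    private final String rules;

    RulesConsumer(Reader rulesReader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = rulesReader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        this.rules = sb.toString();
    }

    String rules() {
        return rules;
    }

    public static void main(String[] args) throws IOException {
        // A test supplies the rules from a string; no database is involved.
        RulesConsumer consumer =
            new RulesConsumer(new StringReader("var allowed = true;"));
        System.out.println(consumer.rules());
    }
}
```

In production code the Reader would come from wherever the rules really live, which is the gap the Guice providers fill.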

Here’s where Guice shined. I need two custom components here, one which fulfills the rulesReader dependency, the other which fulfills the jsEngine. The jsEngine one is easier, and I’ll show that. First, create a custom Provider. A Provider is a factory. For the jsEngine, that factory just looks like this:

package org.fedoraproject.candlepin.guice;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import com.google.inject.Provider;
public class ScriptEngineProvider implements Provider<ScriptEngine> {
    public ScriptEngine get() {
        return new ScriptEngineManager()
            .getEngineByName("JavaScript");
    }
}

Which we then register with our custom module:

class DefaultConfig extends AbstractModule {
    @Override
    public void configure() {
        ...
        bind(ScriptEngine.class)
            .toProvider(ScriptEngineProvider.class);
    }
}

Now comes the interesting one. We want a Reader. But a Reader is a very common class, and we don’t want to force everything that needs a Reader to use the same creation strategy. Here is where Guice uses annotations.

import java.io.Reader;
import java.io.StringReader;
import com.google.inject.Inject;
import com.google.inject.Provider;
public class RulesReaderProvider implements Provider<Reader> {
    private RulesCurator rulesCurator;
    @Inject
    public RulesReaderProvider(RulesCurator rulesCurator) {
        this.rulesCurator = rulesCurator;
    }
    public Reader get() {
        return new StringReader(rulesCurator.getRules().getRules());
    }
}

Note how the Provider itself is a component.  This allows us to use another component, the RulesCurator, without creating a direct dependency between the two classes.  Still, this does not distinguish one reader from another. That happens with another bind call.

public void configure() {
    ...
    bind(ScriptEngine.class)
        .toProvider(ScriptEngineProvider.class);
    bind(Reader.class)
        .annotatedWith(Names.named("RulesReader"))
        .toProvider(RulesReaderProvider.class);
}

Then, the @Inject annotation for our Enforcer looks like this:

    
    @Inject
    public JavascriptEnforcer(
        DateSource dateSource, 
        @Named("RulesReader") Reader rulesReader, 
        [...] 
       ScriptEngine jsEngine)

The key here is that the @Named matches between the two components.