I’ve recently been thinking about the scripts that are used to run Java processes. Firing up a script interpreter and running a slew of processes before running your program seems really inefficient. Additionally, programs meant to run on a Fedora system can make a few assumptions that general Java programs can’t.
Jar files should be in /usr/share/java or /usr/lib/java. We don’t want people putting files all over the place. Most of the scripts are doing classpath-related operations.
Here is my sample code:
    class Hello {
        public static void main(String[] args) {
            for (int i = 0; i < args.length; i++) {
                System.out.println("Hello " + args[i]);
            }
        }
    }
Which I can compile into a Jar file pretty easily. I'll call the jar hello.jar. A minimal C launcher for it just execs the JVM:
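    #include <stdio.h>
    #include <unistd.h>

    int main() {
        /* execve requires the argument vector to end with NULL */
        char *const argv[] = {"/usr/bin/java", "-cp", "hello.jar", "Hello", NULL};
        char *newenviron[] = { NULL };
        execve(argv[0], argv, newenviron);
        /* execve only returns on failure */
        perror("execve");
        return 1;
    }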
This serves as a starting point.
But, really, do we want to have to ship a Jar file and a separate executable? For command line utilities, the basic classes that they require should be statically bound into the application. We can convert the Jar file into data and link it into the application:
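    ld -r -b binary -o hello.o hello.jar
    cc runjava.c hello.o -o runjava

ld derives the symbol names from the input path exactly as given, so it is run here against hello.jar in the current directory. That yields the _binary_hello_jar_start and _binary_hello_jar_size symbols used in the code below.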
Now is where things get interesting.
First, we need a way to load classes from this embedded Jar file. As a proof of concept, the C program can read the data section and spit it out to a jar file.
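    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    /* Symbols generated by ld -b binary for hello.jar */
    extern const char _binary_hello_jar_start;
    extern const char _binary_hello_jar_size;

    const char *temp_file_template = "hello.jar.XXXXXX";
    char *temp_file = NULL; /* set by build_jar() */

    void build_jar() {
        temp_file = malloc(strlen(temp_file_template) + 1);
        if (temp_file == NULL) {
            perror("malloc");
            return;
        }
        strcpy(temp_file, temp_file_template);
        /* mkstemp creates the file and returns an open descriptor for it */
        int fd = mkstemp(temp_file);
        if (fd < 0) {
            perror("mkstemp");
            return;
        }
        /* The address of the _size symbol encodes the length of the data */
        ssize_t sz = write(fd, &_binary_hello_jar_start,
                           (size_t)&_binary_hello_jar_size);
        if (sz < 0) {
            perror("write");
        }
        close(fd);
    }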
This produces the same output as if we had run it from the initial example.
But, this is wasteful. This example will litter a copy of the jar file into your directory. You can't delete it when you are done; exec does not return to your code.
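Short of embedding the JVM, one way around this would be to fork before the exec, so that the parent process survives to delete the file. The following is only a minimal sketch of that idea: run_and_cleanup is a hypothetical helper name, and it uses execv rather than execve, so the child inherits the caller's environment instead of the empty one used earlier.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a child process, wait for it to
 * finish, then delete temp_path. Returns the child's exit status, or -1
 * if the child could not be started or did not exit normally. */
int run_and_cleanup(char *const argv[], const char *temp_path)
{
    int status = 0;
    int result = -1;
    pid_t pid = fork();

    if (pid == 0) {
        /* Child: replace this process image with the target program. */
        execv(argv[0], argv);
        perror("execv");
        _exit(127); /* only reached if execv failed */
    }
    if (pid < 0) {
        perror("fork");
    } else if (waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
        result = WEXITSTATUS(status);
    }

    /* Parent: the child is done (or never started), so the file can go. */
    if (unlink(temp_path) != 0) {
        perror("unlink");
    }
    return result;
}
```

With the temporary jar from build_jar, the call would look something like run_and_cleanup(argv, temp_file), where argv is {"/usr/bin/java", "-cp", temp_file, "Hello", NULL}. That still costs an extra process and a copy of the jar on disk.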
So the next step is, instead of running using exec, run using JNI.
    #include <jni.h>
    #include <stdlib.h>
    #include <string.h>

    /* Run the program using an embedded JVM.
     * check_error() is a helper that is not shown here. */
    void run_embedded() {
        JNIEnv *env;
        JavaVM *jvm;
        JavaVMInitArgs vm_args;
        JavaVMOption options[2];
        int option_count;
        jint res;
        jclass main_class;
        jmethodID main_method;
        int i;

        const char *classpath_prefix = "-Djava.class.path=";
        int classpath_option_length = strlen(classpath_prefix) + strlen(temp_file) + 1;
        char *classpath_option = malloc(classpath_option_length);
        memset(classpath_option, 0, classpath_option_length);
        strcat(classpath_option, classpath_prefix);
        strcat(classpath_option, temp_file);

        option_count = 1; /* set to 2 for JNI debugging */
        options[0].optionString = classpath_option;
        options[1].optionString = "-verbose:jni";

        /* Specifies the JNI version used */
        vm_args.version = JNI_VERSION_1_6;
        vm_args.options = options;
        vm_args.nOptions = option_count;
        /* JNI won't complain about unrecognized options */
        vm_args.ignoreUnrecognized = JNI_TRUE;

        res = JNI_CreateJavaVM(&jvm, (void **)&env, &vm_args);
        free(classpath_option);
        if (res < 0) {
            exit(res);
        }

        const char *classname = "Hello";
        main_class = (*env)->FindClass(env, classname);
        if (check_error(main_class, env)) {
            return;
        }
        main_method = (*env)->GetStaticMethodID(env, main_class, "main",
                                                "([Ljava/lang/String;)V");
        if (check_error(main_method, env)) {
            return;
        }

        char *message[5] = { "first", "second", "third", "fourth", "fifth" };
        jstring blank = (*env)->NewStringUTF(env, "");
        jobjectArray arg_array = (*env)->NewObjectArray(
            env, 5, (*env)->FindClass(env, "java/lang/String"), blank);
        if (check_error(arg_array, env)) {
            return;
        }
        for (i = 0; i < 5; i++) {
            jstring str = (*env)->NewStringUTF(env, message[i]);
            (*env)->SetObjectArrayElement(env, arg_array, i, str);
        }

        /* main returns void, so call it with CallStaticVoidMethod */
        (*env)->CallStaticVoidMethod(env, main_class, main_method, arg_array);
        if ((*env)->ExceptionCheck(env)) {
            (*env)->ExceptionDescribe(env);
            return;
        }
    }
Note that we have now linked against a version of the JVM, and that commits us to a given JDK. This is just a proof of concept; the whole copy-the-Jar approach is just a stepping stone.
What we really want is a classloader which plays by the following rules:
- lets the rt.jar classloader safely load the basic java, javax, and other JRE classes as a normal executable
- makes sure that all the classes for our application get loaded from our specified jar file
- reads our specified jar file directly out of the elf section of our binary executable
It appears to me that the standard classloading approach is fine for our needs with one exception: we can't treat an ELF file as if it were a Jar file. If we could do that, we could throw argv[0] on the front of the -Djava.class.path command line switch and be off and running. This would have the added benefit of creating a classloader extension that could be used from other Java programs as well. I'm currently thinking it should be of the form: file+elf://path-to-file/linked_jarfile_name
as that format would allow us to link in multiple jar files.
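To make the proposed form concrete, here is a rough sketch of how native code might split such a URL into the ELF file path and the name of the jar linked into it. Everything here is an assumption for illustration: parse_elf_url is a hypothetical function, and treating the last path component as the linked jar name is just one reading of the format above.

```c
#include <string.h>

#define ELF_URL_PREFIX "file+elf://"

/* Hypothetical parser for file+elf://path-to-file/linked_jarfile_name.
 * Copies the ELF file path into `path` and the linked jar name into `jar`.
 * Returns 0 on success, -1 if the URL is malformed or a buffer is too small. */
int parse_elf_url(const char *url, char *path, size_t path_len,
                  char *jar, size_t jar_len)
{
    size_t prefix_len = strlen(ELF_URL_PREFIX);
    const char *rest;
    const char *slash;
    size_t plen;

    if (strncmp(url, ELF_URL_PREFIX, prefix_len) != 0) {
        return -1;
    }
    rest = url + prefix_len;

    /* The last '/' separates the ELF file from the jar linked into it. */
    slash = strrchr(rest, '/');
    if (slash == NULL || slash == rest || slash[1] == '\0') {
        return -1;
    }
    plen = (size_t)(slash - rest);
    if (plen >= path_len || strlen(slash + 1) >= jar_len) {
        return -1;
    }
    memcpy(path, rest, plen);
    path[plen] = '\0';
    strcpy(jar, slash + 1);
    return 0;
}
```

Under this reading, file+elf:///usr/bin/runjava/hello.jar names the jar hello.jar inside the executable /usr/bin/runjava, and a second jar linked into the same binary would differ only in the last component.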
However, to be really efficient, we don't want to have to pay the price for the OS file operations (open, read) again, since the jar file is already loaded into memory. We should be able to create a protocol which tells the classloader: use this InputStream to load the class, and the JNI code can then pass the input stream.
This is a bit of a micro-optimization exercise, but either way there are better ways to do it. Here’s one of them:
http://en.wikipedia.org/wiki/Binfmt_misc
This was more a thought experiment gone mad than anything else. Binfmt doesn’t help with the classpath building, either, nor does it provide a framework for integrating Java with native libraries; that is what JNI is for. I also want to standardize how the classpath building works for a subset of the Java applications in the distro, and being able to manage things at the ELF level will provide much of the same approach as systemd has for service starting. I want to get socket activation working, as well as using native sockets with something like Tomcat so it can listen on ports below 1024 without having to run with root privileges.
Really, the whole approach is to make Java development more of a team player in a Linux distribution. Java is a great cross-platform development language, but that doesn’t mean it can’t be used for Linux-specific tasks. This is just a spike in that direction.