GUI and Java and Javascript and Software 08 Mar 2010 12:42 pm

Instead of using Google Web Toolkit, RichFaces, Django, or any other server-side technology to build the user interface, we should use JavaScript and a client-side library like jQuery.



C++ and Java and Software 13 Feb 2010 07:49 pm

Dependency injection can throw you into analysis paralysis. Here are some rules of thumb.

Either a dependency is assigned for the lifetime of an object or it is passed in as the parameter of a method.

The mechanism that performs dependency injection is itself a dependency. Limit its scope to the edges of a use case.

I’ve worked on numerous distributed systems. Two of the abstractions used in Java distributed systems are Remote Method Invocation (RMI) and Java Message Service (JMS). Both have their uses, and inversion of control should play nicely with either.

In both cases, the edges of the use case are where remote requests come across the wire. JMS very naturally provides a tie-in with an inversion of control container in the mapping of a newly received message to the code that is supposed to handle it:

Handler messageHandler = context.get(message.getType());
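A minimal sketch of that lookup, assuming the context is little more than a map from message type to handler. Message, Handler, and Context here are hypothetical stand-ins written for this example, not JMS or any particular container's API:

```java
import java.util.HashMap;
import java.util.Map;

class HandlerLookupSketch {
    // Hypothetical message: only the type and a body matter for this sketch.
    static final class Message {
        private final String type;
        private final String body;
        Message(String type, String body) { this.type = type; this.body = body; }
        String getType() { return type; }
        String getBody() { return body; }
    }

    // Hypothetical handler contract.
    interface Handler {
        String handle(Message message);
    }

    // Stand-in for the inversion of control container: maps a message type to its handler.
    static final class Context {
        private final Map<String, Handler> handlers = new HashMap<String, Handler>();
        void register(String type, Handler handler) { handlers.put(type, handler); }
        Handler get(String type) {
            Handler handler = handlers.get(type);
            if (handler == null) {
                throw new IllegalArgumentException("No handler registered for " + type);
            }
            return handler;
        }
    }

    // The edge of the use case: the only place that talks to the context.
    static String onMessage(Context context, Message message) {
        Handler messageHandler = context.get(message.getType());
        return messageHandler.handle(message);
    }

    public static void main(String[] args) {
        Context context = new Context();
        context.register("munge", m -> "munged " + m.getBody());
        System.out.println(onMessage(context, new Message("munge", "payload")));
    }
}
```

Everything downstream of onMessage receives its collaborators as plain parameters and never sees the context.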

In an RMI system, each remote object may provide a tie-in to the context. If there are no resources that are allocated solely for the purpose of processing the request, there is no need to create a new context. On the other hand, remote objects tend to be long lived, and often need access to resources only for a short time. Thus the remote method will often need to create a thread-scoped context for object resolution. While this context can be provided by a proxy, this leads to a really awkward setup where the remote object knows nothing about the context. If objects downstream from the remote object need access to the context, they have to get it by magic, and you end up with the same type of nasty code that you get in most JEE applications: hard-coded factories, JNDI lookups and the like. If the client calls:

remoteObject->munge(myMessage);

The remote object has code like:

void munge(MyMessage myMessage) {
    // Hard-coded factory lookup: the dependency is invisible to callers and tests.
    Resource r = ResourceFactory.getInstance().create();
}

One alternative is to have the dependencies passed in to the object that implements the remote interface.  The awkwardness now is that the caller and implementer have two different contracts.

The client calls

remoteObject->munge(myMessage);

But the remote object implements

void munge(MyMessage myMessage, Resource resource);

Injection at the edges would look like this:

void munge(MyMessage myMessage) {
    munge(myMessage, new Context<Resource>().get());
}

Here the remote object has transparency into the creation of the object. In order to write a unit test, we can call the two-parameter version of munge with a mock Resource object. The main difference between the Context version and the Factory version is that the Context version unifies the object creation mechanism.
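A small sketch of the two-contract arrangement and the test it enables. Resource, Context, and RemoteObject are hypothetical stand-ins. As a variation on the snippet above, the context is handed to the remote object's constructor rather than created inside the method:

```java
class EdgeInjectionSketch {
    // Hypothetical resource needed only while a request is processed.
    interface Resource {
        String apply(String input);
    }

    // Hypothetical stand-in for the Context: one place that knows how to build a T.
    interface Context<T> {
        T get();
    }

    static final class RemoteObject {
        private final Context<Resource> resources;

        RemoteObject(Context<Resource> resources) { this.resources = resources; }

        // Remote contract: what the client sees. Resolution happens here, at the edge.
        String munge(String myMessage) {
            return munge(myMessage, resources.get());
        }

        // Implementation contract: the dependency is an explicit parameter.
        String munge(String myMessage, Resource resource) {
            return resource.apply(myMessage);
        }
    }

    public static void main(String[] args) {
        RemoteObject remote = new RemoteObject(() -> input -> input.toUpperCase());
        // Resolved through the context, as a remote call would be.
        System.out.println(remote.munge("hello"));
        // A unit test can skip the context and pass a mock directly.
        System.out.println(remote.munge("hello", s -> "mock:" + s));
    }
}
```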

As an aside: if you need a dependency for only a very short time in the interior of a use case, you can use a lazy-load proxy. I don’t advise this, but it is an option. The first problem with this approach is that it doesn’t provide a clean way to clean up once the object is no longer required. The second is that object creation can fail, and the calling object may not cleanly handle that.

Java and Philosophy 11 Feb 2010 12:43 pm

This is not an anti-Java rant per se. It is a rant about the two main things missing from the language that force people into code-heavy workarounds.

Java has two flaws that hurt programmers using the language.  The first is that the reflection API does not provide the parameter names for a function.  The second is that Java allows null pointers.  This article explains why these two flaws are the impetus for many of the workarounds that require a lot of coding to do simple things.  This added complexity in turn leads to code that is harder to maintain and less performant.

An object in Java lacks features found in many true “Object Oriented” programming languages. You can’t add properties or methods to an object after it has been created. You need another abstraction for that kind of stuff: the map. But Java provides introspection of objects that makes them map-like. An Object in Java is the “realization” of a Class, which is a set of rules. The class exists to allow the programmer to define new rules about what a set of objects will do. The idea is that the Class is the primary abstraction available to the programmer. An Object has pre-conditions and post-conditions for any operations: this will be true before and after this method is called. These invariants are enforced by the Class of the Object.

This is the theory. In practice, most Java classes violate this. One part of the problem is built into the language design: garbage collection. In C++ it is pretty common for an object to represent a resource. Create the object, allocate the resource. Free the object, release the resource. But garbage collection comes with a price: you don’t know when your objects are freed, which means you can’t tie your resources to their object lifespans. This is unfortunate, but it is a limitation of the language with which I can live.

However, just because we can’t tie cleanup to resource release doesn’t mean we should allocate invalid objects. Yet this is done all over the place. Let’s look at the dependency injection model called setter injection. Create an object using the no-argument constructor, then call set, set, set, and when you are done, you have an initialized object. Note that type 1, or interface injection, is really just a more type-safe way to do the same thing. There is no way of telling what is the minimum amount of work we have to do to get a valid object. Do we have to call set on all properties? The language already has a mechanism for answering this question. That is what the constructor is supposed to do. Type 3 injection, constructor injection, then, looks like it should be the default way to go. Why is it then so underused?
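To make the contrast concrete, here is a small sketch. The class names and fields are made up for illustration:

```java
class InjectionStylesSketch {
    // Setter injection: the no-argument constructor yields an object that is not yet valid.
    static final class SetterConnection {
        private String url;
        private String user;
        void setUrl(String url) { this.url = url; }
        void setUser(String user) { this.user = user; }
        // Throws a NullPointerException if either setter was skipped.
        String describe() { return user.trim() + "@" + url.trim(); }
    }

    // Constructor injection: the constructor states the minimum needed for a valid object.
    static final class ConstructorConnection {
        private final String url;
        private final String user;
        ConstructorConnection(String url, String user) {
            if (url == null || user == null) {
                throw new IllegalArgumentException("url and user are required");
            }
            this.url = url;
            this.user = user;
        }
        String describe() { return user + "@" + url; }
    }

    public static void main(String[] args) {
        SetterConnection half = new SetterConnection();
        half.setUrl("jdbc:example"); // setUser was forgotten; nothing complains yet
        try {
            half.describe();
        } catch (NullPointerException e) {
            System.out.println("setter-injected object was used before it was complete");
        }
        System.out.println(new ConstructorConnection("jdbc:example", "app").describe());
    }
}
```

With the setter version the mistake surfaces only when the object is used. With the constructor version it cannot be made without the compiler or the constructor objecting.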

Imagine a language that gave you a map, but no way to use the key. You could enumerate through all of the values, check their types, do all sorts of cool things, but you couldn’t look up values from the key. Programmers would probably complain. Yet the introspection of parameters in a java.lang.reflect.Method is limited to types, not the names themselves. The same is true of a java.lang.reflect.Constructor object. We can get a collection of types, even a collection of annotations, but not a simple collection of strings for the names. Even if we could, there would be no way to match each name with the object passed in as the parameter.

Assume that you want to create an object of type DatabaseConnection. To create this, you need a user ID, a password, and a JDBC URL. Three strings. Those of you who use objects like this regularly will notice that I changed the order. The JDBC API usually has it as URL, UID, password. If all you know is that your API takes three strings, what order do you put them in? You have to read the API docs, which is really not that useful if we want to make this an automated process. Ideally, the names of the parameters in the constructor would tell us which is which.
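A short sketch of what the reflection calls described here actually hand back, using a hypothetical DatabaseConnection with the three-string constructor (the helper name is invented for this example):

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;

class ParameterTypesSketch {
    // Hypothetical class with the three-string constructor discussed above.
    static final class DatabaseConnection {
        DatabaseConnection(String userId, String password, String jdbcUrl) { }
    }

    // Returns the simple type names reflection reports for the only declared constructor.
    static String[] constructorParameterTypes(Class<?> type) {
        Constructor<?> constructor = type.getDeclaredConstructors()[0];
        Class<?>[] parameterTypes = constructor.getParameterTypes();
        String[] names = new String[parameterTypes.length];
        for (int i = 0; i < parameterTypes.length; i++) {
            names[i] = parameterTypes[i].getSimpleName();
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints [String, String, String]: the types, with no hint of which is which.
        System.out.println(Arrays.toString(constructorParameterTypes(DatabaseConnection.class)));
    }
}
```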

Note that if we used a specific type for UID, Password, and URL, we would have a guaranteed solution:  match the types of the parameters with the types of the objects that fill the dependencies.  But as soon as you have two objects of the same type, or any amount of casting, the policy becomes non-deterministic.
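A sketch of that type-matching policy, assuming one wrapper class per dependency. UserId, Password, JdbcUrl, and the create helper are all invented for this example:

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

class TypeMatchingSketch {
    // Hypothetical single-purpose types replacing three indistinguishable Strings.
    static final class UserId { final String value; UserId(String v) { value = v; } }
    static final class Password { final String value; Password(String v) { value = v; } }
    static final class JdbcUrl { final String value; JdbcUrl(String v) { value = v; } }

    static final class DatabaseConnection {
        final String summary;
        DatabaseConnection(UserId user, Password password, JdbcUrl url) {
            summary = user.value + "@" + url.value;
        }
    }

    // Fills each constructor parameter with the supplied object of exactly that type.
    static <T> T create(Class<T> type, Object... dependencies) {
        Map<Class<?>, Object> byType = new HashMap<Class<?>, Object>();
        for (Object dependency : dependencies) {
            byType.put(dependency.getClass(), dependency);
        }
        Constructor<?> constructor = type.getDeclaredConstructors()[0];
        Class<?>[] parameterTypes = constructor.getParameterTypes();
        Object[] arguments = new Object[parameterTypes.length];
        for (int i = 0; i < parameterTypes.length; i++) {
            arguments[i] = byType.get(parameterTypes[i]);
            if (arguments[i] == null) {
                throw new IllegalStateException("No dependency of type " + parameterTypes[i]);
            }
        }
        try {
            constructor.setAccessible(true);
            return type.cast(constructor.newInstance(arguments));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not construct " + type, e);
        }
    }

    public static void main(String[] args) {
        // The order in which the dependencies are supplied no longer matters.
        DatabaseConnection c = create(DatabaseConnection.class,
                new JdbcUrl("jdbc:example"), new Password("secret"), new UserId("app"));
        System.out.println(c.summary);
    }
}
```

Because byType is keyed on the exact class, a second dependency of the same type would silently overwrite the first, which is the non-determinism described above.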

C++ aside: C++ suffers from this just as much as Java. C++ doesn’t even pretend to provide as much run-time introspection, but the failure there is just as bad. It is interesting to note that in modern compilers, C, and by extension C++, has allowed named parameters for structures, which can be used for this type of introspection, albeit a very chatty and non-runtime type. Anyone who suggests trying to demangle C++ functions will quickly see that A) any solution is non-portable and B) you lose the parameter names anyway.

Java has one other critical failing: null pointers. If Java required that every reference had a valid object connected to it, most of the justification for the Bean API would fall away. If we defaulted most properties to final, the majority of objects would be immutable, and a whole slew of concurrency exceptions would fall away. We would then just have to deal with the cases where a property was supposed to always exist, but be mutable. This is why we have classes in the first place, and so these types of classes would be more common: wrap a primitive, but provide additional rules about what values it can assume. Without null pointers, there would be no need for the Bean API.
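As an illustration of “wrap a primitive, but provide additional rules”, here is a minimal sketch. PortNumber is a made-up example class, not something from a library:

```java
class PortNumberSketch {
    // Hypothetical value class: wraps an int and restricts the values it can assume.
    static final class PortNumber {
        private final int value; // final: once constructed, the object cannot become invalid

        PortNumber(int value) {
            if (value < 1 || value > 65535) {
                throw new IllegalArgumentException("port out of range: " + value);
            }
            this.value = value;
        }

        int value() { return value; }
    }

    public static void main(String[] args) {
        System.out.println(new PortNumber(5432).value());
        try {
            new PortNumber(0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```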

Note that it would be easy to simulate a null pointer using a collection, or even an iterator.  A Collection would be empty.  An iterator would throw an exception to indicate that there was no “next” object.  This kind of Null pointer exception would be the exception, not the rule.
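A sketch of that simulation using only java.util. The nicknameFor lookup and its data are invented for this example:

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class AbsenceSketch {
    // Hypothetical lookup: "no result" is an empty list, never a null reference.
    static List<String> nicknameFor(String user) {
        if ("alice".equals(user)) {
            return Collections.singletonList("Al");
        }
        return Collections.<String>emptyList();
    }

    public static void main(String[] args) {
        // Iterating over an empty collection is simply a loop that runs zero times.
        for (String nickname : nicknameFor("bob")) {
            System.out.println("never printed: " + nickname);
        }
        // Asking an exhausted iterator for an element is the explicit failure case.
        Iterator<String> none = nicknameFor("bob").iterator();
        try {
            none.next();
        } catch (NoSuchElementException e) {
            System.out.println("no next element");
        }
        System.out.println(nicknameFor("alice").get(0));
    }
}
```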

These two rules, “no null objects” and “parameter name info”, would significantly reduce the quantity of code written in Java while increasing reliability and correctness.

Java and Software 09 Feb 2010 11:59 am

A programming language is a tool. When choosing the right tool for the job, you want to have good information about it. I’ve worked with both C and Java, and have dealt with a lot of misconceptions about both. I’m going to try to generate some data to help guide discussions about the different languages. Consider this, then, as the next instalment of the comparison of programming languages that I started in my IPv6 days.

This article will start with the simplest of comparisons: what is the overhead of starting a process in C and in Java? Here’s my setup:

I am currently running Fedora 11.

My gcc version is 4.4.1 20090725.

My version of Java is 1.6.0_0 from OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode).

Here is my C Code:

int main(){
while(1){};
return 0;
}

It was compiled with:

gcc noop.c -o noop

Here is my Java code:

public class NoOp{

public static void main(String[] args){
while(true){

}
}
}

To compile that Java code I ran

javac NoOp.java

I then ran both:

./noop &

java NoOp &

Looking at the basics:

diff /proc/12067/status /proc/12069/status
1,4c1,4
< Name:    noop
< State:    R (running)
< Tgid:    12067
< Pid:    12067
---
> Name:    java
> State:    S (sleeping)
> Tgid:    12069
> Pid:    12069
12,13c12,13
< VmPeak:        3884 kB
< VmSize:        3756 kB
---
> VmPeak:     2559076 kB
> VmSize:     2493592 kB
15,17c15,17
< VmHWM:         308 kB
< VmRSS:         308 kB
< VmData:          40 kB
---
> VmHWM:       12304 kB
> VmRSS:       12304 kB
> VmData:     2451848 kB
19,22c19,22
< VmExe:           4 kB
< VmLib:        1548 kB
< VmPTE:          32 kB
< Threads:    1
---
> VmExe:          32 kB
> VmLib:       10632 kB
> VmPTE:         228 kB
> Threads:    12
28c28
< SigCgt:    0000000000000000
---
> SigCgt:    0000000181005ccf
37,38c37,38
< voluntary_ctxt_switches:    1
< nonvoluntary_ctxt_switches:    506847
---
> voluntary_ctxt_switches:    2
> nonvoluntary_ctxt_switches:    2

There are two things that jump out. First, memory usage for both processes seems incredibly high. For a no-op C program to require, at any point in its lifespan, 3884 kB seems quite high. The Java one, at a massive 2559076 kB, borders on the absurd. Java does have the excuse that the java launcher has parameters that set minimum and maximum memory usage, so it is possible that a good chunk of that memory was allocated by a system policy.

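Another thing that jumps out is the context switches. Something is forcing the C program to switch roughly 500k times. The Java program shows no such switching, although its status reports 12 threads and a sleeping state, so the busy loop is likely running on a thread other than the one these counters describe.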
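To repeat the comparison without eyeballing diff output, the fields can be pulled out programmatically. A sketch, assuming the "Name: value kB" layout shown above; the class and method names are made up for this example, and the sample text is copied from the diff rather than read from a live /proc entry:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ProcStatusSketch {
    // Parses "Key:   value" lines, as found in /proc/<pid>/status, into a map.
    static Map<String, String> parse(String statusText) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        for (String line : statusText.split("\n")) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                fields.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
            }
        }
        return fields;
    }

    // Returns a memory field such as VmRSS in kB, stripping the trailing unit.
    static long kiloBytes(Map<String, String> fields, String name) {
        String value = fields.get(name);
        if (value == null || !value.endsWith("kB")) {
            throw new IllegalArgumentException("not a kB field: " + name);
        }
        return Long.parseLong(value.substring(0, value.length() - 2).trim());
    }

    public static void main(String[] args) {
        // Sample values taken from the two status files compared above.
        Map<String, String> c = parse("Name:\tnoop\nVmRSS:\t     308 kB\nThreads:\t1\n");
        Map<String, String> jvm = parse("Name:\tjava\nVmRSS:\t   12304 kB\nThreads:\t12\n");
        System.out.println("resident kB, C: " + kiloBytes(c, "VmRSS")
                + ", Java: " + kiloBytes(jvm, "VmRSS"));
    }
}
```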

For the files mapped into each process we have, for Java:

cat /proc/12069/maps | awk '{print $6}' | sort -u

/lib64/ld-2.10.1.so
/lib64/libc-2.10.1.so
/lib64/libdl-2.10.1.so
/lib64/libm-2.10.1.so
/lib64/libnsl-2.10.1.so
/lib64/libnss_files-2.10.1.so
/lib64/libpthread-2.10.1.so
/lib64/librt-2.10.1.so
/lib64/libz.so.1.2.3
/tmp/hsperfdata_ayoung/12069
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/jli/libjli.so
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/libjava.so
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/libverify.so
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/libzip.so
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/native_threads/libhpi.so
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/libjvm.so
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/ext/gnome-java-bridge.jar
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/ext/pulse-java.jar
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/rt.jar

Whereas the C program has

cat /proc/12067/maps | awk '{print $6}' | sort -u

/home/ayoung/devel/noop/noop
/lib64/ld-2.10.1.so
/lib64/libc-2.10.1.so

In both cases I removed some garbage from the output, but left the list of files.

OK, one thing worth noticing is that Java holds on to the C libraries for dynamic library loading and for threading. Perhaps a better comparison would include those, too.

Database 03 Feb 2010 02:17 pm

One undervalued aspect of data modeling is that you actually get time to consider the form of the data before you get the data. In a MapReduce job, you know that your map phase is going to get the data, and that it is not going to be normalized. I could have said not likely to be normalized, but the reality is that if you are using MapReduce, you are not going to get structured data.

The map step is where you deal with this. You take the data in its CLOB form and you turn it into a series of key-value pairs. Strictly speaking, this isn’t a map, it is a relation. In a map, every element of the domain has a single element in the codomain, or range as I learned it. In Hadoop and MapReduce, there is no restriction that a given key always return a unique value, although I suspect that in practice it probably should. Actually, since all of the values for a given key are collected into a list, technically you do get a map, just not at the end of this stage, and really nowhere in the system do you ever see all the elements of that map. Just a sublist.
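The relation-versus-map distinction is easier to see in code. Below is a plain-Java sketch, deliberately not the Hadoop API, of a map step that emits key-value pairs from unstructured lines and a grouping step that collects the values for each key into a list. The word-count style keys are just an example:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapStepSketch {
    // Map step: turn one unstructured line into zero or more (key, value) pairs.
    // The same key may be emitted several times, so the output is a relation.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<String, Integer>(word.toLowerCase(), 1));
            }
        }
        return pairs;
    }

    // Grouping: collect every value emitted for a key into one list.
    // Only here does each key have a single image, which makes it a map.
    static Map<String, List<Integer>> group(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                List<Integer> values = grouped.get(pair.getKey());
                if (values == null) {
                    values = new ArrayList<Integer>();
                    grouped.put(pair.getKey(), values);
                }
                values.add(pair.getValue());
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Prints {a=[1, 1], b=[1, 1], c=[1]}
        System.out.println(group(Arrays.asList("a b a", "b c")));
    }
}
```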

Regardless of the mathematical correctness of the term “map”, a MapReduce program has a step which is responsible for creating a structured representation from an unstructured one. This is very similar to what a developer does when they have to take some data format and decide how to store it in a DBMS. The assumption is that the DBMS is necessary for processing the events afterwards.

Thus a MapReduce operation defers the cost of normalizing the data, but then potentially pays it multiple times. When using an RDBMS, you pay the price for normalizing the data upon data entry, and that price is then amortized over all the queries of the data. Thus the comparison between MapReduce and SQL can be viewed as an economic decision.

Database and Networking and Sysadmin 28 Jan 2010 08:30 am

Say you want to set up postgres for use with a web application. If you are running on the same server here’s what you need to do:

If the technology you are using is smart enough to use the domain socket for local connections, apply the following diff to

/var/lib/pgsql/data/pg_hba.conf

# "local" is for Unix domain socket connections only
+local myapp-db myapp-user password
local all all ident sameuser

If, on the other hand, you need TCP connections (which is the case for JDBC) you probably want this:

# IPv4 local connections:
-host    all         all         127.0.0.1/32          ident sameuser
+host    all         all         127.0.0.1/32          md5

Although, again, you should probably change “all all” to be specific to your application.

You need to restart the postgres database server to have these changes take effect.

To test this, first create the user with the following command:

sudo -u postgres /usr/bin/createuser --pwprompt myapp-user

This will create the user you want, and prompt you for a password. To create the database itself:

sudo -u postgres /usr/bin/createdb myapp-db "My App backend data storage" -O myapp-user

To test local connections (domain socket) run

psql myapp-db -U myapp-user

You should be prompted for the password.

To test tcp connectivity, run

psql -h localhost myapp-db -U myapp-user

And again, you should be prompted for your password.  Some alternative tests to try, to make sure you “get it.”

  • Create an alternative database as the same user as your application user.  Make sure that Postgres rejects that account from psql when connecting using a domain socket.
  • Attempt to connect to the alternative database as a remote user.  You should be allowed in.
  • Try this from a remote machine.  You should be rejected across the board.

An entry in the pg_hba.conf file allowing a specific remote machine to connect should look like this:

host    myapp-db         myapp-user         192.168.1.1/32          md5

Tested only on RHEL 5 and Fedora 11, but this should work for Linux-based PostgreSQL setups. I suspect it works on Windows as well, but I have not tested it. The path to the config file will be very different.
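For the JDBC case mentioned earlier, a connection check might look like the sketch below. It assumes the PostgreSQL JDBC driver is on the classpath, reuses the myapp-db and myapp-user names from this post, and takes the password as the first command-line argument. The class and helper names are invented for this example:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class JdbcCheckSketch {
    // Builds a PostgreSQL JDBC URL; JDBC always goes over TCP, hence the host.
    static String jdbcUrl(String host, String database) {
        return "jdbc:postgresql://" + host + "/" + database;
    }

    public static void main(String[] args) {
        String url = jdbcUrl("localhost", "myapp-db");
        String password = args.length > 0 ? args[0] : "";
        // With the md5 line in pg_hba.conf, a wrong password should be rejected here.
        try (Connection connection = DriverManager.getConnection(url, "myapp-user", password)) {
            System.out.println("connected to " + url);
        } catch (SQLException e) {
            System.out.println("connection failed: " + e.getMessage());
        }
    }
}
```

Without the driver jar on the classpath, this prints a connection failure rather than throwing.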

Software 12 Jan 2010 09:29 am

What follows are the results of a brainstorming session on items that should be in a code review checklist.  As you can see, it needs refining and grouping.  Please feel free to add comments with any items you think should be on it, with any organizational approaches, or any criticism.  Right now, I want to focus on being inclusive instead of exclusive, so please don’t recommend removing things:  that will happen later.

  • Does this code swallow any exceptions? (Bad)
  • Does this code create any new unnecessary dependencies? (Bad)
  • Does this code introduce any performance bottlenecks? (Bad)
  • Is this code threadsafe? (Good)
  • If this code is used only in a single thread, does it use any synchronization? (Bad)
  • Are data transfer objects serializable? (Good)
  • Does it have a unit test? (Good)
  • Does it have a functional test? (Good)
  • Was the unit test run at 100%? (Good)
  • Was the functional test run, and were there any additional test failures? (Bad)
  • Does this code change a public API? (Bad) If so, is the change backwards compatible? (Good)
  • Does this code have any functions that are more than 30 lines. (Bad)
  • Does this code have any magic numbers or string literals (Bad)
  • Does this code use appropriate internationalization mechanisms for all text visible to the end user (Good)
  • Does this checkin remove Lava code?
  • Is this code platform specific? Should it be?
  • Is this code intended to run on the server, the client, or the managed platform.
  • Does this code reproduce functionality that is done elsewhere in the code base.
  • Does this code use an external library with an incompatible license?
  • If this code uses a scripting language, does it hide errors in type safety?
  • Is this code reusable?
  • Does this code go against the coding style of the rest of the project?
  • Has this code been reviewed?
  • Does this code implement the design specified?
  • Has any user interface been reviewed by UX?
  • Has any Database interaction been reviewed by a DBA?
  • Does this code introduce any network roundtrips?
  • Is the message size for any network communication larger than an ethernet packet frame.
  • Can this code gracefully handle a network failure?
  • Does this code use any deliberate casting?
  • Does this code use any APIs that will not be available to it at runtime?
  • Is this code understandable by someone that is not on the project?
  • Do all files have appropriate Copyright and license headers?
  • Do all public APIs have appropriate Javadoc/Doxygen/perldoc information?
  • Does this code introduce any unnecessary complexity?
  • Does this code work based on undocumented assumptions?
  • Have you made any changes in the code since the last time you ran through the unit and functional tests?
  • Have you stepped through the code in a debugger?
  • Are all reads and writes performed completely, or is there the possibility of missing information?
  • Are any created classes usable when just their constructors have been called, or do they require additional property sets afterwards?
  • Are all resources released when they are no longer needed?
  • Are fields that cannot be changed tagged final/const?
  • If this class is going to be called via the bean API, are the appropriate fields exposed via getters and setters?
  • Do property setters handle null? Is null a valid option for them?
  • Must any code in this checkin live inside a transaction boundary?
  • Does any of this code directly manipulate a resource that is supposed to be encapsulated inside some other class or abstraction?
  • Does this code handle all possible exceptions that can be triggered by the code it calls into?
  • Does the code follow the project’s coding standard?
  • Have all files been run through a code formatter set to project specifications?

Contributions from Victor Erminpour

  • Have we run static analysis tools on this code?
  • Does this code introduce any unintended side effects?
  • Does this code exist somewhere else?
  • Is this code generic and maintainable (i.e., if someone changes a class member, will my function still work?)
  • What’s the performance impact of the code?
  • Can it be optimized?
  • Is the code secure?
  • Does it introduce any buffer/heap exploits?

Family and History 02 Dec 2009 06:12 pm

The members of the team had rolled out the resilite mats in the back gym. The air was barely heated, so they had been hard to the touch as the boys rolled them in three straight sheets. The kinetic energy of a pair of teenage boys transferred to the friction of the shoes applied a shearing force that would separate untaped mats. That was acceptable during a normal practice, when the mats would be shared by a half dozen pairs at once. During a real match they would be taped together, to prevent them from separating during the bouts. The tape was an expense that the cash-strapped athletic department wouldn’t waste on a practice. But there was no risk of separation during the opening half of this practice. The mats were rimmed with spectators, the members of the team focused on the two participants in the center. During a normal practice, the mats might be rolled out with either side up. The lesser used side had five circles, laid out like the dots on a die showing 5.



Family and History 28 Nov 2009 06:47 pm

Edith Ambrose Nursery School 1975.

JBoss and Java 24 Nov 2009 07:28 am

These are my notes on how to reverse engineer what tags are doing in a JSF application. In this case, I am trying to figure out which classes are behind the Configuration tags in RHQ, and specifically what is being done by the tag

onc:config

This tag is activated with the following value at the top of the page:

xmlns:onc="http://jboss.org/on/component"

To figure out what this tag means, I look in WEB-INF/web.xml.  The web.xml value

facelets.LIBRARIES

lets me know where to find the file that defines the acceptable tags I can add to an xhtml page for this component.

/WEB-INF/tags/on.component.taglib.xml

This taglib defines the values

tag-name config
component-type org.jboss.on.Config
renderer-type org.jboss.on.Config

Note that these are JSF component names, and not Java class names.  To resolve these to Java classes, we need to find the mappings.  The mapping files are defined in web.xml under the entry:

javax.faces.CONFIG_FILES

In particular, I found what I wanted in

/WEB-INF/jsf-components/configuration-components.xml

The values I care about are:

component-type org.jboss.on.Config
component-class org.rhq.core.gui.configuration.ConfigUIComponent

and the renderer for Config and ConfigurationSet components

component-family rhq
renderer-type org.jboss.on.Config
renderer-class org.rhq.core.gui.configuration.ConfigRenderer

This renderer extends javax.faces.render.Renderer.  This is a Flyweight that parses and generates HTML.  It has two main methods: decode and encode. decode parses the request that comes in; encode injects HTML into the response that goes out.

Decode appears to be called only on a post.  Encode seems to be called even on a get.
