Building QGo on RHEL 7.5

I’ve played Go for years. I’ve found that having a graphical Go client has helped me improve my game immensely. And, unlike many distractions, I can make a move, then switch back into work mode without really losing my train of thought.

I have always liked the QGo client. I have found it worthwhile to build and run it from the git repo. After moving to RHEL 7.5 for my desktop, I had to go through the process again. Here is the short version.


Debugging Techniques

Always look at the top error message in the spew that comes out from the compiler. This is usually where the problem is triggered. Junk after that is often spurious, especially if the first problem is a syntax error.

Use a revision control system like Git or Mercurial, and make a checkpoint of your code whenever you have made a significant addition. This way, you don’t get into the situation where you had it working, you broke it, and you can’t get it running again. Git has a little bit of a learning curve, but it rocks. If you are comfortable posting the code where other people can see it, you can push your changes to GitHub, and then if you have a catastrophic machine failure, you don’t lose your work.

The binary search is your friend. If you know that the error is somewhere between lines 10 and 20, comment out lines 15-20 and see if you still have the problem. If so, comment out lines 13 and 14, and so on. A binary search means that you can search 1000 lines of code in 10 comparisons, and save yourself a metric shit-ton of time. A metric shit-ton is an industry term.
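The halving idea can be sketched in C++. This is only an illustration: first_bad_line and the still_fails callback are hypothetical names, and the sketch assumes the failure is monotone (once enough lines are enabled to trigger it, enabling more lines keeps triggering it).

```cpp
#include <cstddef>
#include <functional>

// Returns the index of the first line whose inclusion makes the failure
// appear, or n if the failure never appears.
// still_fails(k) is a hypothetical callback: it reports whether the problem
// reproduces when only lines [0, k] are enabled.
std::size_t first_bad_line(std::size_t n,
                           const std::function<bool(std::size_t)>& still_fails) {
    std::size_t lo = 0;  // every line before lo is known to be innocent
    std::size_t hi = n;  // the culprit, if there is one, is at hi or earlier
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (still_fails(mid)) {
            hi = mid;       // culprit is at mid or earlier
        } else {
            lo = mid + 1;   // culprit is after mid
        }
    }
    return lo;
}
```

Each pass through the loop halves the remaining range, which is where the "1000 lines in 10 comparisons" figure comes from.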

Step through your code in a debugger, line by line, to make sure you know what it is actually doing, not what you think it is doing. Very illuminating. If you don’t have access to a debugger, make liberal use of trace statements. In gcc I often use:

#define TRACE() printf("%s:%s:%d\n", __FILE__, __FUNCTION__, __LINE__)

There is something comparable in most languages.
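As one example, a C++ stream flavored variant might look like the sketch below. The helper name trace_text is hypothetical; __func__ is the standard spelling of what many compilers also accept as __FUNCTION__.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Builds the "file:function:line" text that the trace macro prints.
inline std::string trace_text(const char* file, const char* func, int line) {
    std::ostringstream out;
    out << file << ':' << func << ':' << line;
    return out.str();
}

// Writes the current location to stderr so it does not mix with program output.
#define TRACE() (std::cerr << trace_text(__FILE__, __func__, __LINE__) << '\n')
```

Dropping TRACE(); at the top of each suspect function gives a crude but reliable picture of the path the program actually took.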

Always work from success. When I start a new C program, I start from

#include <stdio.h>
int main(){ printf("Hello, World.\n"); return 0; }

And compile and run between each minor change.

Don’t be afraid to ask for help. A second set of eyes will often quickly see what you missed. Even more so, when you explain the code to someone else, you often solve the problem yourself. This is known as “verbal debugging.”

Refactoring: Introduce Builder

As I move more and more towards Immutable objects, I find myself extracting the building logic out from the usage of an object on a regular basis.  The basic format of this pattern is

mutable object =>  immutable object + builder

There are two aspects to using an object: creating it, and reading its state. This pattern splits those two aspects up into different objects, providing a thread safe and scalable approach to sharing objects without resorting to locking. Modifications of the object are reflected by creating a new object, and then swapping out the immutable instance.

There are two ways to go about making the builder. If the object is created as part of user interface or networking code, it makes sense to use a generic map object, and in the build method, confirm that all required properties are there.
So code like:

if (inputStringKey.equals("x")) obj.setX(inputStringValue);

becomes

builder.set(inputStringKey, inputStringValue);
obj = builder.build();

If, on the other hand, the original object tends to be modified by other code, it makes sense to split the object so that the setters are on the builder object.  Thus the above code becomes:

builder.setX( x);
obj = builder.build();

The immutable object doesn’t need getters. Expose its state as public fields instead, and make those fields immutable themselves. In Java, this means tagging them as final; in C++, make them const.
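A minimal C++ sketch of the split, under the assumption of a two-field value object. Point and PointBuilder are hypothetical names chosen for illustration.

```cpp
// Immutable value: public const fields, no getters, no setters.
struct Point {
    const int x;
    const int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Short-lived mutable builder that owns the setters.
class PointBuilder {
    int x_ = 0;
    int y_ = 0;
public:
    PointBuilder& setX(int x) { x_ = x; return *this; }
    PointBuilder& setY(int y) { y_ = y; return *this; }
    // Each call produces a fresh immutable object.
    Point build() const { return Point(x_, y_); }
};
```

A change is expressed by building a new object, for example PointBuilder().setX(p.x + 1).setY(p.y).build(), rather than by mutating p.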

If the immutable object is fetched from a data store, there are two approaches to updating that data store. The simplest and least flexible is to have the builder update the store as part of the build function. This mixes functionality, and means that you can only update a single object per transaction, but it does increase consistency of viewing the object across the system. The more common approach is to have a deliberate call to store the immutable object in the data store. Often, this is a transactional store, with multiple immutable objects sent in bulk as an atomic commit.

Here are the steps to perform this refactoring:

1. Append the word builder to the class name

2. Create a new internal class for holding the immutable state of the object and give it the name of the original object. Give the builder an instance member field of the internal type. The internal class should have a no-args constructor.

3. Push all fields down to the new class by moving the fields.

4. Add a method called build that returns the internal type.

5.  Replace all calls of the form getProperty()  with

x=  builder.build() ;

x.property;

It is important to use the local variable, and to share it amongst all the places in the code that reference the local object.  If you don’t you will end up with one instance per reference, probably not what you want.

6.  Remove the getProperty methods on the builder.

7.  Give the internal class a constructor that takes all of its fields as parameters.  It should throw a standard exception if one of the parameters is null.

8. Change the build method so that it returns a new instance that gets all of its fields set via the new constructor. This instance should just use the fields in the internal instance of the (soon-to-be) immutable inner object. You will have to provide dummy initializers for the newly immutable fields. The build method should throw an exception if any of the values are missing or invalid. It is usually required that you return all errors, not just the first one. Thus this exception requires a collection of other errors. These can be exceptions, or some other custom type, depending on the needs of your application.

9. For each setProperty method on the builder, provide a copy of the field in the builder object. Make the corresponding field in the inner object immutable, and change the build method to use the field in the builder instead.

10. When you have finished step 9, you should have an immutable inner object. Provide an appropriate copy constructor.

11. Remove the no-args constructor from the immutable object.

12. Remove the default values for the immutable fields.
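The end state of these steps can be sketched in C++ as follows. This is a simplified illustration, not the only possible shape: Account, AccountBuilder, and BuildError are hypothetical names, the immutable class is written at top level rather than nested, and the error collection is a plain list of strings.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Exception that carries every validation problem, not just the first.
class BuildError : public std::runtime_error {
public:
    explicit BuildError(std::vector<std::string> problems)
        : std::runtime_error("invalid builder state"),
          problems_(std::move(problems)) {}
    const std::vector<std::string>& problems() const { return problems_; }
private:
    std::vector<std::string> problems_;
};

// The immutable object: const fields, set only through the constructor.
struct Account {
    const std::string owner;
    const int balance;
    Account(const std::string& o, int b) : owner(o), balance(b) {}
};

// The builder keeps its own mutable copy of each field (step 9).
class AccountBuilder {
    std::string owner_;
    int balance_ = 0;
    bool balanceSet_ = false;
public:
    void setOwner(const std::string& owner) { owner_ = owner; }
    void setBalance(int balance) { balance_ = balance; balanceSet_ = true; }

    // Validates everything, reports all problems at once, then constructs.
    Account build() const {
        std::vector<std::string> problems;
        if (owner_.empty()) {
            problems.push_back("owner is missing");
        }
        if (!balanceSet_) {
            problems.push_back("balance is missing");
        } else if (balance_ < 0) {
            problems.push_back("balance is negative");
        }
        if (!problems.empty()) {
            throw BuildError(problems);
        }
        return Account(owner_, balance_);
    }
};
```

Calling build() on an empty AccountBuilder reports both missing values in a single exception, which is the behavior step 8 asks for.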

You can now create a new builder object based on a map. The keys of the map should be the names of the fields of the builder. The build method pulls the elements out of the map and uses them as the parameters for the constructor. This approach showcases one of the great limitations of Java introspection: parameter names are lost after compilation. We only have access to the types and the order of the parameters. Thus, maintaining this map would be error prone.

A more performant approach is to extend the builder object above with a setProperty(String, String) method that sets the values of the builder fields directly.

Any Java object can act as a map if you use introspection. Thus, you could do what the Bean API does and munge the key into the form “setX” by changing the case on the first letter of the key name, and then calling

this.getClass().getMethod()

You could also use field introspection like this:

this.getClass().getField(key)

Since you are using introspection, even though you are inside the class that should have access to private members, Java treats you as an outsider.  You can either drop permissions on the fields at compile time by making them public, or do so at runtime using one of various hacks.

This is one case where it makes sense to make the member fields public. There is no information hiding going on here. There may actually be some client code that is better off calling

builder.property = x

than

builder.setProperty(x)

In C++, we have fewer choices, as there is no way, at either run time or compile time, to iterate through the fields of a class. The best you can do is to create a map of functors. The keys of the map are again the fields of the builder; the values are functions which set the fields. You end up with a setProperty function on the builder that looks like:

void setProperty(String& key, String& value){

propsetmap[key](value);

}

although with logic to handle erroneous keys.
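A fuller sketch of that idea, including the handling of erroneous keys, might look like this. WidgetBuilder and its two fields are hypothetical, and std::function with lambdas stands in for hand-written functor classes.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

class WidgetBuilder {
public:
    // Public fields, as discussed above: there is nothing to hide on a builder.
    std::string name;
    std::string color;

    WidgetBuilder() {
        // Keys are the field names; values are functions that set the field.
        propsetmap["name"]  = [this](const std::string& v) { name = v; };
        propsetmap["color"] = [this](const std::string& v) { color = v; };
    }

    // The setters capture this, so copying the builder would leave the copy's
    // map pointing at the original object. Forbid copies to keep it honest.
    WidgetBuilder(const WidgetBuilder&) = delete;
    WidgetBuilder& operator=(const WidgetBuilder&) = delete;

    void setProperty(const std::string& key, const std::string& value) {
        auto it = propsetmap.find(key);
        if (it == propsetmap.end()) {
            throw std::invalid_argument("unknown property: " + key);
        }
        it->second(value);
    }

private:
    std::map<std::string, std::function<void(const std::string&)>> propsetmap;
};
```

The lookup uses find rather than operator[] because operator[] on a std::map silently inserts an empty entry for a missing key, and calling an empty std::function throws std::bad_function_call, which tells the caller nothing about which key was wrong.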

A builder is a short lived, single threaded, mutable object. It is up to the calling code to provide enough data to populate the builder. The builder pattern works nicely with inversion of control frameworks. Code that uses an object should not call the builder directly, but rather fetch the object from the framework, whereas code elsewhere builds the object and adds it to the container.

If your code has interspersed sets and gets of properties, it is going to be really difficult to introduce the builder.  Chances are you are going to want to do additional refactorings to separate the concerns in your code.

cpp-resolver

I’ve finally created my own open source project. I’ve taken the cpp resolver code and posted it on SourceForge. Let the bug reports commence!

http://sourceforge.net/projects/cpp-resolver/

I’ll probably copy the articles describing it over there at some point as well.

Context Map of an Application

Of all of the inversion of control containers I’ve come across, the one that most matches how I like to develop is PicoContainer. What I like best about it is that I can code in Java from start to finish. I don’t like switching to a different language in order to define my dependencies. Spring and JBoss have you define your dependencies in XML, which means that all of the Java tools know nothing about it, and javac can’t check your work. You don’t know until run time if you made a mistake.

One reason people like XML is it gives a place to look. You know that you are looking for the strategy used to create an object. The web.xml file provides you a starting point to say “Ah, they are using the struts servlet, let me look for the struts config XML file, and then….” Of course, this implies that you know servlets and struts. Coming at a project with no prior knowledge puts you into murkier waters.

An application has a dynamic and a static aspect to it. The dynamic aspect can be captured in a snapshot of the register state, the stack, the heap, and the open files. The static structure is traditionally seen as the code, but that view is a little limiting. Tools like UML and ER Diagrams give you a visual representation easier to digest. We need a comparable view for IofC.

Many applications have the structure of a directed acyclic graph. The servlet model has components that are scoped global, application, session, request, and page. Each tier of the component model lives a shorter lifetime than the next higher level. However, this general model only provides context in terms of HTTP, not in terms of your actual application. For instance, if you have a single page that has two forms, and wish to register two components that each represent a button, there is no way to distinguish which form a button is inside. Or, if an application has multiple databases, say one for user authentication and a different one for content, but both are registered as application scoped components, the programmer has to resort to naming the components in order to keep them separate.

While it is not uncommon to have multiple instances of the same class inside of a context scope, keeping the scope small allows the developer to use simple naming schemes to keep them distinct, and that naming scheme itself can make sense within the context of the application. For example, if an application reads from two files, one containing historical user data and one containing newly discovered user information, and performs a complex merge of that information into an output file, the three objects that represent the files can be named based on the expected content of the files as well as their role. If another portion of the application does something similar, but with product data, and the two parts really have little to no commonality of code, the file objects will end up getting the context as part of the registration name:

  • fetchHistoricalUserDataFile
  • fetchNewUserDataFile
  • fetchHistoricalProductDataFile
  • fetchNewProductDataFile

Note now that the application developer must be aware of the components registered elsewhere in the application to deconflict  names, and that we start depending on naming conventions, and other processes that inhibit progress and don’t scale.

We see a comparable idea in Java packages. I don’t have to worry about conflicting class names, so long as the two classes are in separate packages.

To define an application, then, each section should have a container.  The container should have a parent that determines the scope of resolution.  The application developer should be comfortable in defining new containers for new scopes.  Two things that need access to the same object need to be contained inside of descendants of the container of that dependency.
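A small C++ sketch of parent-scoped resolution, assuming components can be stood in for by strings. The Container class and its method names are hypothetical and are not taken from PicoContainer or any other framework.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical scope container: a name is resolved locally first, then in
// the parent, then in the parent's parent, and so on up to the root.
class Container {
public:
    explicit Container(const Container* parent = nullptr) : parent_(parent) {}

    void registerComponent(const std::string& name, const std::string& component) {
        components_[name] = component;
    }

    const std::string& resolve(const std::string& name) const {
        for (const Container* c = this; c != nullptr; c = c->parent_) {
            auto it = c->components_.find(name);
            if (it != c->components_.end()) {
                return it->second;
            }
        }
        throw std::out_of_range("no component named " + name);
    }

private:
    const Container* parent_;
    std::map<std::string, std::string> components_;
};
```

With one child container for the user merge and another for the product merge, both can register a component under the same short name, such as historicalDataFile, without conflict, while anything registered in their shared parent stays visible to both.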

A tool to make this much more manageable would produce a javadoc-like view of the application. It would iterate through each of the containers, from parent down the tree, and show what classes were registered, and under what names. This would provide a much simpler view of the overall application than traversing through XML files.

Dependency Collectors

Certain portions of an application function as a registration point, whether they are in the native language of the project or a configuration file read in. These files provide a valuable resource to the code spelunker. For instance, when starting to understand a Java web archive, the standard directory structure with WEB-INF/web.xml provides a very valuable starting point, just as when reading C code you can start with main. Dependency collectors are often an XML file, like struts-config.xml, or the startup portion of a servlet.

The concept in Inversion of Control is that you separate the creation policy of the object from the object itself, such that the two can be varied independently. Often, a project that otherwise does a decent job of cutting dependencies via IofC will build a dependency collector as a way to register all of the factories for the components. The XML files that Spring uses to define all of the control functions are dependency collectors just as surely as a C++ file with an endless Init function that calls “registerFactory” for each component in the inventory.

As you might be able to tell from my tone, I respect the usefulness of the dependency collector, but still feel that there is a mistake in design here. In C++, you can specify a chunk of code guaranteed to run before main that will initialize your factories, so the language provides support for IofC. In Java, classes can have static blocks, but this code only gets executed if the class file is somehow referenced, which means this is not a suitable mechanism for registering factories. The common approach of using XML and introspection for factory registration violates the principle of not postponing until runtime that which should be done at compile/link time.
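The C++ mechanism referred to here is usually a namespace-scope object whose constructor does the registration. The sketch below is an assumption about how that might look, not the code from any particular project: registry, Registrar, and the "greeter" factory are hypothetical names, and the factories return strings only to keep the example short.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

using Factory = std::function<std::string()>;

// The function-local static is created on first use, so registration does
// not depend on the initialization order of globals in other source files.
inline std::map<std::string, Factory>& registry() {
    static std::map<std::string, Factory> instance;
    return instance;
}

// Constructing a Registrar at namespace scope registers one factory.
struct Registrar {
    Registrar(const std::string& name, Factory factory) {
        registry()[name] = std::move(factory);
    }
};

// This object is initialized as part of program startup, so the factory is
// already registered by the time the body of main begins to run.
static Registrar registerGreeter("greeter", [] { return std::string("hello"); });
```

Each component can carry its own Registrar next to its implementation, so no central Init function has to list every component. One caveat worth checking in a real build: an object file inside a static library may be left out by the linker if nothing references it, in which case its Registrar never runs.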

So I give myself two goals. 1) To find a suitable Java based mechanism for registering factories and 2) to provide a method to compensate for the lack of orientation that a dependency collector provides.

Compile Time Dynamic Proxies in C++

These are my notes for compile time proxies generated from C++.  I’m not sure I will be able to understand them in the future, so good luck to you if you feel the need to read them.

Java Dynamic proxies are a well established means of reducing code by extracting a cross cutting concern. The C++ philosophy is more “Why put off to runtime that which can be performed at compile time.” How would we get the same kind of flexibility from C++ as we get from Java Dynamic proxies?

First, we would need a handful of helper classes that mimic the introspection API of Java. If we have the simple classes of Method, Field, Parameter, and Class, we can perform much of the logic we need. Refer to the Java reflection API to see roughly what these classes should contain and what they do.
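As a rough starting point, the helper classes could be plain data holders like the ones below. The member names and the meta namespace are assumptions made for illustration; a code generator would fill these in from the compiler's output.

```cpp
#include <string>
#include <vector>

namespace meta {

struct Parameter {
    std::string type;
    std::string name;
};

struct Field {
    std::string type;
    std::string name;
};

struct Method {
    std::string returnType;
    std::string name;
    std::vector<Parameter> parameters;
};

struct Class {
    std::string name;
    std::vector<Field> fields;
    std::vector<Method> methods;

    // Returns the first method with the given name, or nullptr if there is none.
    const Method* findMethod(const std::string& wanted) const {
        for (const Method& m : methods) {
            if (m.name == wanted) {
                return &m;
            }
        }
        return nullptr;
    }
};

}  // namespace meta
```

Unlike the Java classes, these carry no ability to invoke anything; they only describe the shape of a class so that generated code can be driven from them.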

Code generation is the obvious approach, and the lack of introspection in C++ makes abstract syntax tree analysis the only viable approach currently available. We can get all the information we require from g++ if we just ask nicely. For example, if we add the flag -fdump-translation-unit to g++ we get a file with the AST in an ultra-normalized form. Say I want to find all of the classes defined in the file generated when I compile ExampleTestCase.cpp. The file ExampleTestCase.cpp.t00.tu on line 2414 has:

@1086 identifier_node strg: ExampleTestCase lngt: 15

If we then search for what @1086 means:

adyoung@adyoung-devd$ grep -n "@1086 " ExampleTestCase.cpp.t00.tu

1749:@783 type_decl name: @1086 type: @554 srcp: ExampleTestCase.h:14
1762:@787 function_decl name: @1086 type: @1093 scpe: @554
2414:@1086 identifier_node strg: ExampleTestCase lngt: 15
4237:@1932 type_decl name: @1086 type: @554 scpe: @554
4242:@1935 function_decl name: @1086 mngl: @2450 type: @2451
28445:@13185 function_decl name: @1086 mngl: @14801 type: @14802
We see that this identifier is used in several places, but the two interesting ones are the type_decl lines, and they both refer to entry @554. Most likely the function declarations are something like the constructors. This is the data on that record:

@554    record_type      name: @783     size: @43      algn: 64
vfld: @784     base: @785     accs: priv
tag : struct   flds: @786     fncs: @787
binf: @788

It needs some prettying up, to get it all on one line, but other than that, it looks right. The big thing is the tag: struct that tells us this is a C struct. C++ must be forced to conform to C at some point, so classes become structs.
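The prettying up could be done with a small helper along these lines. It is a sketch under one assumption about the dump format, taken from the excerpts shown here: a new node starts on a line whose first character is '@', and any other non-empty line continues the previous node. The function names joinRecords and valueAfter are hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Joins the wrapped lines of a dump into one string per node.
std::vector<std::string> joinRecords(const std::string& dump) {
    std::vector<std::string> records;
    std::istringstream in(dump);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) {
            continue;
        }
        if (line[0] == '@' || records.empty()) {
            records.push_back(line);   // start of a new node
        } else {
            records.back() += ' ';     // continuation of the current node
            records.back() += line;
        }
    }
    return records;
}

// Returns the single token that follows a label such as "name:" or "tag :"
// in a joined record, or an empty string if the label is not present.
std::string valueAfter(const std::string& record, const std::string& label) {
    std::string::size_type pos = record.find(label);
    if (pos == std::string::npos) {
        return "";
    }
    pos = record.find_first_not_of(' ', pos + label.size());
    if (pos == std::string::npos) {
        return "";
    }
    std::string::size_type end = record.find(' ', pos);
    return record.substr(pos, end == std::string::npos ? end : end - pos);
}
```

The label has to be given exactly as printed, padding included, so the struct marker above is looked up as "tag :" rather than "tag:".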

Let’s take it even simpler.  If we make an empty C++ file, called empty.cpp and compile it with:

g++   -fdump-translation-unit   -c -o empty.o empty.cpp

we get a file with a lot of standard symbols defined:

grep identifier empty.cpp.001t.tu | wc -l
1215

If we add a single static variable, the venerable xyzzy, we can easily find it in the file:

adam@frenzy:~/devel/cpp/proxy$ echo "static int xyzzy;" >> xyzzy.cpp
adam@frenzy:~/devel/cpp/proxy$ g++   -fdump-translation-unit   -c -o xyzzy.o xyzzy.cpp
adam@frenzy:~/devel/cpp/proxy$ grep identifier  xyzzy.cpp.001t.tu | wc -l
1216

We’ve only added a single line, that looks like this:

@4      identifier_node  strg: xyzzy    lngt: 5

If we now add a Noop struct to that, we get a little bit more info:

adam@frenzy:~/devel/cpp/proxy$ echo "struct Noop{}; static int xyzzy;" >> Noop.cpp
adam@frenzy:~/devel/cpp/proxy$ make Noop.o
g++  -fdump-translation-unit    -c -o Noop.o Noop.cpp
adam@frenzy:~/devel/cpp/proxy$ grep identifier  Noop.cpp.001t.tu | wc -l
1217

Note that I’ve added -fdump-translation-unit  to the CPPFLAGS in a Makefile.

Each change has a significant effect on the resultant file:

adam@frenzy:~/devel/cpp/proxy$ wc -l Noop.cpp.001t.tu
6853 Noop.cpp.001t.tu
adam@frenzy:~/devel/cpp/proxy$ wc -l xyzzy.cpp.001t.tu
6845 xyzzy.cpp.001t.tu
adam@frenzy:~/devel/cpp/proxy$ wc -l empty.cpp.001t.tu
6841 empty.cpp.001t.tu

Because the symbol gets added early (@4) it bumps all of the other symbols in the file up one, so a diff would take a little parsing.  A visual inspection quickly shows that the following section has been added to xyzzy.cpp.001t.tu

@3      var_decl         name: @4       type: @5       srcp: xyzzy.cpp:1
chan: @6       link: static   size: @7
algn: 32       used: 0
@4      identifier_node  strg: xyzzy    lngt: 5
@5      integer_type     name: @8       size: @7       algn: 32
prec: 32       sign: signed   min : @9
max : @10

If we compare the two files based on the @ signs:

adam@frenzy:~/devel/cpp/proxy$ grep -- @ xyzzy.cpp.001t.tu | wc -l
4427
adam@frenzy:~/devel/cpp/proxy$ grep -- @ empty.cpp.001t.tu | wc -l
4424

We can see we have added three, which corresponds with what we have above.

Adding the empty struct as well brings the total to 10 more lines than the empty file, or 7 more than xyzzy.cpp:

adam@frenzy:~/devel/cpp/proxy$ grep -- @ Noop.cpp.001t.tu | wc -l
4434

To make it a little easier, I went in and put a line break after struct Noop{}; so now I can look for Noop.cpp:1 or Noop.cpp:2.

This seems to be the set of lines added for struct Noop:

@6      type_decl        name: @11      type: @12      srcp: Noop.cpp:1
note: artificial              chan: @13
@7      integer_cst      type: @14      low : 32
@8      type_decl        name: @15      type: @5       srcp: <built-in>:0
note: artificial
@9      integer_cst      type: @5       high: -1       low : -2147483648
@10     integer_cst      type: @5       low : 2147483647
@11     identifier_node  strg: Noop     lngt: 4
@12     record_type      name: @6       size: @16      algn: 8
tag : struct   flds: @17      binf: @18

Let’s see what happens if we add a field.

Here’s OneOp.cpp

struct OneOp{
int aaa;
};
static int xyzzy;

adam@frenzy:~/devel/cpp/proxy$ grep -- @ Noop.cpp.001t.tu | wc -l
4434
adam@frenzy:~/devel/cpp/proxy$ grep -- @ OneOp.cpp.001t.tu | wc -l
4439

We get another five lines.  Let’s see if this is linear.

adam@frenzy:~/devel/cpp/proxy$ grep -- @ TwoOp.cpp.001t.tu | wc -l
4444

adam@frenzy:~/devel/cpp/proxy$ grep -- @ ThreeOp.cpp.001t.tu | wc -l
4449

Let’s try a function now.

adam@frenzy:~/devel/cpp/proxy$ cat OneFunc.cpp
struct OneFunc{
int narf();
};
static int xyzzy;

adam@frenzy:~/devel/cpp/proxy$ grep -- @ OneOp.cpp.001t.tu | wc -l
4439
adam@frenzy:~/devel/cpp/proxy$ grep -- @ OneFunc.cpp.001t.tu | wc -l
4448

About double the info.

My next goal will be to diagram out the data structures we have here using UML.

Things look fairly straightforward in the deciphering until we get to function_type. There, we have a reference to retn which in this case happens to be a void, but could conceivably be any of the data types.

I have long since abandoned this approach, but may pick it back up again some day, so I will publish this and let the great crawlers out there make it available to some poor sap that wants to continue it. If you do so, please let me know.

Proxies in C++

The Proxy design pattern and Aspect Oriented Programming have the common goal of extracting cross cutting concerns from code and encapsulating them.  A cross cutting concern usually happens on a function boundary:  check security, object creation and so on.  Proxies allow you to make an object that mimics the interface of the called object, but which provides additional functionality.

For an inversion of control container, object dependency and object creation may follow two different policies. If object A needs an object of type B, that dependency should be initialized when object A is created. However, if creating object B is expensive, and object B is not always needed, object B should be created on demand. This approach is called “Lazy Load” and it is one of the types of proxies that the Gang of Four book enumerates.

Java provides a mechanism to make a proxy on the fly. The user of the proxy supplies an InvocationHandler, whose single function is called for every method on the proxied interface:

public Object invoke(Object proxy, Method m, Object[] args)
throws Throwable

Let’s define a C++ class as a pure abstract base class:

class Interface {
public:
virtual void action1(int i) = 0;
virtual void action2(int j) = 0;
};

And a class that implements that interface with some side effect.

class RealClass :public Interface {

int val;

public:

void action1(int i){val = i;}

void action2(int i){val = 333 * i;}

};

Then a Lazy Load Proxy would be defined like this:

typedef Interface* (*create_delegate_fn)();

class LazyLoadProxy : public Interface  {
create_delegate_fn fetcher;
Interface* delegate;
// Creates the real object the first time it is needed.
Interface* fetch(){
if (!delegate){
delegate = fetcher();
}
return delegate;
}
public:
LazyLoadProxy(create_delegate_fn create_delegate):
fetcher(create_delegate),
delegate(0)
{
}

virtual void action1(int i){
fetch()->action1(i);
}
virtual void action2(int j){
fetch()->action2(j);
}
};

This cannot be completely templatized, but a good portion of it can be abstracted away, leaving the compiler to check your work for the rest. If we want to tie this into our inversion of control framework, we need to make sure that the create_delegate has access to the same Zone used to create the Proxy object. Thus the Zone should be stored in a member variable of the Dynamic proxy. We should really tie this into the resolver.h code from previous posts, and pass the Zone along to be stored in the lazy load proxy. It is also likely that you will want the lazy load proxy to own the delegated item, so you may want to add a virtual destructor to the interface (always a good idea), and then delete the delegate in the destructor of the proxy. Here’s the templatized code:

#include <resolver.h>

template <typename T>  class LazyLoadProxy : public T  {
public:
typedef T* (*create_delegate_fn)(dependency::Zone&);

private:

T* (*fetcher)(dependency::Zone&);
T* delegate;
dependency::Zone& zone_;

protected:
T* fetch(){
if (!delegate){
delegate = (fetcher(zone_));
}
return delegate;
}
public:
LazyLoadProxy(dependency::Zone& zone,create_delegate_fn create_delegate):
zone_(zone),
delegate(0)
{
fetcher = create_delegate;
};

virtual ~LazyLoadProxy(){
if (delegate){
delete delegate;
}
}
};

And the code specific to creating and registering the Interface version of the LazyLoadProxy is:

class InterfaceLazy : public LazyLoadProxy<Interface>  {
public:
InterfaceLazy(dependency::Zone& zone, create_delegate_fn create_delegate):
LazyLoadProxy<Interface>(zone, create_delegate)
{
};

virtual void action1(int i){
fetch()->action1(i);
};
virtual void action2(int j){
fetch()->action2(j);
};
};

static Interface* createReal(dependency::Zone& zone){
return new RealClass;
}

static  Interface* createProxy(dependency::Zone& zone){
return new InterfaceLazy(zone, createReal);
}

DEPENDENCY_INITIALIZATION{
dependency::supply<Interface>::configure(0,createProxy);
return true;
}

Java dynamic proxies reduce the code for the proxy down to a single function that gets executed for each method on the public interface, with the assumption that any delegation will be done via the reflection API. C++ does not have a reflection API, so we can’t take that approach. If the C++ language were extended to allow the introspection of classes passed to a template, we could build a similar approach at compile time by providing a simple template function that gets expanded for each method of the abstract interface.

Dynamic proxies that are parameter agnostic are possible in C++, but are architecture specific, and depend on the parameter passing convention. I’m looking into this, and will publish what I find in a future article.

Move to Red Hat

Sometimes you can’t tell where you are headed. But, after a while, if you look back, you realize that you have been headed in a straight line exactly where you want to go. Such is the case, I find, with my current acceptance of an offer of employment at Red Hat.

Very shortly, I will take a position as a senior software engineer at Red Hat, in Westford, MA. I am on the team responsible for, amongst other things, Red Hat Satellite Server. This pulls together several trends in my career: Java, Linux, Systems Management, and JBoss. I look forward to posting lessons learned from this new venture.