How to extract from tgz, rpm, and deb files

The most common way to move a bundle of files around in Linux is a combination of tar (tape archive), which appends all of the files together into a single large file, and gzip, a compression utility. The result is often referred to as a tarball, and it will have an extension like tar.gz or tgz. Sometimes bzip2 is used as the compression utility, and the file will end with tar.bz2.

To see the list of files in the tarball for sage, run:

tar -ztf  sage-2.10.2-linux-ubuntu64-opteron-x86_64-Linux.tar.gz

To extract the files, create a target directory and change to it, then extract the file from its original location:

mkdir /tmp/sage

cd /tmp/sage

tar -zxf  ~/Desktop/sage-2.10.2-linux-ubuntu64-opteron-x86_64-Linux.tar.gz

All of the files will be put into the current directory. There is no rule that says that all the files in a tarball are under one subdirectory, so it really behooves you to do this in an empty directory. That way you know all of the files you see post extraction are files from the tarball.

Debian uses a package management system called dpkg that builds on this technology: a .deb file is an ar archive that wraps tarballs. The packages will end with .deb, but you can see what Linux thinks the file type is by using the file command. Here it is run on automake_1.10+nogfdl-1_all.deb:

 adyoung@adyoung-devd$ file ~/Desktop/automake_1.10+nogfdl-1_all.deb
/home/adyoung/Desktop/automake_1.10+nogfdl-1_all.deb: Debian binary package (format 2.0)

To see the list of included files, run:

dpkg --contents automake_1.10+nogfdl-1_all.deb

To extract, use the --extract command line parameter. Note that you have to supply the target directory as well.

dpkg --extract automake_1.10+nogfdl-1_all.deb /tmp/deb/

Again, make sure the target directory is empty to avoid intermingling your own files and files from the package.

The letters RPM stand for Red Hat Package Manager. The file extension rpm is used for packages of software designed to be installed on some distributions of GNU/Linux. RPM is used on Red Hat, SuSE, and distributions based on these two major distributions, such as Fedora, CentOS, and openSUSE. RPMs are shipped in a format called cpio. This format has the advantage of allowing longer file names, and providing storage and compression all in one utility and format.

However, RPMs are not exactly cpio format, and you have to run a converter first, before you can extract the files. This converter is called rpm2cpio. It takes the filename as the first command line parameter, and writes the cpio archive to standard output. So if you run it without redirecting output, you are going to spew binary data all over your terminal. Better to pipe it into the cpio utility, with the command line switches -di. These switches mean extract the files (-i), and build any subdirectories required (-d). Again, run this in a clean directory:

rpm2cpio ../testware-e.x.p-00000.i386.rpm | cpio -di

Don’t do Java bean properties in C++

The reason why Java has the bean API is to ensure that the caller does not delete the object when setting it.

void setX(String val){
    if (val == null) return;
    this.x = val;
}

This is not necessary in C++, since the parent object owns the memory of the child object. The equivalent C++ code would be simply:

o.x = val;

If you want to do validation (see my earlier posts about regex validation) use a subclass that does the validation in its constructor and assignment operator.
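As a rough sketch of that idea, the type below checks its input in the constructor and in the assignment operator, so the owning object can expose the member directly. The ZipCode and Address names and the five-digit pattern are invented for illustration. The type also wraps std::string instead of deriving from it, since std::string has no virtual destructor; that is my substitution, not something the earlier posts prescribe.

```cpp
#include <cassert>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical value type: a string that must be exactly five digits.
class ZipCode {
public:
    // Validate once, at construction.
    explicit ZipCode(const std::string& raw) : value_(checked(raw)) {}

    // Validate again whenever a raw string is assigned.
    ZipCode& operator=(const std::string& raw) {
        value_ = checked(raw);  // throws before value_ is modified
        return *this;
    }

    const std::string& str() const { return value_; }

private:
    // Returns raw unchanged if it matches, otherwise throws.
    static const std::string& checked(const std::string& raw) {
        static const std::regex pattern("[0-9]{5}");
        if (!std::regex_match(raw, pattern)) {
            throw std::invalid_argument("not a ZIP code: " + raw);
        }
        return raw;
    }

    std::string value_;
};

// The owner exposes the member directly; no getter/setter pair is needed.
struct Address {
    ZipCode zip;
};

// Usage: plain assignment, with the validation hidden inside the member type.
void zip_example() {
    Address home{ZipCode("02134")};
    home.zip = std::string("90210");
    assert(home.zip.str() == "90210");
}
```

An invalid assignment throws std::invalid_argument and leaves the previous value in place.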

C++ optimization for string16

Since wchar_t is 32 bits on Linux, I need to transform wstring to a different type in order to call the ODBC functions. The Windows code, on the other hand, can just use wstring's c_str() function to access the internal representation of the string. My goal is to minimize the code differences between platforms. On Linux, I will create a new class for string16 (see earlier post). On Windows, I will just typedef wstring to string16. My hope then is to get the OS-specific code down to the Linux implementation of string16, and to have the typedef optimize away the differences.
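To make the platform split concrete, here is a minimal sketch. It assumes a C++11 compiler and uses std::u16string as a stand-in for the Linux-only string16 class, which is not shown in this post; the to_string16 helper is likewise a hypothetical name. On Linux it re-encodes each 32-bit wchar_t as one or two 16-bit code units.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

#ifdef _WIN32
// wchar_t is already 16 bits here, so the typedef costs nothing.
typedef std::wstring string16;
#else
// Assumption: std::u16string stands in for the hand-written Linux class.
typedef std::u16string string16;
#endif

// Convert a wide string to 16-bit code units.
string16 to_string16(const std::wstring& in) {
#ifdef _WIN32
    return in;  // same type, nothing to convert
#else
    string16 out;
    out.reserve(in.size());
    for (wchar_t wc : in) {
        std::uint32_t cp = static_cast<std::uint32_t>(wc);
        if (cp >= 0x10000) {
            // Outside the 16-bit range: emit a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
#endif
}

// Usage: three plain characters become three 16-bit units.
void string16_example() {
    string16 s = to_string16(L"abc");
    assert(s.size() == 3);
}
```

Calling code only ever names string16, so the #ifdef stays confined to this one spot.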

Here is a simplified version of the code that will be built and run on Windows:

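#include <iostream>
#include <string>
using namespace std;

// On Windows, sqlstring is just another name for a wstring reference.
typedef wstring& sqlstring;

void dothing(wstring s){
sqlstring sql(s); // binds a reference to s; no copy is made
wcout << sql << endl;
}

void doanother(wstring s){
wcout << s << endl;
}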

And here is the result of building and disassembling with g++. Note that the generated code is the same for both functions. Hopefully the Windows C++ compiler will behave the same way.

0000000000400ae0 <_Z9doanotherSbIwSt11char_traitsIwESaIwEE>:
400ae0: 48 83 ec 08 sub $0x8,%rsp
400ae4: 48 89 fe mov %rdi,%rsi
400ae7: bf c0 12 60 00 mov $0x6012c0,%edi
400aec: e8 07 fe ff ff callq 4008f8 <_ZStlsIwSt11char_traitsIwESaIwEERSt13basic_ostreamIT_T0_ES7_RKSbIS4_S5_T1_E@plt>
400af1: 48 83 c4 08 add $0x8,%rsp
400af5: 48 89 c7 mov %rax,%rdi
400af8: e9 0b fe ff ff jmpq 400908 <_ZSt4endlIwSt11char_traitsIwEERSt13basic_ostreamIT_T0_ES6_@plt>
400afd: 90 nop
400afe: 66 90 xchg %ax,%ax

0000000000400b00 <_Z7dothingSbIwSt11char_traitsIwESaIwEE>:
400b00: 48 83 ec 08 sub $0x8,%rsp
400b04: 48 89 fe mov %rdi,%rsi
400b07: bf c0 12 60 00 mov $0x6012c0,%edi
400b0c: e8 e7 fd ff ff callq 4008f8 <_ZStlsIwSt11char_traitsIwESaIwEERSt13basic_ostreamIT_T0_ES7_RKSbIS4_S5_T1_E@plt>
400b11: 48 83 c4 08 add $0x8,%rsp
400b15: 48 89 c7 mov %rax,%rdi
400b18: e9 eb fd ff ff jmpq 400908 <_ZSt4endlIwSt11char_traitsIwEERSt13basic_ostreamIT_T0_ES6_@plt>
400b1d: 90 nop
400b1e: 66 90 xchg %ax,%ax

Reading C++ symbols in a binary

objdump -t will pull the raw symbols out of an ELF binary, but they come out in mangled form, like this:

0000000000000000       F *UND*  0000000000000006              _ZNSt13basic_ostreamIwSt11char_traitsIwEElsEPFRS2_S3_E@@GLIBCXX_3.4

c++filt will translate this to a human-readable string that maps to the original function definition:

 adyoung@adyoung-devd$ echo _ZNSt13basic_ostreamIwSt11char_traitsIwEElsEPFRS2_S3_E@@GLIBCXX_3.4 | c++filt
std::basic_ostream<wchar_t, std::char_traits<wchar_t> >::operator<<(std::basic_ostream<wchar_t, std::char_traits<wchar_t> >& (*)(std::basic_ostream<wchar_t, std::char_traits<wchar_t> >&))@@GLIBCXX_3.4

To do this for an entire file, you want to print only the last column, a task custom made for awk:

objdump -t casttest | awk '{print $(NF)}' | c++filt

Musings

Don’t hit publish on the blog when you just want to save a draft.

Big Builds are Bad. Software should be developed and distributed in small packages. Linux is successful due to things like apt, yum, and yast.

Interface Specifications need to be more specific.  Just saying that something is a string is not really helpful if that something needs to conform to a pattern.

Programming and blogging requires sugar in the brain.

Interviews are tricky…on both sides of the table. Career fairs are worse.

C++ Has a lot of magic in it. Can we make type level programming more transparent?

Microsoft purchasing Yahoo would be good for Google, but bad for just about everyone else.

Being a Dad is really cool. Even when it sucks, it is great. Sometimes kids refuse to go to sleep. This leads to sleep deprivation, but also leads to really wonderful moments in a rocking chair in the middle of the night.

Pool is a great Geek game. Lower left-hand English is neat.

Snowshoes are good off the trail. Not so good on the trail. If you're going on the trail, take the cross country skis. Snowmobiles smell funny.

New Hampshire winter weather is still as brutal today as it was when I left the area in the early ’90s.

It is hard to sing a Jazzy version of Old MacDonald had a Farm.  It is harder to do after the tenth repetition while trying to get a child to fall asleep.
If you listen to Children’s CDs long enough, you will develop favorite children’s songs. I like the hippo song.

Is there really a difference between the Ethernet and SCSI protocols? I don’t know, but it would be fun to find out.

The compiler is your friend. Let it check your work for you.

Why write code on a white board if you have a computer available? Especially if you have an overhead projector?

Where do the local peregrine falcons sleep? Where would they be sleeping if we hadn’t built up the whole area?

If I could have a redo on which language to take as a Sophomore, I probably would have liked to take Chinese. Russian and Arabic would also do. German was not a good choice for me.

If Bush Senior had insisted on pushing to Baghdad, it would have been my generation in this mess as opposed to the current set of junior officers. Instead of Haiti, I would have gone to Basra or something.

There are too many interesting topics in the world to pursue them all, or even a small fraction of them.

Every philosopher I’ve read, especially the ones I disagree with, has said something that is valuable and true.

No matter how old you are, when you get together with your parents, you revert to teenager status.

This list should never see the light of day.

Wrapping a boolean return value with an exception

I am currently working with a long block of code that uses && (logical and) to chain a long list of functions. The idea is to run all of the functions and short circuit if any of them fail. The problem is that there is no consistent mechanism for error reporting. If any of the functions fail, I have no way of knowing which one. This is my planned approach.

#include <iostream>
#include <stdexcept>
using namespace std;

bool invert(bool val){
return !val;
}

// Parenthesize X so that an argument such as a || b is negated as a whole.
#define attempt( X ){ if (!(X)){ throw runtime_error(#X);}}

int main(){
cout << "OK" << endl;
try{
attempt(invert(true));
}catch(runtime_error& re){
cout << "runtime_error exception:" << re.what() << endl;
}
return 0;
}
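Applied to a chain of calls, the approach might look like the sketch below. The three step functions are hypothetical placeholders for the real ones, and the macro is a variation on the one above, wrapped in do/while so that it behaves like a single statement.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Throw a runtime_error carrying the text of the failing expression.
#define ATTEMPT(X) do { if (!(X)) { throw std::runtime_error(#X); } } while (0)

// Hypothetical steps; the second one fails.
bool open_connection() { return true; }
bool load_schema() { return false; }
bool run_query() { return true; }

// Before: return open_connection() && load_schema() && run_query();
// After: the same short circuit, but the failure names its source.
std::string run_all() {
    try {
        ATTEMPT(open_connection());
        ATTEMPT(load_schema());
        ATTEMPT(run_query());
    } catch (const std::runtime_error& re) {
        return re.what();  // text of the step that returned false
    }
    return "";
}

// Usage: the returned text identifies the failing call.
void attempt_example() {
    assert(run_all() == "load_schema()");
}
```

The first throw skips every later step, so the behavior matches the original && chain, with the failing call recorded in what().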

Why virtual, why not

Using a non-virtual method in C++ is syntactic sugar: it doesn’t allow you to do anything you could not just as easily do by using a static member function and passing in the object instance as the first parameter. If you don’t fully qualify the name of the function, the type checking of the first parameter will make sure you call the right version, or at least give you a compilation error until you do fully qualify. The only reason you would need to use one is for template programming, where some classes have the method you wish to call as a virtual, and you want to be consistent across all classes passed to the template.

This is not to say that all non-virtual functions are bad, just an observation about what the language provides.
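A small sketch of the equivalence, with class names invented for the purpose: Counter offers the same operation both as a non-virtual member function and as a static member function that takes the instance explicitly, while Shape and Triangle show the one thing the static form cannot reproduce, which is dispatch on the runtime type.

```cpp
#include <cassert>

class Counter {
public:
    explicit Counter(int start) : count_(start) {}

    // Ordinary non-virtual member function.
    int next() { return ++count_; }

    // The same operation as a static member function;
    // the instance is passed in as the first parameter.
    static int next_of(Counter& self) { return ++self.count_; }

private:
    int count_;
};

struct Shape {
    virtual ~Shape() {}
    virtual int sides() const { return 0; }
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

// Usage: both call forms do the same work on Counter, and the
// virtual call picks Triangle::sides through a Shape reference.
void dispatch_example() {
    Counter c(0);
    assert(c.next() == 1);
    assert(Counter::next_of(c) == 2);

    Triangle t;
    const Shape& s = t;
    assert(s.sides() == 3);
}
```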

Integrated Development Environments

I understand now why so many people who have cut their teeth on C coding in Unix/Linux hate C++. A coder can get her job done in C without needing significant tool support. An editor, a build toolchain, and limited reverse engineering support à la objdump and you are good to go. Tag support makes navigation a lot easier, and suddenly, you have decent productivity.

When I started in the professional world, I learned C++ using Microsoft Visual Studio. I had, prior to then, been a big fan of Borland for Pascal, and had even purchased a copy of Borland C++. But, unlike the MS product, it did not then have an integrated debugger, and my learning curve was quite steep. MSVC took me through the steps, provided decent navigation, and easy-to-use wizards for MS technology. I did have the frustration of trying to work through libraries without having the source code, but this wasn’t the VC developers’ fault.

When I left the MS world for the Java world, the first thing I noticed was the lack of an IDE. I had been so productive with all of the tools integrated that it took me a while to get used to coding in Emacs. I had long since learned the value of stepping through code after I wrote it to make sure it was doing what I expected. Using gdb, while effective, was much slower than an integrated debugger. It wasn’t until I got my hands on IntelliJ IDEA that I felt a return to the productivity I had under MSVC.

When refactoring support hit in Eclipse, I was stunned at the ease of maintaining code. I can’t say enough about this, and may try to recreate some of my earlier writings on Refactoring in order to show how valuable the approach can be. Suffice it to say, it is now a go-to tool for coding that I miss when forced to do without.

Now I am trying to figure out how best to do C++ development on Linux. etags just does not handle context well enough, and Cscope’s navigation is fairly clunky. I’ve got a demo of Visual SlickEdit, and the Eclipse platform with C/C++ support built in, but I have yet to find a smooth working environment. My problem comes from dealing with an alien code base, tightly coupled dependencies and byzantine makefiles that I haven’t had time to spelunk. I’ve not been able to get automated refactoring support to work and even basic navigation is problematic. As I work out the solution, I will record my approaches here so that I will be able to recreate them in the future.