Splitting the irealpro 1400

The irealpro player has become an indispensable tool for practicing a tune. Since I play saxophone, the fact that irealpro gives me a full rhythm section to play against makes working on a tune far more practical than it used to be.

I recently got a new phone and went to reinstall the set of songs that I use. They come from a collection called the Jazz 1400. The collection has crept up slightly in size since the last time I imported it, and it must have crossed a threshold, because my phone now refuses to import it.

The collection is posted as a URL, and not as a pointer to a resource: the URL itself is the resource. It has all of the music embedded in it. I’m actually familiar with this format, as I’ve put a few of my tunes in it. For example, here’s a tune I wrote called “Londontown Rain” in irealpro URL format:

irealbook://Londontown Rain=Young Adam=UpTempo Swing=D=n=[T44*A|D  G Bh7 |D  |Bh7 E7b9 |F#- |G- C7 |A- D7 |G7 F7 |B-7   |G7 F#7 |D    ]

The Jazz 1400 list is a concatenated set of songs in the ireal format which has been passed through a URL-safe encoding. It is over a million characters long. My phone cannot parse a URL this long; apparently iPhones can, so those users don’t have this problem.

The first step is to split the file into lines. It took me a little trial and error to figure out the delimiter (I’ve only ever done one song at a time), but it turns out to be easy: it is ===, and you don’t even need to urldecode the file to find it. So, with this one line, I can split the collection into one line per song in a single file, and make it legible:

cat irealpro-1400.url | sed 's!irealb://!!; s!===!===\n!g' | while IFS= read -r aline ; do urlencode -d "$aline" ; done > irealpro-1400.url.split

The urlencode step is only necessary if you want to read the song titles. On Ubuntu, that binary comes from the gridsite-clients package.
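If the urlencode binary isn't available, the decoding can be approximated in bash itself. This is a minimal sketch and not what I actually ran: urldecode is a made-up function name, it relies on bash's pattern substitution and on the bash printf builtin expanding \xHH escapes under %b, and it assumes the input contains no literal backslashes.

```shell
# Decode percent-encoded text in bash, without an external binary.
# urldecode is a hypothetical helper name, not part of any package.
urldecode() {
    local encoded="$1"
    # Turn every %XX into \xXX, then let printf's %b expand the escapes.
    printf '%b\n' "${encoded//%/\\x}"
}
```

Inside the while loop, a call to urldecode with the quoted line as its argument would take the place of the urlencode -d call.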

Once the one-liner has run, the split file looks like this:

irealb://9.20 Special=Warren Earl==Medium Swing=C==1r34LbKcu7bB,7B4D9,XQyX,C|QyX6-F|QXy,9D|QyX,6-F|Qy|sC7,4TA*{ ,7G|N1lD9Dl2NZL QyXQyX}G7,7bAs ,7G|QyX,9,XyQ|7A,7KQyX,*BC7,lcKQyX,7DZL lcQKyX,6FZL lcKQyX LZG7[] 6C7B,7C[*AD9,C|QyX,6-F|QyX9,D|QyX,6-F|QyX,XyQ|s]  lc,Bb7,A7|lD9,XyQ|G7, C6 Z ==0=0===
26-2=Coltrane John==Medium Up Swing=F==1r34LbKcu7ZL7bD4F^7 ZL7F 7-CZL7C 7A^ZL7E 7^bDZL7bABb^7 4T[A* 7^AZA7LZD^bDZL7bA 7^F[A]* 7C 7-GZL7G 7-7 E7L 7^bGC[B*]-7 F7FZL7C 7^AZL7E ^7bDZL7bA 7^bBZL^7XyQCZL7C7^bD|LZE-7A|QyX7-bE|QyX7b^BZL7F 7^DZL7A b7XyQ7F 7-BZL7F-7 C7L7C 7^AZL7E 7^DbZL7bA 7^F[A*] ZC-7 G|QyXb^7 Ab7LZDb^7 E7LZA^7 C7LZF^7   Z==0=0===
52nd Street Theme=Monk Thelonious==Up Tempo Swing=C==1r34LbKcu7L7G 74C A--A CZL7G 7-DZL-7A CZL7G 7-DZL77LZD-4TA*{ZL lcLZCXy7DZL lcKQyX6FZ LlcKQyX,7CB*[}Q,XyQK7G CZ7-A CKcl  7-DZL7-A CZL7G7 -DZL7-A ,CA*[] G7LZQyX7GLZD-7 G7LZC G7LZCXyQZ ==0=0===
500 Miles High=Corea Chick==Bossa Nova=E-==1r34LbKcu77E|Qy-7XyQL lcKQyX7^bBZLl cKQyX7-GZL lcKZBh7XE44T[QyX7-|A-7XlcKQyX7-FZL lcQKyX7h#FZL lcKQy QLZCQyX9#KQyX7ZB7#9 lcKQyX7-CQ{Y Q yXQyXZ  lcKQyXLZAb^L lcKcl  }==0=0===
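With one decoded song per line, picking a split point is easier from a numbered list of titles. Judging only from the sample lines above, the title and composer look like the first two =-separated fields. That is an assumption read off the sample, not taken from a format spec, and list_songs is a name invented for this sketch.

```shell
# Print "line-number<TAB>title<TAB>composer" for each song in a decoded split file.
# list_songs is a hypothetical helper; the field positions are guessed from
# the sample lines, not from any documentation of the format.
list_songs() {
    # NF > 1 skips blank lines and anything without an "=" separator
    awk -F'=' 'NF > 1 { printf "%d\t%s\t%s\n", NR, $1, $2 }' "$1"
}
```

Something like list_songs irealpro-1400.url.split | grep -i coltrane then shows which line numbers a given composer's tunes landed on. If the first line still carries its irealb:// prefix, that prefix will show up as part of its title.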

To select a subset of the songs, I use awk and compare its NR variable (the current line number) to a threshold value.

cat irealpro-1400.url.split | awk 'NR<700  {print $0 > "split.a" } NR>=700 {print $0 > "split.b"}'
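The same NR comparison generalizes to more than two pieces, in case halves are still too long for a given phone. Here is a rough sketch, where split_chunks, the chunk size, and the output naming are all illustrative choices of mine:

```shell
# Split a one-song-per-line file into numbered chunks of at most `size` songs.
# Usage: split_chunks FILE SIZE PREFIX   (hypothetical helper for this sketch)
split_chunks() {
    awk -v size="$2" -v prefix="$3" '
        {
            # Lines 1..size go to PREFIX.00, the next `size` lines to PREFIX.01, ...
            out = sprintf("%s.%02d", prefix, int((NR - 1) / size))
            # Close the previous chunk once we move on to a new one
            if (out != prev) { if (prev != "") close(prev); prev = out }
            print > out
        }' "$1"
}
```

For example, split_chunks irealpro-1400.url.split 500 split would write split.00, split.01 and split.02 for a file of about 1400 songs.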

Since we don’t really need to do the urlencoding/decoding for splitting the file, we can get this whole thing into a single line with no temporary files.

cat  irealpro-1400.url  | sed 's!irealb://!!;  s!===!===\n!g' |  awk 'BEGIN{ printf "irealb://" > "irealpro-1400.0-699.url" ;   printf "irealb://" > "irealpro-1400.700-1400.url"    }        NR<700  {printf "%s", $0 >> "irealpro-1400.0-699.url" } NR>=700 {printf "%s", $0 >> "irealpro-1400.700-1400.url"}'
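A quick way to gain some confidence that nothing was lost is to glue the two output files back together and compare the result with the original. This is only a sketch: check_halves is a name I made up, and it ignores newlines on both sides, since the split drops any trailing newline the original file had.

```shell
# Confirm that two split playlists together contain exactly the original data.
# Usage: check_halves ORIGINAL FIRST SECOND   (hypothetical helper)
check_halves() {
    # Checksum of the original with any newlines removed
    expected=$(tr -d '\n' < "$1" | cksum)
    # First half as-is, second half minus its re-added irealb:// prefix
    actual=$( { cat "$2"; sed 's!^irealb://!!' "$3"; } | tr -d '\n' | cksum )
    [ "$expected" = "$actual" ]
}
```

It returns success when the pieces match, so it can be used as check_halves irealpro-1400.url irealpro-1400.0-699.url irealpro-1400.700-1400.url && echo "halves match".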

Here’s what the one-liner produces, and what you probably came here for.