The examples in The Art of Computer Programming (TAOCP) are in the MIXAL programming language. In order to see these examples run, I want to install the tools on my Fedora box. They are packaged as RPMS, so this is trivial. Here are the steps to run and debug a sample program in MIXAL.
First, install the MIXAL Development Kit, Or MDK.
That has some binaries that we will need to run,
rpmquery --list mdk
/usr/bin/gmixvm
/usr/bin/mixasm
/usr/bin/mixguile
/usr/bin/mixvm |
rpmquery --list mdk
/usr/bin/gmixvm
/usr/bin/mixasm
/usr/bin/mixguile
/usr/bin/mixvm
In proper fashion, we start with Hello World, courtesy of the Gnu docs: I won’t reproduce it here, but I will mention that you want to remove the comment after the HLT command.
[ayoung@ayoung-home mix]$ mixasm hello.mixal
[ayoung@ayoung-home mix]$ mixvm hello.mix
Program loaded. Start address: 3000
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /usr/share/mdk/mixguile-commands.scm
;;; compiled /home/ayoung/.cache/guile/ccache/2.2-LE-8-3.A/usr/share/mdk/mixguile-commands.scm.go
;;; compiling /usr/share/mdk/mixguile-vm-stat.scm
;;; compiled /home/ayoung/.cache/guile/ccache/2.2-LE-8-3.A/usr/share/mdk/mixguile-vm-stat.scm.go
MIX> run
Running ...
MIXAL HELLO WORLD
... done
Elapsed time: 11 /Total program time: 11 (Total uptime: 11)
MIX> |
[ayoung@ayoung-home mix]$ mixasm hello.mixal
[ayoung@ayoung-home mix]$ mixvm hello.mix
Program loaded. Start address: 3000
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /usr/share/mdk/mixguile-commands.scm
;;; compiled /home/ayoung/.cache/guile/ccache/2.2-LE-8-3.A/usr/share/mdk/mixguile-commands.scm.go
;;; compiling /usr/share/mdk/mixguile-vm-stat.scm
;;; compiled /home/ayoung/.cache/guile/ccache/2.2-LE-8-3.A/usr/share/mdk/mixguile-vm-stat.scm.go
MIX> run
Running ...
MIXAL HELLO WORLD
... done
Elapsed time: 11 /Total program time: 11 (Total uptime: 11)
MIX>
The below code changes the message from Hello World to the Alphabet in reverse order. This is in order to come up with something that we can sort, and test out the sorting and searching algorithms later in TAOCP
TERM EQU 19 the MIX console device number
ORIG 3000 start address
START OUT INPUT(TERM) output data at address MSG
HLT
INPUT ALF "ZYXWV"
ALF "UTSRQ"
ALF "PONML"
ALF "KJIHG"
ALF "FEDCB"
ALF "A "
END START end of the program |
TERM EQU 19 the MIX console device number
ORIG 3000 start address
START OUT INPUT(TERM) output data at address MSG
HLT
INPUT ALF "ZYXWV"
ALF "UTSRQ"
ALF "PONML"
ALF "KJIHG"
ALF "FEDCB"
ALF "A "
END START end of the program
[ayoung@ayoung-home mix]$ mixasm morph.mixal
[ayoung@ayoung-home mix]$ mixvm morph.mix
Program loaded. Start address: 3000
MIX> run
Running ...
ZYXWVUTSRQPONMLKJIHGFEDCBA
... done
Elapsed time: 11 /Total program time: 11 (Total uptime: 11) |
[ayoung@ayoung-home mix]$ mixasm morph.mixal
[ayoung@ayoung-home mix]$ mixvm morph.mix
Program loaded. Start address: 3000
MIX> run
Running ...
ZYXWVUTSRQPONMLKJIHGFEDCBA
... done
Elapsed time: 11 /Total program time: 11 (Total uptime: 11)
Now, let’s see if we can change a value in the output buffer. Lets’ start by loading a value into a buffer. I’ll just show the modified portion between START and INPUT:
START LD1 =1=
OUT INPUT(TERM) output data at address MSG
HLT
INPUT ALF "ZYXWV" |
START LD1 =1=
OUT INPUT(TERM) output data at address MSG
HLT
INPUT ALF "ZYXWV"
And what does this look like in output?
Program loaded. Start address: 3000
MIX> run
Running ...
ZYXWVUTSRQPONMLKJIHGFEDCBA A
... done |
Program loaded. Start address: 3000
MIX> run
Running ...
ZYXWVUTSRQPONMLKJIHGFEDCBA A
... done
Hmm. Where did that stray A come from? My guess is that it is the value I loaded into register 1. Let’s try a different value.
START LD1 =2=
OUT INPUT(TERM) output data at address MSG
HLT |
START LD1 =2=
OUT INPUT(TERM) output data at address MSG
HLT
MIX> run
Running ...
ZYXWVUTSRQPONMLKJIHGFEDCBA B
... done |
MIX> run
Running ...
ZYXWVUTSRQPONMLKJIHGFEDCBA B
... done
So loading a value into the 1 register affects the output. What about the 2 register?
START LD1 =1=
LD2 =2=
LD3 =3=
OUT INPUT(TERM) output data at address MSG |
START LD1 =1=
LD2 =2=
LD3 =3=
OUT INPUT(TERM) output data at address MSG
MIX> run
Running ...
ZYXWVUTSRQPONMLKJIHGFEDCBA A C B |
MIX> run
Running ...
ZYXWVUTSRQPONMLKJIHGFEDCBA A C B
They are out of order, but the data is still there.
Can we load in a value out of the INPUT buffer instead?
START ENT1 INPUT
OUT INPUT(TERM) output data at address MSG
HLT |
START ENT1 INPUT
OUT INPUT(TERM) output data at address MSG
HLT
Program loaded. Start address: 3000
MIX> run
Running ...
ZYXWVUTSRQPONMLKJIHGFEDCBA
... done |
Program loaded. Start address: 3000
MIX> run
Running ...
ZYXWVUTSRQPONMLKJIHGFEDCBA
... done
That seems to work, and no additional output. What if we do the same thing, but from somwhere else in the buffer? And store it at teh start? THat will show what gets moved where.
START ENT1 INPUT+1
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
HLT |
START ENT1 INPUT+1
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
HLT
But when I try to compile it:
[ayoung@ayoung-home mix]$ mixasm morph.mixal
morph.mixal:7: error: undefined symbol: INPUT
(0 warning(s), 1 error(s)) |
[ayoung@ayoung-home mix]$ mixasm morph.mixal
morph.mixal:7: error: undefined symbol: INPUT
(0 warning(s), 1 error(s))
I wonder if this is becuase the variable INPUT is not defined yet, and that the Load and Store operators are implmented differently. Let me see if I can define that variable before the ORIG command.
TERM EQU 19 the MIX console device number
INPUT ALF "ZYXWV"
ALF "UTSRQ"
ALF "PONML"
ALF "KJIHG"
ALF "FEDCB"
ALF "A "
ORIG 3000 start address
START ENT1 INPUT+1
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
HLT
END START end of the program |
TERM EQU 19 the MIX console device number
INPUT ALF "ZYXWV"
ALF "UTSRQ"
ALF "PONML"
ALF "KJIHG"
ALF "FEDCB"
ALF "A "
ORIG 3000 start address
START ENT1 INPUT+1
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
HLT
END START end of the program
That seems to compile correctly. When I run it:
Program loaded. Start address: 3000
MIX> run
Running ...
AUTSRQPONMLKJIHGFEDCBA
... done |
Program loaded. Start address: 3000
MIX> run
Running ...
AUTSRQPONMLKJIHGFEDCBA
... done
It seems that the start of the string is padded with blanks, and then the A character. Lets see if we can set this section of the buffer differently.
START LD1 =1=
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
HLT |
START LD1 =1=
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
HLT
MIX> run
Running ...
AUTSRQPONMLKJIHGFEDCBA
... done |
MIX> run
Running ...
AUTSRQPONMLKJIHGFEDCBA
... done
If 1 sets it to A, Maybe 2 would be B, and 5 would be E?
START LD1 =9=
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
HLT |
START LD1 =9=
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
HLT
Program loaded. Start address: 3000
MIX> run
Running ...
IUTSRQPONMLKJIHGFEDCBA
... done |
Program loaded. Start address: 3000
MIX> run
Running ...
IUTSRQPONMLKJIHGFEDCBA
... done
But when we get to 10, something strange happens.
START LD1 =10=
ST1 INPUT
OUT INPUT(TERM) output data at address MSG |
START LD1 =10=
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
MIX> run
Running ...
~UTSRQPONMLKJIHGFEDCBA
... done |
MIX> run
Running ...
~UTSRQPONMLKJIHGFEDCBA
... done
Is the 10 not loaded as a deciaml 10? What if we try Hex?
START LD1 =A=
ST1 INPUT
OUT INPUT(TERM) output data at address MSG |
START LD1 =A=
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
morph.mixal:13: error: undefined symbol: A
morph.mixal:13: error: invalid address field: =
(0 warning(s), 2 error(s)) |
morph.mixal:13: error: undefined symbol: A
morph.mixal:13: error: invalid address field: =
(0 warning(s), 2 error(s))
What if we set it to 9 and then add 1?
START LD1 =9=
INC1 1
ST1 INPUT
OUT INPUT(TERM) output data at address MSG |
START LD1 =9=
INC1 1
ST1 INPUT
OUT INPUT(TERM) output data at address MSG
MIX> run
Running ...
~UTSRQPONMLKJIHGFEDCBA
... done
Elapsed time: 16 /Total program time: 16 (Total uptime: 16) |
MIX> run
Running ...
~UTSRQPONMLKJIHGFEDCBA
... done
Elapsed time: 16 /Total program time: 16 (Total uptime: 16)
Same thing. OK, we don’t know how this work. What does the mixvm give us for inspection?
MIX> psym INPUT
+ 00 00 00 00 00 (0000000000) |
MIX> psym INPUT
+ 00 00 00 00 00 (0000000000)
I think…that the symbol INPUT is actually the Value 0, and the Data we want is stored at the address 0. If We define a variable before INPUT, maybe it will get bumped up higher?
Load and Store operations operate on full words: 4 bytes. But Register 1 can only store 2 bytes. What if we use the accumulator. The following program reads from INPUT into the accumulator, then stores the value into the OUTPUT buffer.
TERM EQU 19 the MIX console device number
OUTPUT EQU 2000
INPUT ALF "ZYXWV"
ALF "UTSRQ"
ALF "PONML"
ALF "KJIHG"
ALF "FEDCB"
ALF "A "
ORIG 3000 start address
START LDA INPUT
STA OUTPUT
OUT OUTPUT(TERM) output data at address OUTPUT
HLT
END START end of the program |
TERM EQU 19 the MIX console device number
OUTPUT EQU 2000
INPUT ALF "ZYXWV"
ALF "UTSRQ"
ALF "PONML"
ALF "KJIHG"
ALF "FEDCB"
ALF "A "
ORIG 3000 start address
START LDA INPUT
STA OUTPUT
OUT OUTPUT(TERM) output data at address OUTPUT
HLT
END START end of the program
And produced the following output:
MIX> run
Running ...
ZYXWV
... done |
MIX> run
Running ...
ZYXWV
... done
So, if we do the math correctly, we should be able to grab another set of 4 letters from the input buffer.
START LDA INPUT+3
STA OUTPUT
OUT OUTPUT(TERM) |
START LDA INPUT+3
STA OUTPUT
OUT OUTPUT(TERM)
MIX> run
Running ...
KJIHG
... done |
MIX> run
Running ...
KJIHG
... done
So…we need to use the right form of Load and Store if we want to operate on a single byte or two bytes. If we try the exact same code, but with Register 1:
START LD1 INPUT+3
ST1 OUTPUT
OUT OUTPUT(TERM) output data at address INPUT |
START LD1 INPUT+3
ST1 OUTPUT
OUT OUTPUT(TERM) output data at address INPUT
MIX> run
Running ...
HG
... done |
MIX> run
Running ...
HG
... done
You can see that the first two bytes get dropped on load, and then when we write to the buffer, they are still left as blanks, nulled out from register 1 not being that long. Lets try to grab just the first byte:
START LD1 INPUT+3(0:1)
ST1 OUTPUT
OUT OUTPUT(TERM) |
START LD1 INPUT+3(0:1)
ST1 OUTPUT
OUT OUTPUT(TERM)
MIX> run
Running ...
K
... done |
MIX> run
Running ...
K
... done
SCORE! The K is the same first character we got when we ran using the Accumulator. Lets see, now, if we can stick that value into the first byt of the buffer.
START LD1 INPUT+3(0:1)
ST1 OUTPUT(0:1)
OUT OUTPUT(TERM)
HLT |
START LD1 INPUT+3(0:1)
ST1 OUTPUT(0:1)
OUT OUTPUT(TERM)
HLT
MIX> run
Running ...
K
... done |
MIX> run
Running ...
K
... done
What is going on in the registers? Using the pall command in mixvm we can see. Before I execute anything, you can see that all of the registers are zeroed out.
[ayoung@ayoung-home mix]$ mixvm morph.mix
Program loaded. Start address: 3000
MIX> pc
Current address: 3000
MIX> pall
rA: + 00 00 00 00 00 (0000000000)
rX: + 00 00 00 00 00 (0000000000)
rJ: + 00 00 (0000)
rI1: + 00 00 (0000) rI2: + 00 00 (0000)
rI3: + 00 00 (0000) rI4: + 00 00 (0000)
rI5: + 00 00 (0000) rI6: + 00 00 (0000)
Overflow: F
Cmp: E |
[ayoung@ayoung-home mix]$ mixvm morph.mix
Program loaded. Start address: 3000
MIX> pc
Current address: 3000
MIX> pall
rA: + 00 00 00 00 00 (0000000000)
rX: + 00 00 00 00 00 (0000000000)
rJ: + 00 00 (0000)
rI1: + 00 00 (0000) rI2: + 00 00 (0000)
rI3: + 00 00 (0000) rI4: + 00 00 (0000)
rI5: + 00 00 (0000) rI6: + 00 00 (0000)
Overflow: F
Cmp: E
I can use strace on to track execution.
Once I execute the line LD1 INPUT(3:5)
we get
MIX> next
3001: [LD1 0,0(3:5) ] LD1 INPUT(3:5)
Elapsed time: 2 /Total program time: 3 (Total uptime: 3)
MIX> pall
rA: + 00 00 00 00 00 (0000000000)
rX: + 00 00 00 00 00 (0000000000)
rJ: + 00 00 (0000)
rI1: + 26 25 (1689) rI2: + 00 00 (0000)
rI3: + 00 00 (0000) rI4: + 00 00 (0000)
rI5: + 00 00 (0000) rI6: + 00 00 (0000)
Overflow: F
Cmp: E |
MIX> next
3001: [LD1 0,0(3:5) ] LD1 INPUT(3:5)
Elapsed time: 2 /Total program time: 3 (Total uptime: 3)
MIX> pall
rA: + 00 00 00 00 00 (0000000000)
rX: + 00 00 00 00 00 (0000000000)
rJ: + 00 00 (0000)
rI1: + 26 25 (1689) rI2: + 00 00 (0000)
rI3: + 00 00 (0000) rI4: + 00 00 (0000)
rI5: + 00 00 (0000) rI6: + 00 00 (0000)
Overflow: F
Cmp: E
Can we load those two bytes in separately? Where do they go in the Register? (I’ll stop showing the assembly, as it shows up in the trace)
Program loaded. Start address: 3000
MIX> next
3000: [OUT 0,0(2:3) ] START OUT INPUT(TERM)
ZYXWVUTSRQPONMLKJIHGFEDCBA
Elapsed time: 1 /Total program time: 1 (Total uptime: 1)
MIX> preg I1
rI1: + 00 00 (0000)
MIX> next
3001: [LD1 0,0(3:3) ] LD1 INPUT(3:3)
Elapsed time: 2 /Total program time: 3 (Total uptime: 3)
MIX> preg I1
rI1: + 00 27 (0027)
MIX> next
3002: [LD1 0,0(4:4) ] LD1 INPUT(4:4)
Elapsed time: 2 /Total program time: 5 (Total uptime: 5)
MIX> preg I1
rI1: + 00 26 (0026) |
Program loaded. Start address: 3000
MIX> next
3000: [OUT 0,0(2:3) ] START OUT INPUT(TERM)
ZYXWVUTSRQPONMLKJIHGFEDCBA
Elapsed time: 1 /Total program time: 1 (Total uptime: 1)
MIX> preg I1
rI1: + 00 00 (0000)
MIX> next
3001: [LD1 0,0(3:3) ] LD1 INPUT(3:3)
Elapsed time: 2 /Total program time: 3 (Total uptime: 3)
MIX> preg I1
rI1: + 00 27 (0027)
MIX> next
3002: [LD1 0,0(4:4) ] LD1 INPUT(4:4)
Elapsed time: 2 /Total program time: 5 (Total uptime: 5)
MIX> preg I1
rI1: + 00 26 (0026)
And if we load the two value together:
MIX> next
3001: [LD1 0,0(3:4) ] LD1 INPUT(3:4)
Elapsed time: 2 /Total program time: 3 (Total uptime: 3)
MIX> preg I1
rI1: + 27 26 (1754) |
MIX> next
3001: [LD1 0,0(3:4) ] LD1 INPUT(3:4)
Elapsed time: 2 /Total program time: 3 (Total uptime: 3)
MIX> preg I1
rI1: + 27 26 (1754)
So we can grab a character at a time or two bytes it we want. What about storing them in other than the first byte of the output location? I think the I field is what we need. But you can see from the following debug session that the commands
ST1 OUTPUT(0:1)
ST1 OUTPUT,1(0:1)
ST1 OUTPUT,2(0:1)
All have the same effect.
3001: [LD1 0,0(0:1) ] LD1 INPUT(0:1)
Elapsed time: 2 /Total program time: 3 (Total uptime: 3)
MIX> pall
rA: + 00 00 00 00 00 (0000000000)
rX: + 00 00 00 00 00 (0000000000)
rJ: + 00 00 (0000)
rI1: + 00 29 (0029) rI2: + 00 00 (0000)
rI3: + 00 00 (0000) rI4: + 00 00 (0000)
rI5: + 00 00 (0000) rI6: + 00 00 (0000)
Overflow: F
Cmp: E
MIX> next
3002: [ST1 2000,0(0:1)] ST1 OUTPUT(0:1)
Elapsed time: 2 /Total program time: 5 (Total uptime: 5)
MIX> pmem 2000
2000: + 29 00 00 00 00 (0486539264)
MIX> next
3003: [ST1 2000,1(0:1)] ST1 OUTPUT,1(0:1)
Elapsed time: 2 /Total program time: 7 (Total uptime: 7)
MIX> pmem 2000
2000: + 29 00 00 00 00 (0486539264)
MIX> next
3004: [ST1 2000,2(0:1)] ST1 OUTPUT,2(0:1)
Elapsed time: 2 /Total program time: 9 (Total uptime: 9)
MIX> pmem 2000
2000: + 29 00 00 00 00 (0486539264) |
3001: [LD1 0,0(0:1) ] LD1 INPUT(0:1)
Elapsed time: 2 /Total program time: 3 (Total uptime: 3)
MIX> pall
rA: + 00 00 00 00 00 (0000000000)
rX: + 00 00 00 00 00 (0000000000)
rJ: + 00 00 (0000)
rI1: + 00 29 (0029) rI2: + 00 00 (0000)
rI3: + 00 00 (0000) rI4: + 00 00 (0000)
rI5: + 00 00 (0000) rI6: + 00 00 (0000)
Overflow: F
Cmp: E
MIX> next
3002: [ST1 2000,0(0:1)] ST1 OUTPUT(0:1)
Elapsed time: 2 /Total program time: 5 (Total uptime: 5)
MIX> pmem 2000
2000: + 29 00 00 00 00 (0486539264)
MIX> next
3003: [ST1 2000,1(0:1)] ST1 OUTPUT,1(0:1)
Elapsed time: 2 /Total program time: 7 (Total uptime: 7)
MIX> pmem 2000
2000: + 29 00 00 00 00 (0486539264)
MIX> next
3004: [ST1 2000,2(0:1)] ST1 OUTPUT,2(0:1)
Elapsed time: 2 /Total program time: 9 (Total uptime: 9)
MIX> pmem 2000
2000: + 29 00 00 00 00 (0486539264)