Vector Multiplication Using the Neon Coprocessor instructions on ARM64

Last post I showed how to do multiplication for a vector of integers using ARM64 instructions. Lots of use cases require these kinds of operations to be performed in bulk. The Neon coprocessor has instructions that allow for the parallel loading and multiplication of numbers. Here’s my simplistic test of these instructions.

Continue reading

Multiplication by Halving and Doubling in AARCH64 Assembly

While multiplication is defined in the context of repeated addition, implementing it that way algorithmically is not nearly as efficient as some other approaches. One algorithm for multiplication that is an order of magnitude faster is to halve one number while doubling the other. I gave myself the challenge of implementing this algorithm in AARCH64 Assembly, and it was not too hard.

Continue reading

Understanding the MIXAL insertion sort.

A debugger is a wonderful tool for understanding what actually happens in a piece of code. Donald Knuth’s coding in TAOCP is archaic enough that I do not understand it just by reading through. This is due to a combination of my unfamiliarity with MIXAL, as well as some of the coding conventions he’s chosen. So, I’m going to step through the MIXAL code in mixvm, and annotate what I find.

Continue reading