In my last article, I posted a function for calculating one partition of a larger matrix. The function looked like this:
void partial(k, i, g, M, f){
    for (m=0; m < n; m++){
        j = m * k;
        g[i] = g[i] + M[i][j] * f[i];
    }
}
This is actually wrong. Let's look at where I messed up. It was all the way back in the equation.
The equation I had looked something like this (not going to use Inkscape to do the math this time):
g[i] = sum over j=1..n of A[i,j] * f[j]
When I divided it up, I fell victim to the 1-based indexing I had been using and simply calculated j as m*k. This is wrong. If we think of j as a base-4 number, we want k to be the 4’s column (and larger) and m to be the ones column. In function form:
j = k*4 + m
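However, this implies that m is zero-based and stays below the partition width: it runs from 0 to 3 in the base-4 picture, or from 0 to n-1 when each partition holds n columns, which gives j = k*n + m.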
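To make the difference between the two index formulas concrete, here is a minimal sketch that counts how many columns each formula visits exactly once. It assumes 4 partitions of 4 columns each; WIDTH, PARTS, index_wrong, index_right and columns_hit_once are names made up for this illustration, not anything from the earlier article.

```c
#include <assert.h>

#define WIDTH 4   /* assumed columns per partition (the "base") */
#define PARTS 4   /* assumed number of partitions */

/* Index from the earlier, incorrect version. */
int index_wrong(int k, int m) { return m * k; }

/* Corrected index: k selects the partition, m the offset inside it. */
int index_right(int k, int m) { return k * WIDTH + m; }

/* Count how many of the WIDTH*PARTS columns are visited exactly once
   when k and m each run over their full zero-based range. */
int columns_hit_once(int (*index)(int, int)) {
    int hits[WIDTH * PARTS] = {0};
    for (int k = 0; k < PARTS; k++) {
        for (int m = 0; m < WIDTH; m++) {
            int j = index(k, m);
            assert(j >= 0 && j < WIDTH * PARTS);
            hits[j]++;
        }
    }
    int once = 0;
    for (int j = 0; j < WIDTH * PARTS; j++) {
        if (hits[j] == 1) once++;
    }
    return once;
}
```

Calling columns_hit_once(index_right) reports that all 16 columns are covered exactly once, while columns_hit_once(index_wrong) shows that m*k revisits some columns (column 0 in particular) and never reaches others.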
Thus the partial function should look like this:
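void partial(k, i, g, M, f){
    /* n is the number of columns in one partition */
    for (m=0; m < n; m++){
        j = k * n + m;   /* start of partition k, plus offset m */
        g[i] = g[i] + M[i][j] * f[j];   /* f is indexed by j, as in A[i,j]f[j] */
    }
}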
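As a sanity check before moving to assembly, here is a typed sketch of the same function, together with a driver that sums every partition and a plain reference product to compare against. The double element type, the fixed sizes (ROWS, PARTS, N_PART, COLS) and the helper names multiply_partitioned and multiply_direct are all assumptions made for the example, and f is indexed by j to match the equation.

```c
#include <assert.h>

#define N_PART 4                  /* assumed columns per partition (n in the text) */
#define PARTS  2                  /* assumed number of partitions */
#define COLS   (N_PART * PARTS)   /* total number of columns */
#define ROWS   2                  /* assumed number of rows */

/* Typed version of partial(): adds partition k's share of row i into g[i]. */
void partial(int k, int i, double g[ROWS], double M[ROWS][COLS],
             const double f[COLS]) {
    for (int m = 0; m < N_PART; m++) {
        int j = k * N_PART + m;   /* start of partition k, plus offset m */
        assert(j < COLS);
        g[i] += M[i][j] * f[j];
    }
}

/* Full product g = M f, built by summing every partition of every row. */
void multiply_partitioned(double g[ROWS], double M[ROWS][COLS],
                          const double f[COLS]) {
    for (int i = 0; i < ROWS; i++) {
        g[i] = 0.0;
        for (int k = 0; k < PARTS; k++) {
            partial(k, i, g, M, f);
        }
    }
}

/* Reference: the unpartitioned row-by-column product. */
void multiply_direct(double g[ROWS], double M[ROWS][COLS],
                     const double f[COLS]) {
    for (int i = 0; i < ROWS; i++) {
        g[i] = 0.0;
        for (int j = 0; j < COLS; j++) {
            g[i] += M[i][j] * f[j];
        }
    }
}
```

If the partitioning is right, multiply_partitioned and multiply_direct fill g with the same values for any M and f, since every column index j is visited once and in the same order.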
With that corrected, we can try to implement it in ARM64 assembly.