Using non-square rectangular blocking for a matrix multiplication kernel
