While asm.js will improve the core computations,
I haven't seen anything in that spec that would alleviate the
copy problem. As such, I think that either a well-written JS
library or possibly an API like the one being proposed
is still the best bet for performance.
Of course, the situation is different if the matrix
library is being used by a larger Emscripten build (like a
physics library), in which case I absolutely believe that a
C++ library is the best option, but that's a very different
situation from the one the proposed API is trying to address.
Beyond that point, however, I generally agree with
your assessment of the API.
--Brandon