Slow addition of vector components using float16?!
