09-27-2012, 01:30 PM
I have some functions that work with float8. During optimization it turned out that math on vectors of the native length, such as float4, runs faster. I don't want to rewrite all the code I have; I just want to split each float8 function into two float4 function calls.

void f_native(float4 a)
{
    // do something in vector4 math
}

void f(float8 a)
{
    float4* ta = &a;  // the compiler warns about this pointer conversion
}

The NVIDIA SDK issues warnings about this code. Is there a proper way to do this conversion without expensive performance overhead?

09-27-2012, 11:33 PM
It is my understanding that OpenCL does not allow indexing into built-in vector types using pointers. If you wish to accomplish this, you should use the component selectors .hi, .lo, etc. I believe the reason is that different vendors may store these types in different ways, for example on big-endian vs. little-endian devices.
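
For illustration, here is a minimal sketch of how the split could look with .lo and .hi, written as OpenCL C device code. The body of f_native is a placeholder, and scale4 / scale8 are hypothetical names added only to show how two float4 results might be packed back into a float8; neither comes from the original post. I have not measured this on any device, so treat the performance side as something to check yourself.

```c
// OpenCL C device code (a sketch, not a tested implementation).

void f_native(float4 a)
{
    // ... vector4 math on a ...
}

void f(float8 a)
{
    // .lo selects components s0..s3 and .hi selects s4..s7.
    // Both are float4 values, so no pointer conversion is involved.
    f_native(a.lo);
    f_native(a.hi);
}

// Hypothetical helper that returns a value, to show the reverse direction.
float4 scale4(float4 a)
{
    return a * 2.0f;
}

float8 scale8(float8 a)
{
    // A float8 can be constructed from two float4 halves.
    return (float8)(scale4(a.lo), scale4(a.hi));
}
```

Since .lo and .hi are plain component selections on a value, a compiler can typically resolve them without extra memory traffic, but that depends on the vendor's implementation.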