Float4* VS Float*



qqchose
11-19-2012, 12:12 PM
I'm new to OpenCL. I'm looking at code written by somebody else, and it looks like this:



struct Scene
{
__global float* vertics;
...
};


"vertics" is an array of floats, but it holds both the POSITION and the NORMAL of each vertex.
To get the position we have this function:




inline float4 GetVertexPosition(__local struct Scene *s, uint vertexID)
{
    // Each vertex takes 8 floats; the position is in floats 0..2.
    __global float* offset = s->vertics + vertexID * 8;

    return (float4)(*offset,
                    *(offset + 1),
                    *(offset + 2),
                    1.0f);
}

and this one to get the normal:




inline float4 GetVertexNormal(__local struct Scene *s, uint vertexID)
{
    // Same 8-float stride; the normal is in floats 4..6.
    __global float* offset = s->vertics + vertexID * 8;

    return (float4)(*(offset + 4),
                    *(offset + 5),
                    *(offset + 6),
                    0.0f);
}


I know that in HLSL it's better to use float4 directly whenever we can, so I tried this simple change to see if it would be faster:




struct Scene
{
__global float4* vertics;
...
};




inline float4 GetVertexPosition(__local struct Scene *s, uint vertexID)
{
return s->vertics[vertexID * 2];
}






inline float4 GetVertexNormal(__local struct Scene *s, uint vertexID)
{
return s->vertics[vertexID * 2 + 1];
}





I profiled both versions. The first one is faster. Not a huge difference, but still faster. I thought it would be faster to use float4* directly instead of reading through a float* and building a float4.

I use the same buffer in both cases, so the alignment should be the same. The only thing I changed is what I showed above.

Can somebody explain why it's faster to use float*?
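
To make the "same buffer" point concrete, here is a minimal host-side sketch in plain C (not OpenCL, and not part of the original code) that checks the two indexing schemes land on the same floats. The helper names are hypothetical; the only assumption taken from the code above is the layout of 8 floats per vertex with the normal starting at float 4. One difference it does not show: the float4* version returns whatever is stored in the fourth float of each half, while the float* version forces 1.0f and 0.0f there.

```c
#include <stddef.h>

/* Layout assumed from the thread: 8 floats per vertex,
   position in floats 0..2, normal in floats 4..6. */
enum { FLOATS_PER_VERTEX = 8, FLOATS_PER_FLOAT4 = 4 };

/* float* version: index of the first float of the position. */
size_t position_float_index(size_t vertex_id)
{
    return vertex_id * FLOATS_PER_VERTEX;
}

/* float* version: index of the first float of the normal. */
size_t normal_float_index(size_t vertex_id)
{
    return vertex_id * FLOATS_PER_VERTEX + 4;
}

/* float4* version: element index of the position. */
size_t position_float4_index(size_t vertex_id)
{
    return vertex_id * 2;
}

/* float4* version: element index of the normal. */
size_t normal_float4_index(size_t vertex_id)
{
    return vertex_id * 2 + 1;
}

/* Converts a float4 element index into the index of its first float. */
size_t float4_to_float_index(size_t float4_index)
{
    return float4_index * FLOATS_PER_FLOAT4;
}
```

For any vertex, converting the float4 index back to a float index gives the same starting float as the float* version, so both kernels read the same addresses and only the way the loads are issued differs.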

Thanks

clint3112
11-20-2012, 12:31 AM
Hi,

There shouldn't be a big difference in execution time. The main reason float4 is usually faster on GPUs is that the architecture is optimized for float4 data. The memory controller always fetches data in chunks of 128 bytes. Look up coalesced memory access to get a better idea of the problem.
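
As a rough illustration of the chunk idea, the plain C sketch below counts how many chunks a group of consecutive work-items touches. It is a simplified model, not a description of any particular GPU: the function name is hypothetical, the 128-byte chunk size is simply the figure quoted above, and real hardware may use other sizes and rules.

```c
#include <stddef.h>

/* Counts the distinct chunks touched when n_items consecutive work-items
   each read read_bytes bytes, with item i starting at byte i * stride_bytes.
   Offsets only grow with i, so chunks are visited in nondecreasing order
   and it is enough to remember the last chunk counted. */
size_t chunks_touched(size_t n_items, size_t stride_bytes,
                      size_t read_bytes, size_t chunk_bytes)
{
    size_t count = 0;
    size_t last_chunk = 0;
    int have_last = 0; /* becomes 1 once the first chunk is counted */

    for (size_t i = 0; i < n_items; ++i) {
        size_t first = (i * stride_bytes) / chunk_bytes;
        size_t last = (i * stride_bytes + read_bytes - 1) / chunk_bytes;

        for (size_t c = first; c <= last; ++c) {
            if (!have_last || c > last_chunk) {
                ++count;
                last_chunk = c;
                have_last = 1;
            }
        }
    }
    return count;
}
```

Under this model, 32 work-items reading one 16-byte float4 each at the 32-byte vertex stride from the question touch 8 chunks, and so do 32 work-items reading a single 4-byte float at the same stride. A tightly packed float4 array (16-byte stride) would touch only 4. In other words, with this interleaved layout both versions of the kernel pull in the same chunks, which fits the observation that the timing difference is small.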

Greetings,
clint3112

qqchose
11-20-2012, 05:41 AM
Thanks, I will read up on coalesced memory access.

I know GPUs are optimised for float4 :). All registers are float4. That's why I tried this change :p.