Thread: Float4* VS Float*

  1. #1
    Junior Member
    Join Date
    Nov 2012
    Posts
    2

    Float4* VS Float*

    I'm new to OpenCL. I'm looking at code written by somebody else, and it looks like this:

    Code :
    struct Scene
    {
        __global float* vertics;
        ...
    };

    "vertics" is an array of floats, but it holds both the POSITION and the NORMAL of each vertex.
    To get the position, we have this function:


    Code :
    inline float4 GetVertexPosition(__local struct Scene *s, uint vertexID)
    {
        // Each vertex occupies 8 floats; the position is stored in floats 0..2.
        __global float* offset = s->vertics + vertexID * 8;

        return (float4)(*offset,
                        *(offset + 1),
                        *(offset + 2),
                        1.0f);
    }

    and to get the normal:


    Code :
    inline float4 GetVertexNormal(__local struct Scene *s, uint vertexID)
    {
        // The normal is stored in floats 4..6 of the same 8-float vertex.
        __global float* offset = s->vertics + vertexID * 8;

        return (float4)(*(offset + 4),
                        *(offset + 5),
                        *(offset + 6),
                        0.0f);
    }


    I know that in HLSL it's better to use float4 directly whenever we can, so I tried this simple change to see whether it helps:


    Code :
    struct Scene
    {
        __global float4* vertics;
        ...
    };


    Code :
    inline float4 GetVertexPosition(__local struct Scene *s, uint vertexID)
    {
        return s->vertics[vertexID * 2];
    }



    Code :
    inline float4 GetVertexNormal(__local struct Scene *s, uint vertexID)
    {
        return s->vertics[vertexID * 2 + 1];
    }




    I profiled both versions. The first one (float*) is faster. Not by a huge margin, but still faster. I thought it would be faster to use float4* directly instead of reading through a float* and building a float4.

    I use the same buffer in both cases, so the alignment should be the same. The only changes are the ones shown above.

    Can somebody explain why it's faster to use float*?
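
    To double-check that both versions really address the same data, the two indexing schemes can be compared on the host. The sketch below is plain C rather than OpenCL C, and all helper names are hypothetical (they are not part of the original code). It assumes the layout described above: 8 floats per vertex, with the position starting at float 0 and the normal at float 4.

```c
#include <stddef.h>

/* Layout assumed from the post: 8 floats per vertex,
   position in floats 0..2, normal in floats 4..6. */
enum { FLOATS_PER_VERTEX = 8, FLOATS_PER_FLOAT4 = 4, NORMAL_FLOAT_OFFSET = 4 };

/* float* version: index of the first float of the position / normal. */
size_t position_float_index(size_t vertex_id)
{
    return vertex_id * FLOATS_PER_VERTEX;
}

size_t normal_float_index(size_t vertex_id)
{
    return vertex_id * FLOATS_PER_VERTEX + NORMAL_FLOAT_OFFSET;
}

/* float4* version: index of the float4 element holding the position / normal. */
size_t position_float4_index(size_t vertex_id)
{
    return vertex_id * 2;
}

size_t normal_float4_index(size_t vertex_id)
{
    return vertex_id * 2 + 1;
}

/* Convert either kind of index into a byte offset from the buffer start. */
size_t float_index_to_bytes(size_t float_index)
{
    return float_index * sizeof(float);
}

size_t float4_index_to_bytes(size_t float4_index)
{
    return float4_index * FLOATS_PER_FLOAT4 * sizeof(float);
}
```

    For any vertex, both schemes give the same byte offset, so the two versions start reading at the same addresses. Two differences remain: the float4* version loads all four components (16 bytes instead of 12), and it returns whatever is stored in the fourth component, whereas the float* version replaces it with 1.0f for positions and 0.0f for normals.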

    Thanks

  2. #2
    Senior Member
    Join Date
    Oct 2012
    Posts
    166

    Re: Float4* VS Float*

    Hi,

    There shouldn't be a large difference in execution time. The main reason float4 is usually faster on a GPU is that the architecture is optimized for float4 data. The memory controller always fetches data in chunks of 128 bytes. Look up coalesced memory access to get a better idea of the problem.
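
    As a rough illustration of the coalescing argument, the following host-side C sketch counts how many memory chunks a group of consecutive work-items touches. It is a simplified model, not a description of any specific GPU: the function name is hypothetical, the 128-byte chunk size is taken from the paragraph above, and the group size of 32 work-items used in the note below is an assumption.

```c
#include <stddef.h>

/* Count the distinct chunks touched when `items` consecutive work-items
   each read `bytes_per_read` bytes, with work-item i starting at byte
   offset i * stride_bytes. Assumes bytes_per_read >= 1 and chunk_bytes >= 1.
   Offsets never decrease with i, so chunks are visited in ascending order. */
size_t chunks_touched(size_t items, size_t stride_bytes,
                      size_t bytes_per_read, size_t chunk_bytes)
{
    size_t count = 0;
    size_t next_uncounted = 0; /* lowest chunk index not yet counted */

    for (size_t i = 0; i < items; ++i) {
        size_t first = (i * stride_bytes) / chunk_bytes;
        size_t last = (i * stride_bytes + bytes_per_read - 1) / chunk_bytes;

        for (size_t c = first; c <= last; ++c) {
            if (c >= next_uncounted) {
                ++count;
                next_uncounted = c + 1;
            }
        }
    }
    return count;
}
```

    With the 32-byte vertex stride from the question, 32 work-items touch 8 chunks whether each one reads 12 bytes (three floats) or 16 bytes (one float4), which fits the expectation that the two versions should perform similarly. A tightly packed float4 array (16-byte stride) would touch only 4 chunks under the same assumptions.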

    Greetings,
    clint3112

  3. #3
    Junior Member
    Join Date
    Nov 2012
    Posts
    2

    Re: Float4* VS Float*

    Thanks, I will read about coalesced memory access.

    As far as I know, the GPU is optimized for float4 and all registers are float4. That is why I tried this change.
