Thread: const variable / memory latency

  1. #1
    Junior Member, joined Nov 2010, 9 posts

    const variable / memory latency

    Hi,

    I have this kernel:

    __kernel void some_kernel(__global int* in_array, __global int* out_array)
    {
        uint tid = get_global_id(0);
        out_array[tid] = in_array[tid] * 3;
    }

    in_array is initialized in the beginning of my program - and never changes.

    1) Would it be beneficial if I change the kernel to "__global const int* in_array" ? And if so, why?

    2) Would it be OK to process one element per kernel - or would it be better to do a loop?

    Or in a few words: What's the best way to minimize the effects of memory latency here?

  2. #2
    Senior Member, Toronto, Canada, joined May 2010, 845 posts

    Re: const variable / memory latency

    1) Would it be beneficial if I change the kernel to "__global const int* in_array" ? And if so, why?
    I don't think it will make any difference, since 'const' can be cast away. A compiler smart enough to notice that you never remove the 'const' with a cast will also be smart enough to notice that you never write to that array.

    Something that may help the compiler more is declaring the pointers as '__global int* restrict' (the qualifier goes after the '*', because it applies to the pointer itself). Do this only if you know that, when you enqueue this kernel, you will pass a different buffer object for each of the arguments. 'restrict' in OpenCL works the same as in C99, if you are familiar with it.
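
A minimal sketch of what the two qualifiers look like on the original kernel, assuming in_array and out_array really are separate, non-overlapping buffers. The #ifndef block is not part of OpenCL: it is a hypothetical host-side shim (host_gid is a made-up name) that only exists so the kernel can also be compiled and checked as ordinary C, and an OpenCL compiler skips it.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption of this sketch, not part of OpenCL):
   lets the kernel below be compiled and checked as ordinary C99. */
#include <assert.h>
#include <stddef.h>
#define __kernel
#define __global
static size_t host_gid; /* index of the emulated work-item */
static size_t get_global_id(unsigned dim) { assert(dim == 0); return host_gid; }
#endif

/* 'const' documents that the kernel only reads in_array.
   'restrict' promises that the two pointers never refer to
   overlapping memory, so loads and stores may be reordered. */
__kernel void some_kernel(__global const int* restrict in_array,
                          __global int* restrict out_array)
{
    size_t tid = get_global_id(0);
    out_array[tid] = in_array[tid] * 3;
}
```

If the same buffer object were ever passed for both arguments, the 'restrict' promise would be broken and the behavior undefined, so this variant is only safe under the assumption stated above.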

    2) Would it be OK to process one element per kernel - or would it be better to do a loop?
    It depends on your definition of "better". If you mean "it executes faster", then the answer depends on how large your NDRange is. If you are going to execute a sufficiently large NDRange, then doing a loop will improve performance, because it saves the time it takes to start up and tear down one work-group after another. I don't think there will be any savings related to memory latency.
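
For reference, a rough sketch of what the loop variant could look like, where each work-item handles several elements and the kernel is enqueued with fewer work-items than there are elements. The kernel name some_kernel_loop and the extra argument n (the element count) are assumptions made for illustration, as is the host-only shim in the #ifndef block, which is there only so the sketch can be compiled and checked as plain C.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption of this sketch, not part of OpenCL):
   lets the kernel below be compiled and checked as ordinary C99. */
#include <assert.h>
#include <stddef.h>
#define __kernel
#define __global
static size_t host_gid;   /* index of the emulated work-item */
static size_t host_gsize; /* emulated global work size */
static size_t get_global_id(unsigned dim) { assert(dim == 0); return host_gid; }
static size_t get_global_size(unsigned dim) { assert(dim == 0); return host_gsize; }
#endif

/* Each work-item starts at its own global id and then jumps ahead by
   the total number of work-items until it runs past the end. */
__kernel void some_kernel_loop(__global const int* in_array,
                               __global int* out_array,
                               unsigned int n) /* hypothetical element count */
{
    size_t stride = get_global_size(0);
    for (size_t i = get_global_id(0); i < n; i += stride) {
        out_array[i] = in_array[i] * 3;
    }
}
```

With this stride, neighbouring work-items still touch neighbouring elements on every iteration, which is the access pattern many GPUs handle best. Whether the loop is actually faster than one element per work-item depends on the device and the NDRange size, so it is worth measuring both.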

    Personally, I would code for readability and let the driver/compiler decide about these kinds of optimizations.
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer.

  3. #3
    Member, Vancouver, Canada, joined Oct 2010, 65 posts

    Re: const variable / memory latency

    On some devices, and for arrays small enough to fit in constant memory, declaring the argument as __constant int* in_array might be an improvement.
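
A sketch of that variant, assuming the input array is no larger than the limit the device reports for CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE (queried with clGetDeviceInfo). The kernel name some_kernel_constant is made up for illustration, and the #ifndef block is a hypothetical host-only shim so the sketch can be compiled and checked as plain C.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption of this sketch, not part of OpenCL):
   lets the kernel below be compiled and checked as ordinary C99. */
#include <assert.h>
#include <stddef.h>
#define __kernel
#define __global
#define __constant const
static size_t host_gid; /* index of the emulated work-item */
static size_t get_global_id(unsigned dim) { assert(dim == 0); return host_gid; }
#endif

/* The input now lives in the __constant address space, which is
   read-only inside the kernel; the output stays in __global memory. */
__kernel void some_kernel_constant(__constant int* in_array,
                                   __global int* out_array)
{
    size_t tid = get_global_id(0);
    out_array[tid] = in_array[tid] * 3;
}
```

On the host side the argument is still an ordinary buffer object; only the kernel signature changes. Whether this helps depends on the device, so treat it as something to benchmark rather than a guaranteed win.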
