
Thread: __global vs __constant qualifier in OpenCL

  1. #1
    Junior Member
    Join Date
    Aug 2012
    Posts
    18

    __global vs __constant qualifier in OpenCL

    I want an array variable to have program scope.

    One way to do this is to pass a pointer to the array as a function argument throughout the program, which might get complex when multiple functions read and write the array.
    A second way is to declare the variable at program scope. As per the OpenCL specification, program-scope variables are declared in the program source with the __constant qualifier and are accessed as read-only variables.

    //I am writing one sample program to demonstrate my problem:
    __constant uint arr[2] = {0, 0}; // an array of unsigned integers
    void func1(uint tmp)
    {
        for (int i = 0; i < 2; i++)
            arr[i] = tmp + i;
    }
    void func2(uint tmp)
    {
        for (int i = 0; i < 2; i++)
            tmp = arr[i];
    }
    __kernel void demoKernel(__global uint *input,
                             __global uint *output)
    {
        uint index = get_global_id(0);
        func1(input[index]);
        func2(output[index]);
    }

    When I compiled this, I got the following error:

    tmp/OCLrnlEIO.cl", line 5: error: expression must be a modifiable lvalue
    arr[i] = tmp+i;
    I searched Google for this error and found that it supposedly occurs because the type of "arr" is an array of length 2 (it is not a pointer).

    So my questions are:

    1. What is the reason for this error, and how can I fix it?
    2. My requirement is not only to read the array but also to write to it, so how should I use the __constant qualifier, which makes the variable read-only?

    Thanks in advance!

  2. #2
    Senior Member
    Join Date
    Oct 2012
    Posts
    111
    1. As its name suggests, a __constant variable cannot be modified. So a line such as 'arr[i] = tmp+i' is incorrect.

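    2. A variable defined at program scope can be either __constant or, from OpenCL 2.0 onward, __global (OpenCL 1.x allows only __constant at program scope). So you could declare arr as __global if your implementation supports it.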
    However, your program is written so that arr should store a different value for each work-item, so its correct address space is __private, and arr cannot be declared __private at program scope.

    You must declare arr as a __private array inside demoKernel and pass it as an argument to func1 and func2.
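
    As a minimal sketch of this suggestion (an illustration under stated assumptions, not the original poster's final code): arr becomes a __private array declared inside demoKernel and is passed to the helpers by pointer, while a read-only table stays at program scope as __constant. The offsets table, the ARR_LEN macro, the sum returned by func2, and the write to output are assumptions added for illustration. The shim at the top is also hypothetical and exists only so the same source can be compiled as plain C for a quick host-side check; an OpenCL compiler skips it.

```c
/* Host-side shim (assumption): lets this sketch compile as plain C.
   An OpenCL compiler defines __OPENCL_VERSION__ and skips this section. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
typedef unsigned int uint;
#define __kernel
#define __global
#define __private
#define __constant static const
static size_t fake_global_id = 0;   /* hypothetical stand-in for the work-item id */
static size_t get_global_id(uint dim) { (void)dim; return fake_global_id; }
#endif

#define ARR_LEN 2

/* Read-only data is a legitimate use of __constant at program scope. */
__constant uint offsets[ARR_LEN] = {0, 1};

/* Writes into the caller's per-work-item array. */
void func1(__private uint *arr, uint tmp)
{
    for (int i = 0; i < ARR_LEN; i++)
        arr[i] = tmp + offsets[i];
}

/* Reads the array back; returning the sum is an assumption for illustration. */
uint func2(const __private uint *arr)
{
    uint sum = 0;
    for (int i = 0; i < ARR_LEN; i++)
        sum += arr[i];
    return sum;
}

__kernel void demoKernel(__global const uint *input,
                         __global uint *output)
{
    size_t index = get_global_id(0);
    __private uint arr[ARR_LEN];    /* each work-item gets its own copy */
    func1(arr, input[index]);
    output[index] = func2(arr);
}
```

    Because arr lives on each work-item's private storage, work-items cannot overwrite each other's values, which is what would happen with a single shared array at program scope.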

  3. #3
    Junior Member
    Join Date
    Aug 2012
    Posts
    18
    Thanks for helping me out!
    One more thing I want to know: assuming the number of elements in arr is 44, would it be more efficient to use the __local address space instead of __private?
    I thought of using the __local address space for this, but I am not sure whether shared memory would help in this case, since I know that shared memory is only useful if the data is accessed more than once, either within the same thread or from different threads within a block.

    Please share any insight on how to get better performance in this scenario.

  4. #4
    Senior Member
    Join Date
    Oct 2012
    Posts
    111
    __local memory is used to share data among work-items of the same work-group, which implies that your algorithm can be split into blocks.
    In your simple case, arr contains temporary data relative to a work-item, not to a work-group, so its natural memory space is __private.
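    Private memory is generally the fastest, and GPUs generally have more private memory than local memory. Keep in mind that a large private array may not fit in registers and can be spilled to slower memory, so it is worth measuring.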

  5. #5
    Junior Member
    Join Date
    Aug 2012
    Posts
    18
    Thank you,
    I used the __private address space and got the result a little faster than with __local.
