Results 1 to 2 of 2

Thread: How OpenCL __private address space is mapped on GPU?

Threaded View

  1. #1
    Junior Member
    Join Date
    Aug 2012
    Posts
    18

    How OpenCL __private address space is mapped on GPU?

    OpenCL spec says that "All variables inside a function (including __kernel functions), or passed into the function as arguments are in the __private or private address space. Variables declared as pointers are considered to point to the __private address space if an address space qualifier is not specified".

    I have few questions on this:
    1. I want to know how these variables mapped to GPU' resources (i.e. to registers, or local memory or global memory)? Actually I know that these variables are stored in GPU's registers which is on-chip. Is that correct or is it stored in off-chip memory?

    2. As in the given example below:

    void func1(__global uint *input, __global uint *output, __private uint *arr)
    {
    //some stuff
    }
    __kernel void demoKernel(__global uint *d_input,
    __global uint *d_output,
    int d_maxSize)
    {
    long index = get_global_id(0);

    __private uint arr[12];
    func1(d_input[index], d_output[index], arr);
    }

    I am using Nvidia' Kepler K20 GPU card in Ubuntu with OpenCL platform version 1.1 .
    To find the number of resource usage info in the above example, I used "-cl-nv-verbose -cl-nv-maxrregcount=122" flag while building OpenCL kernel.
    The resource usage info is given below:

    a. when arr was declared as __private address space
    ptxas info : Compiling entry function 'demoKernel' for 'sm_35'
    ptxas info : Function properties for demoKernel
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
    ptxas info : Used 38 registers, 332 bytes cmem[0] : 249

    b. when arr was declared as __local address space
    ptxas info : Compiling entry function 'demoKernel' for 'sm_35'
    ptxas info : Function properties for demoKernel
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
    ptxas info : Used 44 registers, 48+0 bytes smem, 332 bytes cmem[0] : 266

    how number of registers usage got decreased when i use __private address space in place of __local? I think that when we use __private address space, number of register usage should be increased but that didn't happen above.

    Thanks !!
    Last edited by Gopal_HC; 08-27-2013 at 04:40 AM.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •