
Thread: Local memory allocation

  1. #1
    Junior Member
    Join Date
    Jun 2010
    Posts
    11

    Local memory allocation

    Hi everyone,
    I've read somewhere (some forum I cannot recall right now) that allocating local ("shared" in NVIDIA CUDA nomenclature) memory statically, like below, should be avoided since it's implementation dependent:
    Code :
    __local float  s_elData[32];
    Dynamic allocation using kernel arguments and clSetKernelArg should be used instead:
    Code :
    __kernel void kernelP1(
        __local float* s_elData,
        //...
    and (in host code):
    Code :
    clSetKernelArg(kernel, 1, 32 * sizeof(float), NULL);
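    For reference, here is a minimal sketch of how the two sides fit together; the argument index 1 and the g_in/g_out buffer names are assumptions for illustration, not taken from my real code:
    Code :
    // Kernel side: the __local buffer arrives as a pointer parameter.
    __kernel void kernelP1(
        __global const float* g_in,     // hypothetical input buffer
        __local  float*       s_elData, // local scratch, sized by the host
        __global float*       g_out)    // hypothetical output buffer
    {
        // ... use s_elData as work-group scratch space ...
    }
    and (in host code):
    Code :
    /* Host side: the pointer value is NULL and only the size is given,
       so the runtime reserves 32 floats of local memory per work-group.
       The index 1 must match the position of s_elData in the signature. */
    cl_int err = clSetKernelArg(kernel, 1, 32 * sizeof(float), NULL);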

    Unfortunately, when I use the latter method my register usage increases from 14 to 19, with no other change in the code, just the way of allocation. So I'd rather stick to the former (static) method of allocation. Is it safe, or should it really be avoided?
    Thanks!

  2. #2
    Senior Member
    Join Date
    Nov 2009
    Posts
    118

    Re: Local memory allocation

    Where do you define this variable?
    At program scope I guess?

  3. #3
    Junior Member
    Join Date
    Jun 2010
    Posts
    11

    Re: Local memory allocation

    Quote Originally Posted by matrem
    Where do you define this variable?
    At program scope I guess?
    You mean when allocating statically? In the kernel, like this:
    Code :
    __kernel void K(
        //.. kernel args
    ){
        //definition of s_el
        __local float s_el[32];
        //.. download data from global to s_el, make computations in parallel, store results from s_el back to global
    }

    So kernel scope. I only need it to download some data from global memory, then perform a lot of computations in the kernel and store the results back to global memory.
    It's working (on the NVIDIA OpenCL implementation) and the register consumption is lower than if I allocated dynamically with kernel arguments and clSetKernelArg (the s_el array is always a constant size, so I don't need dynamic allocation). Is this way of defining variables in local memory all right?
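    In case it helps, a minimal sketch of that load/compute/store flow with the statically declared local array; it assumes a 1-D work-group of 32 work-items, and g_in, g_out and the computation step are placeholders:
    Code :
    __kernel void K(__global const float* g_in,
                    __global float*       g_out)
    {
        __local float s_el[32];           // statically sized local scratch
    
        int lid = get_local_id(0);
        int gid = get_global_id(0);
    
        // each work-item loads one element from global into local memory
        s_el[lid] = g_in[gid];
        barrier(CLK_LOCAL_MEM_FENCE);     // make the loads visible to the whole work-group
    
        // ... parallel computations that read and write s_el ...
    
        barrier(CLK_LOCAL_MEM_FENCE);     // make sure the results in s_el are final
        g_out[gid] = s_el[lid];           // store results back to global memory
    }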

  4. #4
    Senior Member
    Join Date
    Nov 2009
    Posts
    118

    Re: Local memory allocation

    I guess it's right.
    Perhaps it's a bug in the NVIDIA implementation.

  5. #5
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: Local memory allocation

    Quote Originally Posted by pplaszew
    Hi everyone,
    I've read somewhere (some forum I cannot recall right now) that allocating local ("shared" in NVIDIA CUDA nomenclature) memory statically like below should be avoided since it's implementation dependent
    You are doing things fine. Declaring a local variable at kernel scope is perfectly legal. See section 6.5.2 of the CL 1.1 spec; there's even an example. Don't worry about that.
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer.
