Results 1 to 5 of 5

Thread: using local memory

  1. #1
    Junior Member
    Join Date
    Feb 2010
    Posts
    10

    using local memory

    I've got a simple test kernel that writes into local memory, and then copies the data to an output buffer in global memory:

    Code :
    __kernel void foo( __global float *debug_data, __local float *shared_segment)
    {
       // works
       int tid = get_local_id(1);
       *(shared_segment + tid) = tid;
       *(debug_data + tid) = *(shared_segment + tid);

    where it's a two dimensional grid of 16K by 256. The local work group size is 1x256. For the shared memory size, I've got 16K/256 = 64 bytes a thread in a workgroup. This behaves the way I think it should: I get back 256 floats in debug_data, with values starting at 0 and going to 255.

    But now if I add a line to my test kernel:

    Code :
    __kernel void foo( __global float *debug_data, __local float *shared_segment)
    {
       // doesn't work
       int tid = get_local_id(1);
       *(shared_segment + tid) = tid;
       *(shared_segment + tid + 256) = tid;       // new line
       *(debug_data + tid) = *(shared_segment + tid);

    I get back 24 floats of valid values in debug_data, followed by zeroes. I'm completely stumped -- as far as I'm concerned, shared_segment should have room for 4096 4-byte values, and the index of my last shared_segment write is 511. I've double-checked my kernel arguments, and I think it's correct: a size_t set to ((16*1024)/256) for the size parameter, followed by a NULL.

    Can anyone point out to me what I'm misunderstanding about allocating and using local memory?

  2. #2
    Senior Member
    Join Date
    Sep 2002
    Location
    Santa Clara
    Posts
    105

    Re: using local memory

    What did you specify as the size of the local memory when calling clSetKernelArg for argument 1 which is your shared_segment variable? This is the size that will be allocated from local memory for shared_segment.

  3. #3
    Senior Member
    Join Date
    Nov 2009
    Posts
    118

    Re: using local memory

    to be able to write at address shared_segment + 511, you must have, at least, allocated a size of sizeof(cl_float) * 512.

    So (16*1024)/256 (that I don't understand) is not enough...

  4. #4
    Junior Member
    Join Date
    Feb 2010
    Posts
    10

    Re: using local memory

    Yes, I see how I goofed ...

    I was setting the kernel argument to ((16*1024)/256) because there are 256 threads per work group, and I thought the argument should be the amount of local memory _per thread_. If I change it to the size that I want for the entire work group -- on my board, 16K less some small amount of memory (<32B) for system use -- then it works fine.

    BTW, at one point I did try allocating the entire 16K; but when it failed with an out of resources error, I didn't realize it was because of the system byte allotment, and figured (incorrectly) that it needed to be a per-thread size. D'oh!

    In the Nvidia bitonic sample, I see that local memory isn't being passed through kernel arguments at all, but rather allocated wthin the kernel itself. When would you allocate local memory via a kernel argument, and when would you allocate it internally in the kernel itself, as in the bitonic sample? Is there any functional difference between the two methods at all?

    Thanks to both of you for your responses. I appreciate your willingness to help out someone on the steep end of the GPU learning curve!

  5. #5
    Senior Member
    Join Date
    Nov 2009
    Posts
    118

    Re: using local memory

    Using defines to setup array size need kernel recompilation, when you want to change this size.
    Perhaps it permits better optimization...

Similar Threads

  1. Replies: 6
    Last Post: 02-28-2013, 04:59 PM
  2. Replies: 9
    Last Post: 11-15-2010, 03:15 AM
  3. copy from global memory to local memory..problem
    By phoebe0105 in forum OpenCL
    Replies: 3
    Last Post: 06-03-2010, 02:14 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •