
Thread: Allocate array in a kernel of length known only at runtime

  1. #1
    Junior Member
    Join Date
    Mar 2012
    Posts
    29

    Allocate array in a kernel of length known only at runtime

    An array arr is passed to the kernel with some data. Inside the kernel I need a temporary array of the same size as arr. How can I allocate it? I tried passing the size of arr from the host to the kernel and declaring the array with
    Code :
    float tmp[arrSize];
    but the compiler rejects this because arrSize is not a compile-time constant. malloc() is not supported by OpenCL C.

    How can I create a temporary array of the same size as an existing one inside a kernel?

  2. #2
    Junior Member
    Join Date
    Apr 2012
    Posts
    6

    Re: Allocate array in a kernel of length known only at runti

    My understanding (and I'm no expert) is that you can't. I think you can instead have a buffer in local (shared) memory allocated from the host code and passed as an argument to your kernel.

    clSetKernelArg(kernel, 0, 16*sizeof(float), NULL);

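    Passing NULL as the last argument, together with a non-zero size, tells the runtime (not the compiler) to allocate space for 16 floats in fast local memory. The kernel parameter at that index has to be declared as a __local pointer, and the size can be computed at runtime.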
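    For what it's worth, here is a rough sketch of what the two sides might look like. The kernel name use_scratch, its parameter names and the helper local_scratch_bytes are made up for illustration, and the local buffer sits at argument index 1 here rather than 0 as in the call above. The kernel source is kept in a string, the way it would be handed to clCreateProgramWithSource.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: tmp is a work-group-wide scratch array whose
   size is chosen by the host at runtime, not by the kernel source. */
const char *local_scratch_kernel_src =
    "__kernel void use_scratch(__global const float *arr,\n"
    "                          __local float *tmp,\n"
    "                          const uint arrSize)\n"
    "{\n"
    "    /* Work items of the group cooperatively copy arr into tmp. */\n"
    "    for (uint i = get_local_id(0); i < arrSize; i += get_local_size(0))\n"
    "        tmp[i] = arr[i];\n"
    "    barrier(CLK_LOCAL_MEM_FENCE);\n"
    "}\n";

/* Size in bytes to pass as arg_size for the __local parameter
   (assumption: one float of scratch per element of arr). */
size_t local_scratch_bytes(size_t arr_size)
{
    assert(arr_size > 0);
    return arr_size * sizeof(float);
}
```

    On the host this would be used as clSetKernelArg(kernel, 1, local_scratch_bytes(arrSize), NULL). Note that tmp is shared by the whole work-group, so it is one scratch array per group, not one per work item.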

  3. #3
    Senior Member
    Join Date
    Sep 2002
    Location
    Santa Clara
    Posts
    105

    Re: Allocate array in a kernel of length known only at runti

    One option is to create the temporary array on the host as a cl_mem object of the same size as arr, using clCreateBuffer, and pass it to the kernel as an additional argument.

  4. #4
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: Allocate array in a kernel of length known only at runti

    option 1: recompile the kernel with a #define that matches the problem size, or some upper limit on the problem size.

    option 2: use local memory, and manually make sure each work item is working in its own pool (use local work id + index * local work size as the index to avoid bank conflicts). This only works if the total is small enough to fit. You need to allocate arrSize * local work size elements so that each item has its own block.

    option 3: pass in a global memory buffer, allocated on the host and big enough for everything, and manually make sure each work item is working in its own pool. Probably use similar indexing to the above so that accesses are coalesced, i.e. you need to allocate arrSize * global work size elements so that each work item has its own block.

    option 1 is the easiest if you know the problem is bounded by some reasonable upper limit.
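    A minimal sketch of option 1, assuming the kernel refers to a macro called ARR_SIZE (a name picked here for illustration, as are use_private and format_build_options). The host formats a -D build option and the kernel then declares an ordinary fixed-size array.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical kernel: ARR_SIZE is not defined in the source and
   has to be supplied through the program build options. */
const char *fixed_size_kernel_src =
    "__kernel void use_private(__global const float *arr)\n"
    "{\n"
    "    float tmp[ARR_SIZE];  /* legal: ARR_SIZE is a compile-time constant */\n"
    "    for (int i = 0; i < ARR_SIZE; ++i)\n"
    "        tmp[i] = arr[i];\n"
    "}\n";

/* Writes "-D ARR_SIZE=<arr_size>" into buf.
   Returns 0 on success, -1 if buf is too small. */
int format_build_options(char *buf, size_t buf_len, size_t arr_size)
{
    assert(buf != NULL && buf_len > 0);
    int written = snprintf(buf, buf_len, "-D ARR_SIZE=%zu", arr_size);
    return (written < 0 || (size_t)written >= buf_len) ? -1 : 0;
}
```

    The resulting string goes in as the options argument of clBuildProgram. The price is a rebuild whenever the size changes, which is why building once for an upper limit can be the simpler route.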

    option 3 is the closest to how a runtime implements the private array of option 1 internally - 'private arrays' are just private ranges of global memory.
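    To make the option 3 layout concrete, here is a sketch under the assumption of a one-dimensional launch. The names use_global_scratch, scratch, scratch_elements and scratch_index are invented for the example. Element i of a work item's block is stored at i * global size + global id, so neighbouring work items touch neighbouring addresses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: every work item gets arrSize floats of the
   host-allocated scratch buffer, interleaved with the other items. */
const char *global_scratch_kernel_src =
    "__kernel void use_global_scratch(__global const float *arr,\n"
    "                                 __global float *scratch,\n"
    "                                 const uint arrSize)\n"
    "{\n"
    "    const size_t gid = get_global_id(0);\n"
    "    const size_t n   = get_global_size(0);\n"
    "    for (uint i = 0; i < arrSize; ++i)\n"
    "        scratch[i * n + gid] = arr[i];\n"
    "}\n";

/* Number of floats the host must allocate for the scratch buffer. */
size_t scratch_elements(size_t arr_size, size_t global_work_size)
{
    return arr_size * global_work_size;
}

/* Host-side mirror of the kernel's index formula. */
size_t scratch_index(size_t i, size_t work_item, size_t global_work_size)
{
    assert(work_item < global_work_size);
    return i * global_work_size + work_item;
}
```

    The buffer passed to clCreateBuffer would then be scratch_elements(arrSize, globalSize) * sizeof(float) bytes. The same formula with the local id and local size gives the layout for option 2.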

  5. #5
    Junior Member
    Join Date
    Apr 2012
    Posts
    6

    Re: Allocate array in a kernel of length known only at runti

    option 3 is the closest to how a runtime implements 1 internally - 'private arrays' are just private ranges of global memory.
    So, an array defined within a kernel like

    float temp[12];

    is actually allocated in global (slow) memory?

  6. #6
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: Allocate array in a kernel of length known only at runti

    It depends on how you access it. If you use fixed indices or at least indices which are known at compile time, it should be registerised if it can fit the register file.

    If you use dynamic indices or it is too big then yes, it goes into global memory - there's nowhere else for it to go.

    The only real private memory a GPU has is registers. The next closest thing is local memory, which can be used in a private way if you address it properly. I almost always use local memory in this way if I need an internal private array and I have space.

    This information is in the various programming guides from the vendors and has been mentioned on forums before, e.g. see section 4.9, page 4-43 of the AMD APP programming guide 1.3f - much of that is representative of all GPU hardware.
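    As a sketch of the "if I have space" check for using local memory privately: the helper names below are made up, and local_mem_bytes stands for whatever clGetDeviceInfo reports for CL_DEVICE_LOCAL_MEM_SIZE. Inside the kernel the matching access would be tmp[i * get_local_size(0) + get_local_id(0)] on a __local float *tmp parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Local memory needed when every work item in a group gets its own
   block of arr_size floats. */
size_t private_local_bytes(size_t arr_size, size_t local_work_size)
{
    return arr_size * local_work_size * sizeof(float);
}

/* Nonzero if the per-item blocks fit in the device's local memory.
   local_mem_bytes is assumed to come from CL_DEVICE_LOCAL_MEM_SIZE. */
int fits_in_local(size_t arr_size, size_t local_work_size,
                  size_t local_mem_bytes)
{
    return private_local_bytes(arr_size, local_work_size) <= local_mem_bytes;
}

/* Interleaved index of element i in the block owned by work item lid. */
size_t local_index(size_t i, size_t lid, size_t local_work_size)
{
    assert(lid < local_work_size);
    return i * local_work_size + lid;
}
```

    If fits_in_local comes back zero, the global memory layout from option 3 is the fallback. Other kernels' local usage also counts against the limit, so treat the check as optimistic.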

  7. #7
    Junior Member
    Join Date
    Apr 2012
    Posts
    6

    Re: Allocate array in a kernel of length known only at runti

    Thanks for the explanation
