Thread: local declaration inside the kernel

  1. #1

    local declaration inside the kernel

    What happens when you declare a local variable inside the kernel itself? I have some code that works, but I want to make it faster, so I want to copy the global input array into a local tempArray. I can create the variable, but as soon as I write to it, the kernel fails at runtime and I can't get any errors. More specifically:

    Code :
    __kernel void simple(
    	global const float* input1,
    	global float* output,
    	constant float* input2,
    	private int numData,
    	private int numData2)
    {
    	int index = get_global_id(0);
     
    	local float tempBuffer[90000];
    	// As soon as I add this next line, whether as async_work_group_copy or using index,
    	// it fails and returns wonky values.
     
    	tempBuffer[index] = index; // <-- broken
     
    	write_mem_fence(CLK_GLOBAL_MEM_FENCE); // so we can use it later
    }

    I have it outputting an array of values that tells me whether or not it worked, and after that line it doesn't. How can I fix this?

  2. #2

    Re: local declaration inside the kernel

    I even tried changing the output array to a local variable, and it spat out random values, or the values from the previous run, but not the ones from the current run.

  3. #3

    Re: local declaration inside the kernel

    Hey gamingdrake,

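    When you write to local memory, the fence

    Code :
    write_mem_fence(CLK_GLOBAL_MEM_FENCE); // so we can use later

    does not do what you want. The CLK_GLOBAL_MEM_FENCE flag orders *global* memory accesses; for local memory you have to use CLK_LOCAL_MEM_FENCE. Note also that write_mem_fence only orders the writes of the calling work-item. To make local writes visible to the other work-items in the work-group, use barrier(CLK_LOCAL_MEM_FENCE).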

    "i have it outputting an array of values that tells me whether or not it worked, and after that line it doesnt."

    I don't understand what you mean here.

    I think what you mean is that you are trying to download tempBuffer to the host. However, the host cannot access the GPU's local memory. You first have to write the contents of tempBuffer back to global memory; then the host can access the data.
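To illustrate the write-back, here is a minimal sketch in plain C that emulates the pattern on the CPU. It is not OpenCL and not the original kernel; `GROUP_SIZE`, `emulate_staged_reverse`, and the tile-reversal operation are assumptions chosen purely for illustration. The small `tile` array plays the role of local memory and is indexed by the local id (what get_local_id(0) would return in a kernel), never by the global index.

```c
#include <stddef.h>

#define GROUP_SIZE 4 /* assumed work-group size, for illustration only */

/* CPU emulation (hypothetical helper, not OpenCL): each "work-group"
 * stages its slice of the global input in a small tile, then writes the
 * tile back to the global output in reversed order. */
void emulate_staged_reverse(const float *input, float *output, size_t n)
{
    for (size_t group_start = 0; group_start < n; group_start += GROUP_SIZE) {
        float tile[GROUP_SIZE]; /* stands in for a local array */
        size_t remaining = n - group_start;
        size_t count = remaining < GROUP_SIZE ? remaining : GROUP_SIZE;

        /* Phase 1: every work-item stores one element at its LOCAL index. */
        for (size_t lid = 0; lid < count; ++lid)
            tile[lid] = input[group_start + lid];

        /* In a real kernel, barrier(CLK_LOCAL_MEM_FENCE) would go here,
         * because phase 2 reads elements written by other work-items. */

        /* Phase 2: write back to GLOBAL memory, which the host can read. */
        for (size_t lid = 0; lid < count; ++lid)
            output[group_start + lid] = tile[count - 1 - lid];
    }
}
```

Called on the six values 1..6, this sketch leaves 4, 3, 2, 1, 6, 5 in `output`. Only `output` corresponds to something the host could read back; the tile never leaves the work-group.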

    I hope that helped

  4. #4

    Re: local declaration inside the kernel

    Thank you for your answer.
    When I say

    "i have it outputting an array of values that tells me whether or not it worked, and after that line it doesnt."

    I mean that I have output coming back from the GPU. In the first case, before it breaks, I tell it to write 32 to every element of "global float* output". After it breaks, I change it to write 77777 to every element, and it still outputs 32.

    As for CLK_GLOBAL_MEM_FENCE, I have tried using LOCAL, but it does the same thing. But because my output is a global array, I should still be able to write to it and read the values back, whether or not the fence works. It seems that anything local kills my kernel, unless it's of size 10x10 or something similarly small.
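For a sense of scale, here is a rough back-of-the-envelope check of the buffer from post #1, written as plain C rather than kernel code. The 32 KB limit is only an assumed example value; the real figure for a given device comes from querying CL_DEVICE_LOCAL_MEM_SIZE with clGetDeviceInfo. The helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed example limit: 32 KB of local memory per work-group.
 * The real value must be queried from the device
 * (clGetDeviceInfo with CL_DEVICE_LOCAL_MEM_SIZE). */
#define ASSUMED_LOCAL_MEM_BYTES ((size_t)32 * 1024)

/* An OpenCL float is 32 bits wide, i.e. 4 bytes. */
#define CL_FLOAT_BYTES ((size_t)4)

/* Bytes needed by a local float array of `count` elements (hypothetical helper). */
size_t local_float_bytes(size_t count)
{
    return count * CL_FLOAT_BYTES;
}

/* True if such an array fits within `limit_bytes` of local memory. */
bool fits_in_local(size_t count, size_t limit_bytes)
{
    return local_float_bytes(count) <= limit_bytes;
}
```

Under these assumptions, `fits_in_local(90000, ASSUMED_LOCAL_MEM_BYTES)` is false, since 90000 floats need 360000 bytes, while a 10x10 array (100 floats, 400 bytes) fits easily. That would be consistent with only the small sizes working.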

  5. #5

    Re: local declaration inside the kernel

    Hey Gamingdrake,

    The host cannot access local memory. Therefore, you cannot use a local memory array to upload and download data.

    See your other topic (viewtopic.php?f=28&t=4110) for a few more details.
