Results 1 to 6 of 6

Thread: quicksort on program scope : -36 error

  1. #1

    quicksort on program scope : -36 error

    Hi,

    I'm working on image processing and I try to do median filter where we use sort.

    My median.cl file structure is as following :

    Code :
     
    #define TYPE int
     
    #define FILTERSIZE 1000
    #define  MAX  FILTERSIZE
     
    void quickSort(int *array, int length)//non-recursive algorithm
    {
       int  piv, begin[MAX], end[MAX];
        int i=0 ;
        ....
        //quick sort
    }
     
    void Sort(TYPE *tab, int tailleTableau){ //bubble sort
     
        int i, k;
        TYPE temp;
     
        for(k = 0; k < (tailleTableau/2)+1; k++) {
            for(i = 0; i < tailleTableau-1; i++) {
     
                if(tab[i] > tab[i+1])
                {
                    temp=tab[i];
                    tab[i]=tab[i+1];
                    tab[i+1]=temp;
                } 
     
            }
        }
    }
     
    __kernel void median(__global const TYPE *inputBuffer,
                         __global TYPE *outputBuffer,
                         __global const int *filterIndex,
                         const int filterIndexSize,
                         const int InputBufferSize)
    {
        unsigned int x = get_global_id(0);
        TYPE pixel;
        unsigned int i;
        long offsetIndex;
        TYPE neighborhood[FILTERSIZE];
     
        pixel = 0;
     
        for(i=0; i< filterIndexSize; i++)
        {
            offsetIndex  = x + filterIndex[i];
     
            if(offsetIndex >= 0 && offsetIndex < InputBufferSize )  // in image buffer bound
            {
                neighborhood[i] = inputBuffer[offsetIndex];
            }
        }
     
    //    Sort(neighborhood, filterIndexSize);
        quickSort(neighborhood, filterIndexSize);
     
        pixel = neighborhood[(int)(filterIndexSize/2)];
     
        outputBuffer[x] = pixel;   
    }

    Some little median values works, some others not. I have clEnqueueReadBuffer : CL_INVALID_COMMAND_QUEUE(-36) error.
    How can I fixe this issue? Is it an allocation faillure on GPU? Too big allocation on sort function?

    Note that sort algorithms are tested and work well. Kernel works well too without call to sort function.

    Thanks.

  2. #2
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: quicksort on program scope : -36 error

    If you are completely certain that the quicksort algorithm is correctly implemented, then it looks like your system fails to allocate so much private memory due to the large value of FILTERSIZE.

    The good news is that you don't need to sort the whole array to find the median value. That's overkill

    See http://en.wikipedia.org/wiki/Selection_algorithm. Notice how quickselect in particular can be implemented without recursion very easily and runs in average O(n) time and O(1) space. Plus since it's a variation on quicksort it should be easy to understand.
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer. LinkedIn profile.

  3. #3

    Re: quicksort on program scope : -36 error

    Thanks! I had never heard this algorithm.

    I do it with quick selection algorithm and it's better :
    - before, with 3x3x2 filter, that was ok, but 3x3x3 filter crash with clEnqueueReadBuffer : -36 error without error log
    - now, 5x5x5 :: work,
    7x7x7 :: clEnqueueReadBuffer : -36, without error log
    10x10x10 :: say clearly clEnqueueNDRangeKernel : -48 , log : ptxas error : Entry function 'median' uses too much local data (0x485a bytes, 0x4000 max)

    -> I ask myself (and you ) the following questions :
    - Should I find a way to decompose neighborhood datas ?
    - Should I pass a global neighborhood variables by clSetKernelArg ?
    - Is there other and/or better way to resolve this kind of issue?
    - 5x5x5 is enough ?

  4. #4
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: quicksort on program scope : -36 error

    Should I pass a global neighborhood variables by clSetKernelArg ?
    That's the best way I can see to make it work. You could try to store 'neighbourhood' in local memory, but since each work-item needs its own copy of the data so that it can modify (sort) it to find the median, the amount of local memory available in your system is going to limit your work-group sizes.

    For example, if you have 16KB of local memory and you want to compute a 10x10x10 convolution in floating point you can only have 4 work-items per work-group. That's clearly too little, so this way of using local memory is not a good idea.

    What yo could do is load a NxNxN block of data into local memory, then have each work-item copy a 10x10x10 piece from it into a global neighbourhood variable and sort it over there. That should reduce quite a bit the amount of bandwidth you need.
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer. LinkedIn profile.

  5. #5

    Re: quicksort on program scope : -36 error

    What yo could do is load a NxNxN block of data into local memory, then have each work-item copy a 10x10x10 piece from it into a global neighbourhood variable and sort it over there. That should reduce quite a bit the amount of bandwidth you need.
    Let me ask some questions, and correct me if I am wrong :
    -> If I have (NxNxN)xM +2 buffer size(for image), how to manage the 2 bytes left? How to determine N? Is it the nearest and low value to 'local memory maximum size'?
    -> Is NxNxN a multiple of 10x10x10?
    -> If I copy neighborhoods values of a work-item for compute(sort in my case) from local memory to global memory and each work-item doing the same, will there be conflict ?
    TY!

    P.S. : I never work with local memory, and maybe I should start.
    P.P.S. : My local memory is 16KB if needed.

  6. #6
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: quicksort on program scope : -36 error

    Is it the nearest and low value to 'local memory maximum size'?
    Right.

    Is NxNxN a multiple of 10x10x10?
    Not necessarily. Notice that two neighbouring work-items almost read the same NxNxN block of pixels (the only difference is N pixels on one side).

    If I copy neighborhoods values of a work-item for compute(sort in my case) from local memory to global memory and each work-item doing the same, will there be conflict ?
    You have to be careful to make sure that two work-items do not attempt to write to the same location in local memory. You can use the built-in function async_work_group_copy() to move data from global to local memory easily and quite efficiently.
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer. LinkedIn profile.

Similar Threads

  1. program error
    By jai in forum OpenCL
    Replies: 4
    Last Post: 10-05-2012, 04:00 AM
  2. Program Build Error
    By dlw in forum OpenCL
    Replies: 5
    Last Post: 06-28-2011, 12:34 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •