Thread: __local atomic in opencl

  1. #1

    __local atomic in opencl

    I want to implement an operation like a dot product, just to exemplify the use of OpenCL atomics.
    The dot product, as everybody knows, is the sum of the pairwise products of the components:

    sum = x1*x2 + y1*y2 + z1*z2

    I'm interested only in the sum value. I thought I could declare a "__local int sum" to accumulate the results.
    In this program I'm just incrementing the value, which should illustrate the accumulation pattern...

    Code :
    #pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable
     
    __kernel void test(global int * vec, const int numOperations)
    {
        __local int sum;
        if (get_global_id(0) < numOperations) {
            atom_inc(&num, 1);
        }
    }

    But this doesn't work; it crashes.

    What am I doing wrong?

  2. #2

    Re: __local atomic in opencl

    Quote Originally Posted by orochimaster (post #1 above)

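    You're using a 32-bit atomic but only enabling the 64-bit ones (for a 32-bit __local int on OpenCL 1.0 the relevant extension is cl_khr_local_int32_base_atomics). What is the actual 'crash' you're getting? As far as I can tell, it should just fail to compile.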

    Is an integer dot product much use for anything practical? Apart from anything else, you will very quickly get overflows (it could take as little as one element).
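
    To make the overflow point concrete, here is a minimal host-side C sketch (not OpenCL kernel code). The function name product_overflows_int32 is made up for illustration; it widens to 64 bits so that the check itself cannot overflow.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if a*b does not fit in a 32-bit int,
   0 otherwise. The product is computed in 64 bits, so no overflow can
   occur inside the check. */
int product_overflows_int32(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return wide > INT32_MAX || wide < INT32_MIN;
}
```

    With both inputs equal to 65536 a single product is already 2^32, which does not fit in a 32-bit int, so one element of the dot product is enough to overflow.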

    Besides, integer multiplies are slow on a GPU (sometimes an order of magnitude slower than float): unless you have some specific fixed-point algorithm in mind just use floats.

    Which means you cannot use atomics for this, since OpenCL 1.x has no atomic add for floats: search for 'parallel sum'/'parallel prefix sum', 'parallel reduction', etc. This problem is widely studied and fairly easy to implement (once you grok a few basics).
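
    As a rough idea of what a parallel reduction does, here is a sequential C emulation of one common variant, the tree reduction within a work-group. It is only a sketch under assumptions: GROUP_SIZE, group_reduce and dot_by_reduction are hypothetical names, the scratch array stands in for a __local float buffer, and the inner loops stand in for work-items that would run in parallel on the device.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_SIZE 8  /* hypothetical work-group size; must be a power of two */

/* Emulates the tree reduction of one work-group. scratch plays the role
   of a __local float array. Each pass of the outer loop halves the number
   of active "work-items"; the inner loop stands in for them. */
float group_reduce(float scratch[GROUP_SIZE])
{
    assert((GROUP_SIZE & (GROUP_SIZE - 1)) == 0);
    for (size_t stride = GROUP_SIZE / 2; stride > 0; stride /= 2) {
        for (size_t lid = 0; lid < stride; ++lid) {
            scratch[lid] += scratch[lid + stride];
        }
        /* a real kernel would need barrier(CLK_LOCAL_MEM_FENCE) here */
    }
    return scratch[0];
}

/* Dot product: each "work-item" multiplies one pair of elements, each
   group reduces its products, and the per-group partial sums are added
   up afterwards (on the host, in this sketch). */
float dot_by_reduction(const float *a, const float *b, size_t n)
{
    float total = 0.0f;
    for (size_t base = 0; base < n; base += GROUP_SIZE) {
        float scratch[GROUP_SIZE];
        for (size_t lid = 0; lid < GROUP_SIZE; ++lid) {
            size_t gid = base + lid;
            /* out-of-range items contribute zero, like the bounds check
               on get_global_id(0) in the original kernel */
            scratch[lid] = (gid < n) ? a[gid] * b[gid] : 0.0f;
        }
        total += group_reduce(scratch);
    }
    return total;
}
```

    No atomics are involved: within each pass every work-item writes to its own slot, and the barrier between passes is what keeps the reads and writes ordered.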
