
Thread: dot product

  1. #1
    Junior Member
    Join Date
    Feb 2012
    Posts
    20

    dot product

    Hi,
    I would like to know whether anyone knows how to implement the dot product efficiently, so that it is no slower, or at least not much slower, than on a CPU. My idea is to implement the conjugate gradient method with a sparse matrix. The matrix-vector multiplication is faster on the GPU, but the whole method is slower, and I guess the reason is the dot product!!
    Thanks!!

    Pablo

  2. #2
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: dot product

    You probably want a parallel prefix sum, or some sort of parallel reduction step. It's a very common OpenCL operation, so you should be able to find plenty of papers and some code for it (ALL the SDKs will have examples of it).

    A search for 'parallel reduction' shows a lot of relevant stuff, or try 'parallel prefix sum'.
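
    To make the reduction idea concrete, here is a minimal sketch in plain Python rather than OpenCL, so it runs anywhere. It mimics what a single work-group would do: multiply element by element into a scratch array, then halve the active range on every pass. The function name tree_reduce_dot and the zero-padding to a power of two are assumptions made for this illustration, not code from any SDK.

```python
def tree_reduce_dot(x, y):
    """Dot product via pairwise (tree) reduction, mimicking one work-group."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")

    # Step 1: each "work-item" multiplies one pair of elements.
    scratch = [a * b for a, b in zip(x, y)]
    if not scratch:
        return 0.0

    # Pad with zeros up to a power of two so the halving loop stays simple.
    size = 1
    while size < len(scratch):
        size *= 2
    scratch += [0.0] * (size - len(scratch))

    # Step 2: halve the active range on every pass. On a GPU the inner
    # loop is one parallel step, followed by a barrier before the next pass.
    stride = size // 2
    while stride > 0:
        for i in range(stride):
            scratch[i] += scratch[i + stride]
        stride //= 2

    return scratch[0]
```

    The inner loop is the part that would run in parallel, so n products need roughly log2(n) passes instead of n sequential additions.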

  3. #3
    Junior Member
    Join Date
    Feb 2012
    Posts
    20

    Re: dot product

    Thanks!! I will search for it. The idea would be a kernel that multiplies position by position and then sums everything, am I right? The problem is that some time ago I tried doing just the first part, and that alone took longer than the whole dot product on the CPU. Maybe I made a mistake that time.
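
    One possible way to structure "multiply, then sum" is sketched below in plain Python, under the assumption that the multiply is fused into the first reduction stage, so the products are never written out as a separate array. Each simulated work-group produces one partial sum, and a small second stage adds the partials. The names partial_dots and dot_two_stage and the default group size of 256 are hypothetical choices for this sketch.

```python
def partial_dots(x, y, group_size):
    """Stage 1: one partial sum per simulated work-group, multiply fused in."""
    partials = []
    for start in range(0, len(x), group_size):
        acc = 0.0
        for i in range(start, min(start + group_size, len(x))):
            # The product is consumed immediately and never stored.
            acc += x[i] * y[i]
        partials.append(acc)
    return partials


def dot_two_stage(x, y, group_size=256):
    """Dot product as per-group partial sums plus a small final sum."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    # Stage 2: the list of partials is short, so a plain sum is enough here.
    return sum(partial_dots(x, y, group_size))
```

    Usage: dot_two_stage([1, 2, 3, 4, 5], [1, 1, 1, 1, 1], group_size=2) builds the partial sums [3.0, 7.0, 5.0] and then adds them.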
    Thanks again!!

    Pablo

  4. #4
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: dot product

    It's only worth it if the data is resident on the gpu and stays there within a loop.

    From your other posts it looks like you're moving data to/from the cpu a lot within your main loop: it will be pointless if you're doing this - and no matter what you do you'll be massively underutilising any discrete gpu.

  5. #5
    Junior Member
    Join Date
    Feb 2012
    Posts
    20

    Re: dot product

    I changed the program and I now do all the operations in kernels, even the scalar operations. One thing I'm not doing yet, but that a better version would need, is to monitor the values and end the loop once they vary by less than a tolerance (for now I'm just running the loop N times, so I don't need to send data between the GPU and the CPU except at the beginning and the end).

    Pablo

  6. #6
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: dot product

    Quote Originally Posted by mustang
    I changed the program and I now do all the operations in kernels, even the scalar operations. One thing I'm not doing yet, but that a better version would need, is to monitor the values and end the loop once they vary by less than a tolerance (for now I'm just running the loop N times, so I don't need to send data between the GPU and the CPU except at the beginning and the end).

    Pablo
    Ahaah, good. Yeah dynamic termination is tricky.

    I don't know the best answer, but some of the things I've tried:

    0) just hard-code it, that works for some problems ...
    a) reduce the termination state on the gpu into a small buffer, which the cpu can then read quickly.
    b) batch up a bunch of loops at a go, so this check isn't done too often.
    c) copy the small state buffer to another on the gpu, then read it synchronously on the cpu, but read it using a separate queue and wait for it on another thread, thus avoiding a synchronous device-stalling round-trip to check the state.
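
    As a rough illustration of options (a) and (b) combined, here is a conjugate gradient sketch in plain Python that only inspects the squared residual norm once per batch of iterations. The names cg_batched, csr_matvec, check_every and readbacks are hypothetical, the CSR layout is an assumption about how the sparse matrix is stored, and each check stands in for one small device-to-host read.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product for a matrix stored in CSR form."""
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cg_batched(values, col_idx, row_ptr, b,
               tol=1e-10, max_iters=1000, check_every=8):
    """Conjugate gradient that tests convergence once per batch of iterations."""
    x = [0.0] * len(b)
    r = list(b)          # residual for the initial guess x = 0
    p = list(b)          # first search direction
    rr = dot(r, r)       # squared residual norm: the small "state" value
    iters = 0
    readbacks = 0        # how often the host looked at the state
    while iters < max_iters:
        stalled = False
        for _ in range(min(check_every, max_iters - iters)):
            ap = csr_matvec(values, col_idx, row_ptr, p)
            pap = dot(p, ap)
            # Guard so this sketch never divides by zero.
            if rr == 0.0 or pap == 0.0:
                stalled = True
                break
            alpha = rr / pap
            x = [xi + alpha * pi for xi, pi in zip(x, p)]
            r = [ri - alpha * api for ri, api in zip(r, ap)]
            rr_new = dot(r, r)
            beta = rr_new / rr
            p = [ri + beta * pi for ri, pi in zip(r, p)]
            rr = rr_new
            iters += 1
        # One check per batch; on a GPU this would be the only readback.
        readbacks += 1
        if stalled or rr <= tol * tol:
            break
    return x, iters, readbacks
```

    The trade-off is that the loop can run up to check_every - 1 iterations past convergence in exchange for fewer round-trips, so the batch size is something to tune per problem.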

  7. #7
    Junior Member
    Join Date
    Feb 2012
    Posts
    20

    Re: dot product

    Thanks. Anyway, the plan for now is to make the method run faster on the GPU, and maybe add the dynamic termination after that!!

    Pablo
