Thread: can doing a loop calculation on a gpu be faster than a cpu ?

  1. #1
    Junior Member
    Join Date
    Apr 2010
    Posts
    14

    can doing a loop calculation on a gpu be faster than a cpu ?

    Hi, I'm working on my research about OpenCL programming.
    I'm still new to GPGPU and having a really hard time understanding how it works.
    Could anybody here help with my questions?

    1. Does having a lot of kernels slow down execution?
    2. Can doing a lot of loop calculations on a GPU be faster than on a CPU?

    Let's say I have this sample code:
    Code :
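    __kernel void calcu_h(__global float* sum_h, __global float* w_hi, __global int* unit_i, __global float* unit_h)
    {
        int i, h, p;

        for (p = 0; p < 26; p++) {
            for (h = 0; h < 100; h++) {
                /* weighted sum of the inputs for hidden unit h of pattern p;
                   note that i <= 100 reads 101 elements while the row stride is 100 */
                for (sum_h[(p*100)+h] = 0.0f, i = 0; i <= 100; i++)
                    sum_h[(p*100)+h] += w_hi[(h*100)+i] * (float)unit_i[(p*100)+i];
                /* sigmoid activation; float literals avoid double-precision math */
                unit_h[(p*100)+h] = 1.0f / (1.0f + (float)exp(-(sum_h[(p*100)+h])));
            }

            /* h == 100 here, so this writes the element just past pattern p's 100 hidden units */
            unit_h[(p*100)+h] = 1.0f;
        }
    }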
    What's the best way to break up these loops?

  2. #2
    Junior Member
    Join Date
    Sep 2010
    Posts
    3

    Re: can doing a loop calculation on a gpu be faster than a c

    1. Does having a lot of kernels slow down execution?
    Most of the time a single kernel call is enough to perform a complete task in parallel, but it depends on the kind of computation you are doing. Every kernel call has a cost: you need to set its arguments, enqueue it for execution and then read the results back from the device. So I personally don't think splitting the work over many kernel calls would be efficient. It is better to give your kernel some general data (for example a pointer to a chunk of memory) and do the access calculations inside the kernel with the help of functions like get_global_id() and get_local_id().
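
    As a small illustration of the "access calculation" idea: within one dimension, and ignoring any global offset, a work item's global index is its work-group index times the work-group size plus its local index. The sketch below only emulates that in plain C so it can be run without an OpenCL device. `emulated_global_id` and `scale_buffer` are made-up names for this example; in a real kernel you would simply call get_global_id(0).

```c
#include <assert.h>
#include <stddef.h>

/* Emulates, for one dimension and a zero global offset, the relation
   get_global_id(0) == get_group_id(0) * get_local_size(0) + get_local_id(0).
   Hypothetical helper, not part of any OpenCL API. */
size_t emulated_global_id(size_t group_id, size_t local_size, size_t local_id)
{
    return group_id * local_size + local_id;
}

/* Stand-in for one kernel launch over a whole buffer: every emulated work
   item derives its own element index and touches only that element.
   On a device the iterations of these two loops would run in parallel. */
void scale_buffer(float *data, size_t num_groups, size_t local_size, float factor)
{
    assert(data != NULL);
    for (size_t g = 0; g < num_groups; g++) {
        for (size_t l = 0; l < local_size; l++) {
            size_t gid = emulated_global_id(g, local_size, l);
            data[gid] *= factor;
        }
    }
}
```

    The point is that one launch covers the whole buffer: the kernel gets a single pointer, and each work item works out for itself which element is its own.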

    2. Can doing a lot of loop calculations on a GPU be faster than on a CPU?
    I don't think having many nested loops in your kernel is a good idea. Remember that the whole kernel is executed by every work item in parallel, so each loop in it is repeated by every work item. In your sample, every work item would run all 26 x 100 x 101 iterations and write the same results. Loops over independent outputs (here p and h) are better replaced by the work-item index, leaving only the loop that really has to be sequential (the sum over i). Try to eliminate unnecessary loops, write your algorithm with that in mind, and consider loop unrolling for what remains.

  3. #3
    Junior Member
    Join Date
    Apr 2010
    Posts
    14

    Re: can doing a loop calculation on a gpu be faster than a c

    Thanks a lot for your reply!
    It made things clearer!
