Thread: why use barrier?

  1. #1
    Junior Member
    Join Date
    Oct 2009
    Posts
    8

    why use barrier?

    Hi all, I'm looking at the nbody example that comes with Snow Leopard (specifically the one in QC 4.0), and I'm trying to understand why the barrier command is used in the kernel (code below).
    I understand that it synchronizes the work-items within a work-group, but what benefit does that give in this case?


    Code :
    __kernel void integrateNBodySystem(__global const float4 *oldPos, __global const float4 *oldVel,
                                       __global float4 *newPos, __global float4 *newVel, float deltaTime,
                                       const int numBodies, const float damping, const float softening,
                                       __local float4 *sharedPos)
    {
        int    index = get_global_id(0),
               tidx = get_local_id(0),
               blockDimX = get_local_size(0),
               tile = 0,
               i, j;
        float4 pos = oldPos[index],
               acc = make_float4(0.0f, 0.0f, 0.0f, 0.0f),
               vel;
        float  mass = pos.w,
               softeningSq = softening * softening;

        for (i = 0; i < numBodies; i += blockDimX, tile++) {
            // Each work-item copies one body of the current tile into local memory.
            sharedPos[tidx] = oldPos[tile * blockDimX + tidx];

            // Wait until the whole tile has been written before anyone reads it.
            barrier(CLK_LOCAL_MEM_FENCE);

            // Each work-item accumulates the interaction with every body in the tile.
            for (j = 0; j < blockDimX; ) {
                acc += bodyBodyInteraction(sharedPos[j++], pos, softeningSq);
    #if LOOP_UNROLL >= 1
                acc += bodyBodyInteraction(sharedPos[j++], pos, softeningSq);
    #endif
    #if LOOP_UNROLL >= 2
                acc += bodyBodyInteraction(sharedPos[j++], pos, softeningSq);
                acc += bodyBodyInteraction(sharedPos[j++], pos, softeningSq);
    #endif
    #if LOOP_UNROLL >= 4
                acc += bodyBodyInteraction(sharedPos[j++], pos, softeningSq);
                acc += bodyBodyInteraction(sharedPos[j++], pos, softeningSq);
                acc += bodyBodyInteraction(sharedPos[j++], pos, softeningSq);
                acc += bodyBodyInteraction(sharedPos[j++], pos, softeningSq);
    #endif
            }

            // Wait until everyone has finished reading before the tile is overwritten.
            barrier(CLK_LOCAL_MEM_FENCE);
        }

        //deltaTime *= 5.;

        vel = oldVel[index];

        vel += acc * deltaTime;
        vel *= damping;

        pos += vel * deltaTime;
        pos.w = mass;

        newPos[index] = pos;
        newVel[index] = vel;
    }

  2. #2

    Re: why use barrier?

    Each work-item loads one value into the array sharedPos (note that sharedPos is in local memory and is indexed by tidx, the local id), and the inner loop then reads every value in that array. The first barrier ensures that all work-items in the work-group have written their element before any of them reads the array. The second ensures that all work-items have finished reading the array before the next loop iteration overwrites it.
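
    To make the two phases concrete, here is a rough sketch in plain C that emulates one work-group serially. It is not the sample's code: NUM_BODIES, GROUP_SIZE, interact, run_group_tiled and run_naive are all made-up names, positions are reduced to a single float, and interact is only a toy stand-in for bodyBodyInteraction. Each inner loop over tidx plays the role of "all work-items running that statement", so finishing one loop before starting the next is what the barrier guarantees on a real device.

```c
#include <assert.h>

#define NUM_BODIES 8   /* hypothetical problem size */
#define GROUP_SIZE 4   /* hypothetical work-group size (blockDimX in the kernel) */

/* Toy stand-in for bodyBodyInteraction: just the 1-D separation. */
float interact(float other, float self)
{
    return other - self;
}

/* Serial emulation of ONE work-group; groupId selects the bodies it owns. */
void run_group_tiled(int groupId, const float *oldPos, float *acc)
{
    float sharedPos[GROUP_SIZE];       /* plays the role of __local memory */
    float priv[GROUP_SIZE] = {0.0f};   /* one private accumulator per work-item */

    assert(NUM_BODIES % GROUP_SIZE == 0);

    for (int tile = 0; tile < NUM_BODIES / GROUP_SIZE; tile++) {
        /* Phase 1: every work-item writes ONE element of the tile. */
        for (int tidx = 0; tidx < GROUP_SIZE; tidx++)
            sharedPos[tidx] = oldPos[tile * GROUP_SIZE + tidx];
        /* First barrier goes here: the tile is complete before anyone reads it. */

        /* Phase 2: every work-item reads the WHOLE tile. */
        for (int tidx = 0; tidx < GROUP_SIZE; tidx++) {
            float pos = oldPos[groupId * GROUP_SIZE + tidx];
            for (int j = 0; j < GROUP_SIZE; j++)
                priv[tidx] += interact(sharedPos[j], pos);
        }
        /* Second barrier goes here: nobody is still reading when the
           next iteration overwrites sharedPos. */
    }

    for (int tidx = 0; tidx < GROUP_SIZE; tidx++)
        acc[groupId * GROUP_SIZE + tidx] = priv[tidx];
}

/* Reference version without tiling: every body reads oldPos directly. */
void run_naive(const float *oldPos, float *acc)
{
    for (int i = 0; i < NUM_BODIES; i++) {
        acc[i] = 0.0f;
        for (int j = 0; j < NUM_BODIES; j++)
            acc[i] += interact(oldPos[j], oldPos[i]);
    }
}
```

    Both versions give the same result; the tiled one just fetches each position from global memory once per work-group instead of once per work-item. On a real device the work-items run concurrently, so without the barriers a fast work-item could read a slot that a slower one has not written yet, or overwrite a slot that a slower one is still reading.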

    Hope that helps.

  3. #3
    Junior Member
    Join Date
    Oct 2009
    Posts
    8

    Re: why use barrier?

    Hi, thanks for the reply. I think I mostly understand, but I'm not fully confident that I can use this technique to my advantage :S Apart from the OpenCL specification, can you recommend any reading or tutorials that demonstrate how to use local memory efficiently?
