In my kernel I'm processing NxN chunks of an image stored in a buffer. Each chunk produces a single result, so each work group handles one chunk, and each work item within the group updates a single __local variable atomically. To get the right value, I need to initialize that variable to zero first.

This is my approach:

Code :
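	// One accumulator per work group, shared by all of its work items.
	local uint result;
	// Every work item stores the same zero.
	result = 0;
	// Make the zero visible to the whole group before any atomic update.
	barrier(CLK_LOCAL_MEM_FENCE);

	// ... work ... (atomic_xyz(&result, ...))

	// Wait until every work item has finished its atomic updates.
	barrier(CLK_LOCAL_MEM_FENCE);
	// A single work item writes the group's result back to global memory.
	if (get_local_id(0) == 0 && get_local_id(1) == 0) {
		global_chunk_results[...] = result;
	}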
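
To make the pattern concrete, here is a minimal sketch of a complete kernel built around the same steps. It is an illustration under several assumptions, not the actual kernel: the work-group size equals the NxN chunk size, the global size covers the whole image exactly, the image is a row-major uchar buffer, and the atomic operation is atomic_add on a __local uint (built in from OpenCL 1.1 on). The kernel name chunk_sum, the arguments image and image_width, and the result index formula are hypothetical placeholders.

```c
// Hypothetical kernel: one work group per NxN chunk, one work item per pixel.
// All names except result and global_chunk_results are illustrative assumptions.
__kernel void chunk_sum(__global const uchar *image,
                        const uint image_width,
                        __global uint *global_chunk_results)
{
    // One accumulator shared by every work item in the group.
    __local uint result;

    // Same initialization as in the snippet above: every work item stores zero.
    result = 0;
    // All work items must see the zero before the first atomic update.
    barrier(CLK_LOCAL_MEM_FENCE);

    // Assumed "work": each work item adds its own pixel value.
    const size_t x = get_global_id(0);
    const size_t y = get_global_id(1);
    atomic_add(&result, (uint)image[y * image_width + x]);

    // All atomic updates must be finished before the result is read.
    barrier(CLK_LOCAL_MEM_FENCE);
    if (get_local_id(0) == 0 && get_local_id(1) == 0) {
        // Assumed layout: one result per work group, in row-major group order.
        const size_t chunk = get_group_id(1) * get_num_groups(0) + get_group_id(0);
        global_chunk_results[chunk] = result;
    }
}
```

With these assumptions the host would enqueue the kernel with a 2D global size equal to the image dimensions and a local size of NxN, so that each work group maps to exactly one chunk.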


My question is whether this is the most efficient way to initialize a single local variable that is used by all work items within a group.