I have written an OpenCL kernel that does not use work-groups, only individual work-items in a 2D range. The kernel performs two memory operations, a write and a read, and I need a function like barrier()/mem_fence() which ensures that every work-item sees the updated values after the synchronization point. In particular:
work-items writing to global memory;
work-items reading updated values from global memory.
I see that barrier() and mem_fence() only provide this synchronization within a work-group. Since I don't have work-groups, only single work-items, how can I achieve this synchronization? Or do I have to split the work into two kernels, the first writing to global memory and the second reading from that same memory?
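To make the two-kernel alternative concrete, here is a rough sketch of what I have in mind. The kernel names (write_pass, read_pass), the buffer layout, and the placeholder computations are all hypothetical and only for illustration; my real kernel does different work.

```c
// Sketch only: write_pass, read_pass, buf, out and width are hypothetical names.

// Pass 1: every work-item writes its own element of the global buffer.
__kernel void write_pass(__global float *buf, const int width)
{
    const int x = get_global_id(0);
    const int y = get_global_id(1);
    buf[y * width + x] = (float)(x + y);   // placeholder for the real write
}

// Pass 2: every work-item reads an element written by a different
// work-item in pass 1 (here, its right-hand neighbour, wrapping around).
__kernel void read_pass(__global const float *buf,
                        __global float *out,
                        const int width)
{
    const int x  = get_global_id(0);
    const int y  = get_global_id(1);
    const int nx = (x + 1) % width;
    out[y * width + x] = buf[y * width + nx];
}

// Host side (assumption: an in-order command queue):
//   clEnqueueNDRangeKernel(queue, write_kernel, 2, NULL, global_size, NULL, 0, NULL, NULL);
//   clEnqueueNDRangeKernel(queue, read_kernel,  2, NULL, global_size, NULL, 0, NULL, NULL);
```

As far as I understand, on an in-order command queue the second kernel does not start until the first has completed, so the boundary between the two kernels would act as the global synchronization point. What I would like to know is whether this split is really required or whether the same effect can be had inside a single kernel.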