Since we cannot use memcpy in OpenCL, I am wondering whether there
is a similar function that can be used to copy chunks of
data from __global to __private (or to __local) inside a kernel.
For example, say I wish to copy 10 elements from global memory to
__private memory (per thread). I would rather not write a loop like:
Code: for (int i=0; i<n_elements..... ...
How is this generally achieved in OpenCL?
The purpose is to get a list of data into each thread. I am making a raytracer
where I need to grab a list of surface data contained within each grid cell
(or tree node if I use that).
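
For reference, here is a minimal sketch of the two built-ins that come closest to memcpy for this. async_work_group_copy handles __global to __local, and the vloadn family handles __global to __private. The kernel name, the buffer names (surfaces, cell_start, out), the CELL_MAX bound, and the one-work-group-per-cell layout are all assumptions made up for illustration. They are not taken from any real raytracer, and the sketch assumes each cell's data is padded to at least CELL_MAX floats.

```c
// Hypothetical kernel: every name and the data layout here are illustrative assumptions.
#define CELL_MAX 64   // assumed number of floats stored (padded) per grid cell

__kernel void copy_sketch(__global const float *surfaces,   // packed surface data
                          __global const int   *cell_start, // first float index of each cell
                          __global float       *out)
{
    __local float cell[CELL_MAX];

    // Assumption: one work-group processes one grid cell.
    const int base = cell_start[get_group_id(0)];

    // __global -> __local: a cooperative block copy. Every work-item in the
    // group must reach this call with the same arguments.
    event_t ev = async_work_group_copy(cell, surfaces + base, CELL_MAX, 0);
    wait_group_events(1, &ev);   // the copy is complete for the whole group after this

    // __global -> __private: vloadn reads n consecutive elements into a vector.
    // vload8(0, p) reads p[0..7]; vload2(4, p) reads p[8..9], so 10 floats in total.
    float8 a = vload8(0, surfaces + base);
    float2 b = vload2(4, surfaces + base);

    // Touch the copied data so the sketch has an observable result.
    out[get_global_id(0)] = cell[get_local_id(0) % CELL_MAX] + a.s0 + b.s1;
}
```

For lengths that do not fit a vector width, or that are only known at run time, a plain loop into a __private array is still the usual fallback for the per-thread case. The compiler is free to unroll it when the count is a compile-time constant.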