I have a 2D OpenCL image. Using a kernel, I modify the values of the first and second channels (x and y); the third and fourth channels are ignored.
Now I have to transfer the data from the device to the host. I see two options:
- transfer the whole image into a cl_float4* buffer (and consider only the .x and .y values)
- transfer only part of the image into a cl_float2* buffer
My guess is that the first option should be faster (but I'm not sure), but it then uses twice as much memory on the host as we actually need.
Conversely, the second option should be slower, but it uses exactly the amount of memory we need.
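To make the memory trade-off concrete, here is a minimal sketch of the host side. It assumes the full image has already been read back into a four-channel buffer, and it uses plain structs as stand-ins for cl_float4 and cl_float2 so that it builds without the OpenCL headers. The names float4_t, float2_t, full_image_bytes, packed_image_bytes and pack_xy are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/* Stand-ins for cl_float4 / cl_float2 (assumption: same layout of
   4 and 2 packed floats), so the sketch builds without OpenCL headers. */
typedef struct { float x, y, z, w; } float4_t;
typedef struct { float x, y; } float2_t;

/* Host bytes needed to hold the whole image with four channels per pixel. */
size_t full_image_bytes(size_t width, size_t height) {
    return width * height * sizeof(float4_t);
}

/* Host bytes needed to hold only the two channels of interest. */
size_t packed_image_bytes(size_t width, size_t height) {
    return width * height * sizeof(float2_t);
}

/* Copy only .x and .y of each pixel from the four-channel buffer
   into a two-channel buffer; .z and .w are dropped. */
void pack_xy(const float4_t *full, float2_t *packed, size_t pixel_count) {
    for (size_t i = 0; i < pixel_count; ++i) {
        packed[i].x = full[i].x;
        packed[i].y = full[i].y;
    }
}
```

With these definitions, the four-channel buffer is exactly twice the size of the two-channel one, and pack_xy shows the extra host-side pass that would be needed to end up with only the x and y values after a full read.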
Any advice here?