What's the deal with clEnqueueWriteBufferRect?



vincentfpgarcia
01-29-2013, 06:08 AM
Hi,

I have a host buffer (float*) that holds an image of, say, 120x120 pixels.
On the device, I create a buffer that holds a 100x100 image.
I want to fill the device buffer with the central 100x100 region of the host image.
clEnqueueWriteBufferRect seems to be the perfect solution...

Let's have a look at the documentation of clEnqueueWriteBufferRect.



cl_int clEnqueueWriteBufferRect(
cl_command_queue command_queue,
cl_mem buffer,
cl_bool blocking_write,
const size_t buffer_origin[3],
const size_t host_origin[3],
const size_t region[3],
size_t buffer_row_pitch,
size_t buffer_slice_pitch,
size_t host_row_pitch,
size_t host_slice_pitch,
void *ptr,
cl_uint num_events_in_wait_list,
const cl_event *event_wait_list,
cl_event *event)


The parameters command_queue, buffer, blocking_write, buffer_row_pitch, buffer_slice_pitch, host_row_pitch, host_slice_pitch, ptr, num_events_in_wait_list, event_wait_list, and event need no comment. Since the device buffer will be filled entirely, we must have:



size_t buffer_origin[3] = {0, 0, 0};


Only two parameters remain: host_origin and region. Here is what the documentation says about them:

host_origin : The (x, y, z) offset in the memory region pointed to by ptr. For a 2D rectangle region, the z value given by host_origin[2] should be 0. The offset in bytes is computed as host_origin[2] * host_slice_pitch + host_origin[1] * host_row_pitch + host_origin[0].

region : The (width, height, depth) in bytes of the 2D or 3D rectangle being read or written. For a 2D rectangle copy, the depth value given by region[2] should be 1.

So, in my case I should use :



size_t input_offset[3] = {10, 10, 0};
size_t region[3] = {100*sizeof(float), 100*sizeof(float), sizeof(float)};


Of course, it doesn't work. Let's focus on the input_offset parameter (the one passed as host_origin).
The documentation says that "the offset in bytes is computed as host_origin[2] * host_slice_pitch + host_origin[1] * host_row_pitch + host_origin[0]". Since host_slice_pitch and host_row_pitch are given in bytes, host_origin[2] and host_origin[1] must be plain counts (slices and rows) and host_origin[0] must be in bytes, no? Otherwise the offset is wrong, or host_slice_pitch and host_row_pitch would have to be given in something other than bytes. Why aren't the parameters consistent?
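
To make the units concrete, here is a minimal sketch of that formula in plain C. rect_offset_bytes is a hypothetical helper, not an OpenCL function; it only restates the quoted formula, under the assumption that origin[0] is in bytes and the other two entries are counts.

```c
#include <stddef.h>

/* Hypothetical helper (not part of OpenCL): byte offset of a rectangle's
 * first element, following the formula quoted from the documentation.
 * Assumption: origin[0] is in bytes; origin[1] and origin[2] count rows
 * and slices. Both pitches are in bytes. */
static size_t rect_offset_bytes(const size_t origin[3],
                                size_t row_pitch, size_t slice_pitch)
{
    return origin[2] * slice_pitch + origin[1] * row_pitch + origin[0];
}
```

With a row pitch of 120*sizeof(float), the origin {10*sizeof(float), 10, 0} gives (10*120 + 10)*sizeof(float), which is exactly the byte offset of pixel (10, 10). The origin {10, 10, 0} gives 10 bytes into row 10, which falls in the middle of a float if floats are 4 bytes wide.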

Now, the region parameter. I agree that region[0] must be in bytes, so that we know how many bytes to copy per row.
However, region[1] and region[2] must be given as a number of rows and a number of slices; otherwise, how would we know how many lines to copy? In any case, if region[1] and region[2] are given in bytes, the program crashes. Again, why aren't the parameters consistent?
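
One plausible explanation for the crash, sketched in plain C: if region[1] and region[2] are read as rows and slices, a region given entirely in bytes spans far more memory than either buffer holds. rect_span_bytes is a hypothetical helper, not an OpenCL function, and I have not checked how any particular driver validates these arguments.

```c
#include <stddef.h>

/* Hypothetical helper (not part of OpenCL): number of bytes, counted from
 * the rectangle's origin, spanned by a region under the reading
 * (width in bytes, height in rows, depth in slices). Pitches are in bytes. */
static size_t rect_span_bytes(const size_t region[3],
                              size_t row_pitch, size_t slice_pitch)
{
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return 0; /* empty rectangle touches nothing */
    return (region[2] - 1) * slice_pitch
         + (region[1] - 1) * row_pitch
         + region[0];
}
```

For the 100x100 device buffer with tightly packed rows, the region {100*sizeof(float), 100, 1} spans exactly 100*100*sizeof(float) bytes, while {100*sizeof(float), 100*sizeof(float), sizeof(float)} spans several times the size of the buffer.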

Based on these remarks, I use:



size_t input_offset[3] = {10*sizeof(float), 10, 0};
size_t region[3] = {100*sizeof(float), 100, 1};


and it works perfectly.
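
For anyone who wants to check the semantics without a device, here is a minimal host-only sketch of the copy these values describe. emulate_write_rect is a hypothetical function written for this post, not part of OpenCL; it assumes origin[0] and region[0] are in bytes, the other entries are counts, and all pitches are non-zero byte values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side emulation of a 2D/3D rectangle write (plain C,
 * no OpenCL involved). Assumptions: origin[0] and region[0] are in bytes;
 * the remaining entries count rows and slices; pitches are in bytes and
 * must be non-zero (no "0 means tightly packed" shortcut here). */
static void emulate_write_rect(unsigned char *dst, const unsigned char *src,
                               const size_t dst_origin[3],
                               const size_t src_origin[3],
                               const size_t region[3],
                               size_t dst_row_pitch, size_t dst_slice_pitch,
                               size_t src_row_pitch, size_t src_slice_pitch)
{
    /* A row of the rectangle must fit inside one row of each buffer. */
    assert(dst_row_pitch >= region[0] && src_row_pitch >= region[0]);

    for (size_t z = 0; z < region[2]; ++z) {
        for (size_t y = 0; y < region[1]; ++y) {
            /* Byte offsets of the current row in each buffer. */
            size_t d = (dst_origin[2] + z) * dst_slice_pitch
                     + (dst_origin[1] + y) * dst_row_pitch
                     + dst_origin[0];
            size_t s = (src_origin[2] + z) * src_slice_pitch
                     + (src_origin[1] + y) * src_row_pitch
                     + src_origin[0];
            memcpy(dst + d, src + s, region[0]); /* copy one row of bytes */
        }
    }
}
```

Called with the origin and region above, a source row pitch of 120*sizeof(float) and a destination row pitch of 100*sizeof(float), this copies source pixel (10, 10) to destination pixel (0, 0) and source pixel (109, 109) to destination pixel (99, 99).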

So my question is: what am I doing wrong? And if I'm not doing anything wrong, don't you think the documentation is wrong?

Thanks :)

clint3112
01-30-2013, 12:56 AM
The first region definition is not correct, I think.

The doc tells you that for a 2D rectangle region[2] should be 1, but you passed sizeof(float), which is 4 on most systems.

vincentfpgarcia
01-30-2013, 02:33 AM
You're right, region[2] must be 1. However, this doesn't answer my question: even with region[2]=1, the problem remains.

utnapishtim
01-30-2013, 02:51 AM
The specification states that:

"region defines the (width in bytes, height in rows, depth in slices) of the 2D or 3D rectangle"

so the definition:

size_t region[3] = {100*sizeof(float), 100, 1};

makes perfect sense.

vincentfpgarcia
01-30-2013, 02:55 AM
Yep, that's it, you're right. My solution was correct after all. Maybe the online documentation should be updated. Thanks for your help.