Array initialization failed



pelangi15
06-12-2011, 11:47 PM
Dear experts,

I need some help and guidance with the following problem.

I have created the following kernel code to initialize an array of 76800 elements to the value INT_MAX.
const char init_arrays_cl[] = " \
__kernel void init_arrays \
( \
__global int* input_1 \
,int value \
,uint length \
) \
{ \
const uint index = get_global_id(0); \
\
if (index < length) { \
input_1[index] = value; \
} \
} \
";

However, only 19200 elements of the output array are set to the desired value, while the rest seem uninitialized. No error message is printed.

Here are parts of my main program:
1) worksize = 76800;
   mem1 = clCreateBuffer(context, CL_MEM_READ_WRITE, worksize, NULL, &error);
   if (error != CL_SUCCESS){
       printf("clCreateBuffer fails for mem1\n");
   }

2) cl_kernel k_cfg=clCreateKernel(prog, "init_arrays", &error);
   if (error != CL_SUCCESS){
       printf("clCreateKernel fails %d\n",error);
   }

3) error = clSetKernelArg(k_cfg, 0, sizeof(cl_mem), &mem1);
   if (error != CL_SUCCESS){
       printf("clSetKernelArg fails for k_cfg mem1: %d\n",error);
   }

4) g_worksize = 76800;
   error=clEnqueueNDRangeKernel(cq, k_cfg, 1, NULL, &g_worksize, NULL, 0, NULL, NULL);
   if (error != CL_SUCCESS){
       printf("clEnqueueNDRangeKernel fails\n");
   }

   error=clEnqueueReadBuffer(cq, mem1, CL_TRUE, 0, worksize, img_disp_left, 0, NULL, NULL);
   if (error != CL_SUCCESS){
       printf("clEnqueueReadBuffer fails for mem1 %d\n", error);
   }


By the way, 19200 is equal to 76800 divided by 4.
I suspect it has something to do with the int and char sizes...
Any help or advice is appreciated.

david.garcia
06-13-2011, 04:37 AM
error=clEnqueueReadBuffer(cq, mem1, CL_TRUE, 0, worksize, img_disp_left, 0, NULL, NULL);

You are only reading "worksize" bytes, which covers just worksize / sizeof(cl_int) = 19200 elements. What you want to do is this:


error=clEnqueueReadBuffer(cq, mem1, CL_TRUE, 0, worksize * sizeof(cl_int), img_disp_left, 0, NULL, NULL);
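To see why exactly a quarter of the array arrives, here is a small host-only sketch in plain C. It does not call OpenCL at all: memcpy stands in for clEnqueueReadBuffer, int32_t stands in for cl_int, and the helper name elements_received is hypothetical. It also assumes the "device" array itself holds all count elements, i.e. that the buffer was allocated with count * sizeof(cl_int) bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: simulates reading `bytes_read` bytes from a fully
 * initialized "device" array of `count` 32-bit ints and returns how many
 * host elements end up equal to INT32_MAX. */
size_t elements_received(size_t count, size_t bytes_read)
{
    int32_t *device = malloc(count * sizeof(int32_t));
    int32_t *host = malloc(count * sizeof(int32_t));
    size_t i, received = 0;

    assert(device != NULL && host != NULL);
    assert(bytes_read <= count * sizeof(int32_t));

    for (i = 0; i < count; ++i) {
        device[i] = INT32_MAX; /* what the kernel writes */
        host[i] = 0;           /* stand-in for uninitialized host memory */
    }

    /* Stand-in for clEnqueueReadBuffer: the size argument is in bytes. */
    memcpy(host, device, bytes_read);

    for (i = 0; i < count; ++i) {
        if (host[i] == INT32_MAX) {
            ++received;
        }
    }

    free(device);
    free(host);
    return received;
}
```

Calling elements_received(76800, 76800) models the original read and reports 19200 elements, while elements_received(76800, 76800 * sizeof(int32_t)) reports all 76800.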

pelangi15
06-13-2011, 06:38 PM
Hi David,

Thanks for coming to the rescue again.

So it means that I must think in terms of 1-byte (type char or unsigned char) transfers between the GPU and host, and adjust for the size of the data type I actually use. Is this correct?

For clarity's sake, in addition to your suggestion, I think I need to create the buffer with the same size, i.e.
mem1 = clCreateBuffer(context, CL_MEM_READ_WRITE, worksize * sizeof(cl_int), NULL, &error);

Finally, is it better to do this kind of array initialization on the GPU or the CPU? What are your thoughts on this?

david.garcia
06-13-2011, 07:14 PM
So it means that I must think in terms of 1-byte (type char or unsigned char) transfers between the GPU and host, and adjust for the size of the data type I actually use. Is this correct?


Yes, that's right. As far as I remember, all the buffer sizes and offsets in the OpenCL host API are given in bytes, not elements.



For clarity's sake, in addition to your suggestion, I think I need to create the buffer with the same size, i.e.
mem1 = clCreateBuffer(context, CL_MEM_READ_WRITE, worksize * sizeof(cl_int), NULL, &error);


Yes, absolutely! I missed that.
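One way to keep the element count and the byte size from getting mixed up is to compute the byte size in a single place. The following is only a sketch in plain C; buffer_bytes is a hypothetical helper, not part of OpenCL, and int32_t is used here as a stand-in for cl_int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes needed for `count` elements of
 * `elem_size` bytes each. Returns 0 if the product would overflow size_t. */
size_t buffer_bytes(size_t count, size_t elem_size)
{
    assert(elem_size > 0); /* an element must occupy at least one byte */

    if (count > SIZE_MAX / elem_size) {
        return 0; /* count * elem_size does not fit in size_t */
    }
    return count * elem_size;
}
```

The same value, for example buffer_bytes(worksize, sizeof(cl_int)), could then be passed as the size argument to both clCreateBuffer and clEnqueueReadBuffer so the two calls cannot disagree.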


Finally, is it better to do this kind of array initialization on the GPU or the CPU? What are your thoughts on this?

I would use whatever device is going to use that data later. What you want to avoid is forcing OpenCL to copy data between devices.