
View Full Version : Problem with width and reference of element in array



luizdrumond
01-31-2013, 03:55 PM
Hi,

In the code below:

__kernel void sivia( __global unsigned int *FUNCOES_POR_KERNEL )
{
    /* Private (per work-item) array: 1344 x TAM_LISTA intervals. */
    struct _intervalo lista[1344][TAM_LISTA];

    for( int i = 0; i < 1344; i++ )
    {
        lista[i][0].inferior = -4;
        lista[i][0].superior = 4;
        lista[i][1].inferior = -4;
        lista[i][1].superior = 4;

        /* get_global_id() returns size_t, so cast it to match %d. */
        printf("\nT = %d [%.3f][%.3f][%.3f][%.3f] - %d", (int)get_global_id(0),
               lista[i][0].inferior,
               lista[i][0].superior,
               lista[i][1].inferior,
               lista[i][1].superior,
               0);
    }
}

This code only works when the first dimension of my array lista is less than 134. If the first dimension is larger than that, the kernel is enqueued but does not run; the printf, for instance, prints nothing. The compile phase goes well, and everything on the host before the kernel launch works fine. But when clEnqueueNDRangeKernel is called, nothing happens and no error is reported. TAM_LISTA is 500.

But if I replace the index i in lista[i] with a constant, 1343 for instance, the code works fine.

I need this array to be at least 1344 x 500.

Can anyone help please?

Thanks,

Luiz.

clint3112
02-01-2013, 12:58 AM
I don't know how your data type is defined, but I think this will exhaust your memory. An array declared at that point goes into shared (local) or even private memory, and both are rather small. Check the spec to see where your variables go and how to split them. I think there is an error code for that, but I don't know exactly which one.

luizdrumond
02-01-2013, 11:43 AM
Hi,

First of all, no errors occur.

The data type is only two floats. I think that is very small and can be allocated in private memory. Am I wrong?


Thank you very much for your help.

luizdrumond
02-01-2013, 11:52 AM
I forgot to say that this example is running on the CPU.

Indeed, I have to run it on the CPU; the GPU version is separate code.

It is very strange that this error happens on the CPU.

Thanks again.

clint3112
02-04-2013, 03:31 AM
On the CPU, private and local memory are very big; on the GPU they are much smaller. On my GTX680, local memory is 48k for all SPUs, so it's 4k per SPU. Your array takes 1344 * TAM_LISTA * 2 * 4 bytes (about 10.5k * TAM_LISTA).
I don't think this will fit.
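
To put numbers on that estimate, here is a minimal sketch in plain C (host side, not a kernel). It assumes the interval type is exactly two 4-byte floats, as described earlier in the thread; the names intervalo, lista_bytes and report are made up for this illustration and are not from the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed layout: the thread only says the type holds two floats. */
struct intervalo {
    float inferior;
    float superior;
};

/* Bytes needed by one work-item's lista[rows][tam_lista]. */
size_t lista_bytes(size_t rows, size_t tam_lista)
{
    assert(rows > 0 && tam_lista > 0);
    return rows * tam_lista * sizeof(struct intervalo);
}

/* Print the footprint next to a memory limit (both in bytes). */
void report(size_t rows, size_t tam_lista, size_t limit_bytes)
{
    size_t need = lista_bytes(rows, tam_lista);
    printf("%zu x %zu -> %zu bytes (%s limit of %zu bytes)\n",
           rows, tam_lista, need,
           need <= limit_bytes ? "within" : "exceeds", limit_bytes);
}
```

With TAM_LISTA = 500, lista_bytes(1344, 500) gives 5,376,000 bytes (about 5.1 MiB) for every work-item, and calling report(1344, 500, 49152) shows that this is far above the 48k figure mentioned above. For comparison, lista_bytes(134, 500) is 536,000 bytes (about 523 KiB).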

luizdrumond
02-04-2013, 10:14 AM
Does this not fit on the CPU?

I have read in some books that OpenCL maps private memory on the CPU to the L1 cache and registers. But in some tests I made, I allocated more than the maximum L1 size on the CPU.

I don't know what to think about that now. Hehehe


Thanks,

clint3112
02-05-2013, 02:11 AM
If you have VS and CUDA plus the Intel CL SDK, you can have a look at the Nsight system window for OpenCL. It tells me the following:


The first column is an i7 (12 cores), the second one is a GTX670.

                               i7 (12 cores)   GTX670
CL_DEVICE_LOCAL_MEM_SIZE       32768           49152
CL_DEVICE_LOCAL_MEM_TYPE       Global          Local
CL_DEVICE_MAX_CLOCK_FREQUENCY  3470            1564
CL_DEVICE_MAX_COMPUTE_UNITS    12              16