clFinish is returning CL_INVALID_COMMAND_QUEUE



barraqueiro
10-07-2012, 09:30 AM
Hello everyone,

For the past 3 months I've been trying to parallelize a code using the graphics card, but so far I've been unsuccessful. Unfortunately my code is too large to post here.

So this is the problem: when I ran the full code, "clFinish" did not return any error, but when I read the variables back from the GPU to the CPU, I obtained zeros everywhere (as if nothing had happened).

I then set out to investigate why that happened by removing portions of the code. I removed pretty much everything, leaving only initialization and a few small algebraic operations. In this case, everything worked as planned: clFinish did not return any error and the variables that I read back from the GPU had the expected values.

Since everything seemed to be working fine, I then started to restore the portions of the code that I had removed, starting with a loop that contains an inner loop. Only simple algebraic operations are done inside the loops (there are no memory access violations: I've tested the code on the CPU). Now, after adding these loops, clFinish returns CL_INVALID_COMMAND_QUEUE, and that is what I cannot understand!

Does anyone have an idea of what may be causing clFinish to return that error? I am confident that there's no mistake in the loop that I've restored, since I tested it on the CPU.

I can send the code to anyone who may be interested.

Thank you,
João

chippies
10-08-2012, 08:19 AM
It could be that the kernel failed to launch for some reason, e.g. insufficient resources on the device. That can happen if your work-group dimensions specify more work-items (threads) than can fit on the GPU due to limited registers or local (shared) memory.
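
A rough sketch of that kind of pre-launch check, in plain C so it compiles without an OpenCL SDK. The struct, enum and function names here are made up for illustration. In real host code the two limits would come from clGetKernelWorkGroupInfo (CL_KERNEL_WORK_GROUP_SIZE) and clGetDeviceInfo (CL_DEVICE_MAX_WORK_ITEM_SIZES), and the divisibility rule assumes an OpenCL 1.x runtime.

```c
#include <stddef.h>

/* Hypothetical container for limits the host would normally query:
 *   max_kernel_wg_size <- clGetKernelWorkGroupInfo(CL_KERNEL_WORK_GROUP_SIZE)
 *   max_item_sizes[]   <- clGetDeviceInfo(CL_DEVICE_MAX_WORK_ITEM_SIZES)
 */
typedef struct {
    size_t max_kernel_wg_size;
    size_t max_item_sizes[3];
} launch_limits;

typedef enum {
    LAUNCH_OK = 0,
    LAUNCH_BAD_DIM,              /* work_dim must be 1, 2 or 3 */
    LAUNCH_ITEM_SIZE_TOO_LARGE,  /* one local dimension exceeds the device limit */
    LAUNCH_GROUP_TOO_LARGE,      /* product of local sizes exceeds the kernel limit */
    LAUNCH_GLOBAL_NOT_MULTIPLE   /* global size not divisible by local size */
} launch_check;

/* Validate a proposed NDRange against the limits before enqueueing it. */
launch_check check_launch(const launch_limits *lim, unsigned dims,
                          const size_t *global, const size_t *local)
{
    size_t total = 1;
    unsigned d;

    if (dims < 1 || dims > 3)
        return LAUNCH_BAD_DIM;

    for (d = 0; d < dims; ++d) {
        if (local[d] == 0 || local[d] > lim->max_item_sizes[d])
            return LAUNCH_ITEM_SIZE_TOO_LARGE;
        if (global[d] % local[d] != 0)
            return LAUNCH_GLOBAL_NOT_MULTIPLE;
        total *= local[d];   /* work-items per work-group so far */
    }

    if (total > lim->max_kernel_wg_size)
        return LAUNCH_GROUP_TOO_LARGE;

    return LAUNCH_OK;
}
```

With a kernel limit of 256, for example, a 16x16 local size passes while 32x16 is rejected. Note that the kernel limit can be lower than the device maximum when the kernel uses many registers or a lot of local memory, which is why it is worth querying per kernel.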

notzed
10-08-2012, 06:58 PM
Assuming the queue was properly set up, then it's probably still just a crash in the code. It could be in any kernel enqueued before clFinish() (or another synchronising command) is called, and not necessarily the last one.
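
One way to narrow down which kernel is at fault is to check the return code of every call and, while debugging, put a clFinish() after each kernel. Below is a small sketch of a helper for that. The function names are invented for this example, and the numeric codes are copied from the Khronos cl.h header so that the snippet compiles without an OpenCL SDK; real host code would include CL/cl.h and use the CL_* macros directly.

```c
#include <stdio.h>

/* Map a few common OpenCL status codes to their names.
 * Values are those defined in the Khronos CL/cl.h header. */
const char *cl_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -36: return "CL_INVALID_COMMAND_QUEUE";
    case -52: return "CL_INVALID_KERNEL_ARGS";
    case -53: return "CL_INVALID_WORK_DIMENSION";
    case -54: return "CL_INVALID_WORK_GROUP_SIZE";
    case -55: return "CL_INVALID_WORK_ITEM_SIZE";
    default:  return "unlisted OpenCL error";
    }
}

/* Return 1 if the step succeeded; otherwise report which step failed
 * on stderr and return 0. */
int cl_step_ok(int err, const char *step)
{
    if (err == 0)
        return 1;
    fprintf(stderr, "%s failed: %s (%d)\n", step, cl_error_name(err), err);
    return 0;
}

/* Intended use while debugging (queue and kernel_a are hypothetical names,
 * remaining arguments omitted here):
 *
 *   err = clEnqueueNDRangeKernel(queue, kernel_a, ...);
 *   if (!cl_step_ok(err, "enqueue kernel_a")) return 1;
 *   err = clFinish(queue);
 *   if (!cl_step_ok(err, "finish after kernel_a")) return 1;
 */
```

The first step that reports a failure points at the kernel to look at. The extra clFinish() calls serialise everything, so they should come out again once the problem is found.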

Checking on the CPU, whilst it helps, does not guarantee the code is fine on a GPU, as the execution environment is so different.

If you've spent that long and haven't got it working I suggest trying to break out small parts at a time and get them working - so you're getting familiar with working code, not just bashing your head against broken stuff.