Async kernel execution and data copy



shiftreduce
11-10-2009, 05:48 AM
Hi all,

Is it possible to copy data to the GPU while a kernel is executing? Since there is no concurrent kernel execution so far, this would be better than nothing...

dbs2
11-10-2009, 06:02 AM
If you have an out-of-order command queue, then OpenCL is free to do this if the hardware and runtime support it. In-order queues could also legally do this if they can prove that the memory model remains consistent. However, this is an optimization that is entirely up to the vendor, so there is no specific way to request it, short of specifying an out-of-order command queue.

shiftreduce
11-10-2009, 07:18 AM
I have CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE as one of the queue properties, so I think I'm good.

So what you're saying is that there's no way of controlling it and it depends on the implementation? I'm confused, because in CUDA this can be done, but you need to define a couple of streams and, as far as I know, it only works with mapped memory...

dbs2
11-10-2009, 10:47 AM
That's correct. The OpenCL model is inherently asynchronous (hence the clEnqueue... commands), so if the implementation is well optimized and the hardware supports it, you should get that automatically.

sacsp
11-12-2009, 02:20 AM
How can I check whether the hardware supports it? Sorry, just curious, and I'm such a newbie...

dbs2
11-12-2009, 11:32 AM
You'd have to ask the vendors what the particular device supports. I know many GPUs have at least some DMA capability, but whether that is used by OpenCL is a completely different issue. Given how new OpenCL is, I doubt this optimization has been done yet.
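Besides asking the vendor, one empirical approach is to measure it. If the queue is created with `CL_QUEUE_PROFILING_ENABLE`, `clGetEventProfilingInfo` can report `CL_PROFILING_COMMAND_START` and `CL_PROFILING_COMMAND_END` for the copy event and the kernel event, in nanoseconds of device time. The sketch below shows only the comparison step, assuming those four timestamps have already been read. The `cmd_span` type and `overlap_ns` function are hypothetical names for illustration, not part of OpenCL.

```c
#include <stdint.h>

/* Hypothetical record: device start/end timestamps of one command,
 * in nanoseconds, as read from the event profiling info. */
typedef struct {
    uint64_t start_ns;
    uint64_t end_ns;
} cmd_span;

/* Returns how many nanoseconds the two commands ran at the same time,
 * or 0 if one finished before the other started. */
uint64_t overlap_ns(cmd_span a, cmd_span b)
{
    uint64_t latest_start = a.start_ns > b.start_ns ? a.start_ns : b.start_ns;
    uint64_t earliest_end = a.end_ns < b.end_ns ? a.end_ns : b.end_ns;
    return earliest_end > latest_start ? earliest_end - latest_start : 0;
}
```

A nonzero result for a copy and a kernel that have no dependency between them would suggest that the implementation overlapped them on that run. A zero result does not prove the hardware is incapable, only that no overlap happened in that measurement.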