Looking at clCreateCommandQueue from the OpenCL specification,
it seems clear that one can create multiple command queues for the same device.
Why does this possibility exist? Is it meant to improve performance in specific cases?
Is there any case where multiple command queues (bound to the same device)
can perform better than a single command queue? For example, one queue per compute unit?