Khronos Public Bugzilla
Bug 504 - CL_OUT_OF_RESOURCES vs. CL_OUT_OF_HOST_MEMORY
Status: NEW
Product: OpenCL
Classification: Unclassified
Component: Specification
Version: 1.1
Platform: All All
Importance: P3 enhancement
Target Milestone: ---
Assigned To: Aaftab Munshi
QA Contact: OpenCL Working Group
Reported: 2011-07-31 23:58 PDT by Sean Settle
Modified: 2011-08-10 07:37 PDT (History)
Description Sean Settle 2011-07-31 23:58:29 PDT
The descriptions of CL_OUT_OF_RESOURCES and CL_OUT_OF_HOST_MEMORY are word-for-word identical except for the words "host" and "device". It would seem more fitting to deprecate CL_OUT_OF_RESOURCES and replace it with CL_OUT_OF_DEVICE_MEMORY, which is also more informative.
Comment 1 Robert Quill 2011-08-01 02:19:24 PDT
Hi Sean,

I don't think that CL_OUT_OF_RESOURCES always means being out of device memory. Consider the description of clEnqueueNDRangeKernel which states (from the OpenCL 1.1 spec) that it may return:

"CL_OUT_OF_RESOURCES if there is a failure to queue the execution instance of kernel on the command-queue because of insufficient resources needed to execute the kernel. For example, the explicitly specified local_work_size causes a failure to execute the kernel because of insufficient resources such as registers or local memory."

Rob
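
The register/local-memory case quoted above is one an application can sometimes react to by retrying with a smaller work-group. A minimal sketch of that idea, assuming a caller-supplied callback that stands in for the real clEnqueueNDRangeKernel call (launch_fn, launch_with_fallback and fake_launch are hypothetical names used only for illustration; the error values are the ones in the OpenCL 1.1 cl.h):

```c
#include <stddef.h>

/* Error values as defined in the OpenCL 1.1 cl.h header; guarded so the
 * sketch also compiles when the real header is already included. */
#ifndef CL_SUCCESS
#define CL_SUCCESS           0
#endif
#ifndef CL_OUT_OF_RESOURCES
#define CL_OUT_OF_RESOURCES  (-5)
#endif

/* Hypothetical callback standing in for a clEnqueueNDRangeKernel call made
 * with the given local work size; it returns an OpenCL error code. */
typedef int (*launch_fn)(size_t local_size, void *user);

/* Halve the local work size for as long as the launch reports
 * CL_OUT_OF_RESOURCES. On success, store the size that worked in *used.
 * Any other error code is returned unchanged on the first occurrence. */
int launch_with_fallback(launch_fn launch, void *user,
                         size_t local_size, size_t *used)
{
    int err = CL_OUT_OF_RESOURCES;
    while (local_size >= 1) {
        err = launch(local_size, user);
        if (err != CL_OUT_OF_RESOURCES) {
            if (err == CL_SUCCESS && used != NULL)
                *used = local_size;
            return err;
        }
        local_size /= 2;
    }
    return err; /* still CL_OUT_OF_RESOURCES: no size was accepted */
}

/* Stand-in "device" for illustration only: it accepts work-groups of at
 * most *limit work-items and rejects larger ones. */
int fake_launch(size_t local_size, void *user)
{
    size_t limit = *(size_t *)user;
    return local_size <= limit ? CL_SUCCESS : CL_OUT_OF_RESOURCES;
}
```

In a real program the callback would also have to keep the global work size evenly divisible by the local work size, which OpenCL 1.1 requires; the sketch leaves that to the caller.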
Comment 2 Sean Settle 2011-08-01 04:34:05 PDT
Hi Rob,

As you pointed out with the description of clEnqueueNDRangeKernel in the OpenCL 1.1 spec, CL_OUT_OF_RESOURCES at present doesn't always mean being out of device memory. However, I would argue that the first half of the description, which refers to registers and local memory, is redundant because it already falls under the device memory description. The second part of the description, on the other hand:

"Another example would be the number of read-only image args used in kernel exceed the CL_DEVICE_MAX_READ_IMAGE_ARGS value for device or the number of write-only image args used in kernel exceed the CL_DEVICE_MAX_WRITE_IMAGE_ARGS value for device or the number of samplers used in kernel exceed CL_DEVICE_MAX_SAMPLERS for device,"

seems like it should be associated with the current description of CL_INVALID_KERNEL_ARGS:

"CL_INVALID_KERNEL_ARGS if the kernel argument values have not been specified."

I propose splitting CL_OUT_OF_RESOURCES into the argument part (under CL_INVALID_KERNEL_ARGS) and the memory part (under CL_OUT_OF_DEVICE_MEMORY):

"CL_INVALID_KERNEL_ARGS if the kernel argument values have not been specified, or if the number of kernel arguments with a specified attribute exceeds the maximum number of kernel arguments with that specified attribute.  Some examples would be the number of read-only image args used in kernel exceed the CL_DEVICE_MAX_READ_IMAGE_ARGS value for device or the number of write-only image args used in kernel exceed the CL_DEVICE_MAX_WRITE_IMAGE_ARGS value for device or the number of samplers used in kernel exceed CL_DEVICE_MAX_SAMPLERS for device."

"CL_OUT_OF_DEVICE_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the device."

In my opinion these revisions would be more intuitive and helpful while debugging.

Cheers,
Sean
Comment 3 Ofer Rosenberg 2011-08-10 07:12:16 PDT
(In reply to comment #2)

Hi Sean,

In my opinion, your proposed change gives the error type the wrong semantics.
Putting the second part of the description under “CL_INVALID_KERNEL_ARGS” is wrong: the kernel arguments are not invalid, they just contain values that the specific device the queue is defined on can’t execute due to lack of resources (so it should stay under CL_OUT_OF_RESOURCES).

I agree with Rob that “out of resources” for a device covers much more than memory. On GPU devices, for example, many resources besides memory are limited: the number of programs/kernels can be bounded by hardware that holds pointers to the code sections.

We need two separate error codes so that the implementation can distinguish between issues related to the Host side of OpenCL (the OpenCL runtime) and issues related to the Device side. Based on this distinction, the application may take steps to resolve the error (for example, free unused memory objects or unused programs).

If anything, the error code that we use for the Host is misleading and should have been “CL_OUT_OF_HOST_RESOURCES”, since even on the Host we can run out of resources other than memory (for example, the number of concurrent threads in the OS can be limited). However, we have a legacy that we’re dragging along from 1.x, and we can’t change names now…

Regards,
Ofer.
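
The host/device distinction described above can be sketched from the application's side as a mapping from error code to recovery step. This is only an illustration: the recovery_action enum and choose_recovery are hypothetical names, and what an application actually frees is its own decision; only the two error values come from the OpenCL 1.1 cl.h.

```c
/* Error values as defined in the OpenCL 1.1 cl.h header; guarded so the
 * sketch also compiles when the real header is already included. */
#ifndef CL_OUT_OF_RESOURCES
#define CL_OUT_OF_RESOURCES    (-5)
#endif
#ifndef CL_OUT_OF_HOST_MEMORY
#define CL_OUT_OF_HOST_MEMORY  (-6)
#endif

/* Hypothetical recovery steps an application might choose between. */
typedef enum {
    RECOVERY_NONE,                    /* not a resource error           */
    RECOVERY_RELEASE_DEVICE_OBJECTS,  /* e.g. unused buffers, programs  */
    RECOVERY_RELEASE_HOST_ALLOCATIONS /* e.g. host-side caches          */
} recovery_action;

/* Pick a recovery step depending on which side ran out of resources. */
recovery_action choose_recovery(int err)
{
    switch (err) {
    case CL_OUT_OF_RESOURCES:
        return RECOVERY_RELEASE_DEVICE_OBJECTS;
    case CL_OUT_OF_HOST_MEMORY:
        return RECOVERY_RELEASE_HOST_ALLOCATIONS;
    default:
        return RECOVERY_NONE;
    }
}
```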
Comment 4 Sean Settle 2011-08-10 07:37:51 PDT
Hi Ofer,

I now understand why it's "RESOURCES" and not just "MEMORY", and you've convinced me that "CL_OUT_OF_HOST_RESOURCES" and "CL_OUT_OF_DEVICE_RESOURCES" would be more fitting. I also see the legacy issue. However, since it's just a matter of adding some #defines, couldn't these be included while keeping the legacy error codes "CL_OUT_OF_RESOURCES" and "CL_OUT_OF_HOST_MEMORY" defined in terms of the more descriptive error codes?

Cheers,
Sean
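
A minimal sketch of what such aliasing could look like, assuming the two descriptive names take over the numeric values that OpenCL 1.1 assigns to the legacy codes. CL_OUT_OF_DEVICE_RESOURCES and CL_OUT_OF_HOST_RESOURCES are the names proposed in this thread, not part of any released cl.h, and resource_error_name is a hypothetical helper:

```c
/* Proposed (hypothetical) descriptive names. The numeric values are the ones
 * OpenCL 1.1 assigns to CL_OUT_OF_RESOURCES and CL_OUT_OF_HOST_MEMORY. */
#define CL_OUT_OF_DEVICE_RESOURCES  (-5)
#define CL_OUT_OF_HOST_RESOURCES    (-6)

/* Legacy names kept as aliases of the descriptive ones, so existing code
 * still compiles. Guarded in case the real cl.h was already included. */
#ifndef CL_OUT_OF_RESOURCES
#define CL_OUT_OF_RESOURCES    CL_OUT_OF_DEVICE_RESOURCES
#endif
#ifndef CL_OUT_OF_HOST_MEMORY
#define CL_OUT_OF_HOST_MEMORY  CL_OUT_OF_HOST_RESOURCES
#endif

/* Hypothetical debugging helper: report the descriptive name for either
 * spelling of the code, or "other" for anything else. */
const char *resource_error_name(int err)
{
    switch (err) {
    case CL_OUT_OF_DEVICE_RESOURCES:
        return "CL_OUT_OF_DEVICE_RESOURCES";
    case CL_OUT_OF_HOST_RESOURCES:
        return "CL_OUT_OF_HOST_RESOURCES";
    default:
        return "other";
    }
}
```

Because the aliases expand to the same integers, code that compares against the legacy names keeps working unchanged.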