Thread: Suggestions for next release of OpenCL

  1. #1
    Newbie
    Join Date
    Sep 2013
    Posts
    1

    Spec confusion regarding convert_ functions

    Refer to the OpenCL 1.2 specification.

    Section 6.2.3
    Explicit conversions may be performed using the
    convert_destType(sourceType)
    suite of functions. These provide a full set of type conversions between supported types (see
    sections 6.1.1, 6.1.2 and 6.1.3) except for the following types: bool, half, size_t,
    ptrdiff_t, intptr_t, uintptr_t, and void

    Section 6.2.3.1
    Conversions are available for the following scalar types: char, uchar, short, ushort,
    int, uint, long, ulong, float, and built-in vector types derived therefrom.

    There are data types such as
    double,
    image2d_t
    image3d_t
    image2d_array_t
    image1d_t
    image1d_buffer_t
    image1d_array_t
    sampler_t
    event_t
    which are covered by section 6.2.3 but are not listed in 6.2.3.1. What behaviour is expected for these data types?

  2. #2
    Senior Member
    Join Date
    Oct 2012
    Posts
    119
    double is handled like float.

    The other data types (image, sampler, event) are neither scalar types nor vector types, so section 6.2.3.1 does not apply to them.
    They can't be cast to another type. They are simply considered opaque types.

  3. #3
    Newbie
    Join Date
    Jan 2014
    Posts
    1

    Accept (int) comparison for select() for all scalar types.

    The relational select() function is very handy for vectorization and mimics the ternary ?: operator. However, the types it supports for double (and half) in scalar and vector modes are inconsistent with the return types of the relational comparison functions such as isgreater().

    Scalar prototypes:
    Code :
    int isgreater (double a, double b);
    double select (double a, double b, long cmp);

    Vector prototypes:
    Code :
    longn isgreater (doublen a, doublen b);
    doublen select (doublen a, doublen b, longn cmp);

    The scalar isgreater() (and similar) functions match the C99 math.h prototypes and return int for all data types. But select() only accepts long for double (and short for half). This requires an explicit cast in most (all?) implementations and causes headaches when building type-independent code. That is, we can't cleanly write

    Code :
    T result = select (a, b, isgreater(a,b));

    and expect it to work with both double and doublen. The same issue occurs with half. I have to wrap the call in an #if block to distinguish scalar and vector types.

    Code :
    #if (__VectorSize == 1)
       // ()?: version
       // double result = (isgreater(a,b)) ? b : a;
       double result = select (a, b, (long) isgreater(a,b));
    #else
       double2 result = select (a, b, isgreater(a,b));
    #endif

    I propose that select() accept the datatype output of the relational functions in both scalar and vector modes. That is, accept (int) for all datatypes in scalar modes and accept the equivalent bit-masks in vector mode.
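
    To make the mismatch concrete, here is a minimal plain-C sketch of the semantics involved. It is not OpenCL C and does not model overload resolution: the names select_scalar_model, select_lane_model, isgreater_lane_model and check_select_models are hypothetical, and int64_t stands in for OpenCL's 64-bit long. It assumes the spec's rules that scalar select() returns c ? b : a, that each vector lane picks b when the most significant bit of the mask is set, and that vector relational functions return -1 for true.

```c
#include <assert.h>
#include <math.h>    /* C99 isgreater(): nonzero int when a > b */
#include <stdint.h>

/* Hypothetical model of scalar select(a, b, c): result = c ? b : a. */
double select_scalar_model(double a, double b, int64_t c) {
    return c ? b : a;
}

/* Hypothetical model of one lane of vector select():
   b is chosen when the most significant bit of the mask is set. */
double select_lane_model(double a, double b, int64_t mask) {
    return (mask < 0) ? b : a;  /* MSB set <=> negative in two's complement */
}

/* Hypothetical model of one lane of vector isgreater():
   -1 (all bits set) when a > b, otherwise 0. */
int64_t isgreater_lane_model(double a, double b) {
    return isgreater(a, b) ? (int64_t)-1 : (int64_t)0;
}

/* Exercises both forms; returns 0 when every check passes. */
int check_select_models(void) {
    double a = 3.0, b = 1.0;

    /* Scalar form: the int result of isgreater() is enough to pick b. */
    assert(select_scalar_model(a, b, isgreater(a, b)) == b);

    /* Vector form: the all-bits-set mask picks b as well. */
    assert(select_lane_model(a, b, isgreater_lane_model(a, b)) == b);

    /* A scalar-style "true" of 1 has a clear MSB, so the lane rule picks a. */
    assert(select_lane_model(a, b, 1) == a);
    return 0;
}
```

    In this model the scalar and vector forms already agree on the result; the only obstacle in real OpenCL C is the declared type of the third argument, which is what the proposal would relax.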

  4. #4
    Junior Member
    Join Date
    Mar 2014
    Posts
    10

    Would like to target a system with an Intel CPU and an AMD GPU

    Hello List,
    I would like to be able to load-balance my algorithm across both an Intel CPU and an AMD GPU
    at the same time.

    The Intel SDK supports Intel hardware, and the AMD SDK supports AMD hardware.

    How can I develop a solution that targets both platforms concurrently?

  5. #5
    Junior Member
    Join Date
    Sep 2013
    Posts
    16

    SPIR version number

    Make the SPIR version number the same as that of the OpenCL version it belongs to.
    This would reduce potential confusion.

  6. #6
    Newbie
    Join Date
    Nov 2014
    Posts
    2

    Asynchronous memory release

    Releasing temporary buffers in the middle of a chain of kernels executing asynchronously is currently cumbersome. It requires either a synchronization with the device to guarantee that all pending operations using the buffer have finished, or a clumsy event callback on a marker with wait list (or even worse through a native kernel if the device supports it).
    The drawback of the first approach is that it introduces needless synchronization just to release memory. The disadvantage of the second, besides the horrible syntax, is that there is no guarantee as to when the callback will be invoked.

    I think it would be useful to have a function such as clEnqueueReleaseMemObject, which can be pushed onto a queue with the traditional wait list and attached event. It would do exactly the same as clReleaseMemObject with the added advantage that it can be woven into a complex task graph to release the memory as soon as it is not needed.

    Proposed function:

    Code :
    cl_int clEnqueueReleaseMemObject ( cl_command_queue command_queue,
                                       cl_mem memobj,
                                       cl_uint num_events_in_wait_list,
                                       const cl_event *event_wait_list,
                                       cl_event *event );

    Has this already been discussed?

  7. #7
    Junior Member
    Join Date
    Jul 2011
    Location
    Bristol, UK
    Posts
    23
    It's not clear to me what the problem is. There is no requirement that all pending operations using a buffer complete before you can release it - the buffer will only be destroyed when the reference count is 0 and all commands that use it have completed.

    Can you give an example of the sequence of operations that you are trying to perform, and where you would like to release the buffers?

  8. #8
    Newbie
    Join Date
    Nov 2014
    Posts
    2
    Originally posted by jprice:
    and all commands that use it have completed.
    Right, my bad, I missed that part. I based my assumption on the note for clSetKernelArg (section 5.7.2):
    A kernel object does not update the reference count for objects such as memory, sampler objects specified as argument values by clSetKernelArg, Users may not rely on a kernel object to retain objects specified as argument values to the kernel.
    and on the definition of reference counting from the spec (section 2):
    After the reference count reaches zero, the object's resources are deallocated by OpenCL.
    So I thought that temporary buffers could only be safely released after synchronization. But the documentation of clReleaseMemObject does say that the object stays alive even with a reference count of zero, as long as it is used by a command in the command queue:
    After the memobj reference count becomes zero and commands queued for execution on a command-queue(s) that use memobj have finished, the memory object is deleted.
    Thanks for pointing that out.
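
    For anyone else who trips over this, the rule quoted from clReleaseMemObject can be sketched as a small plain-C model. This is only an illustration of the stated behaviour, not how any implementation works internally: mem_model and every function below are hypothetical names, and none of them are OpenCL API calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memory object; not an OpenCL type. */
typedef struct {
    int  ref_count;         /* user-visible reference count */
    int  pending_commands;  /* enqueued commands that still use the buffer */
    bool deleted;           /* set once the storage would be freed */
} mem_model;

/* Deletion happens only when no references remain AND no queued
   command still uses the buffer. */
void model_maybe_delete(mem_model *m) {
    if (m->ref_count == 0 && m->pending_commands == 0) {
        m->deleted = true;
    }
}

/* A freshly created object starts with one reference. */
mem_model model_create(void) {
    mem_model m = {1, 0, false};
    return m;
}

/* Models enqueueing a command that reads or writes the buffer. */
void model_enqueue(mem_model *m) { m->pending_commands++; }

/* Models the user releasing their reference. */
void model_release(mem_model *m) { m->ref_count--; model_maybe_delete(m); }

/* Models a queued command finishing on the device. */
void model_command_done(mem_model *m) { m->pending_commands--; model_maybe_delete(m); }

/* Release right after enqueue, without synchronizing; returns 0 on success. */
int check_release_model(void) {
    mem_model buf = model_create();
    model_enqueue(&buf);       /* a kernel using buf is now queued */
    model_release(&buf);       /* ref count drops to zero immediately */
    assert(!buf.deleted);      /* still alive: a queued command uses it */
    model_command_done(&buf);  /* the kernel finishes */
    assert(buf.deleted);       /* only now is the buffer destroyed */
    return 0;
}
```

    Under this model a temporary buffer can be released as soon as the last command that uses it has been enqueued, which is the point made in the reply above.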

  9. #9
    Administrator
    Join Date
    Jun 2002
    Location
    Montreal
    Posts
    59

    Suggestions for next release of OpenCL

    We're restructuring and cleaning up our forums. This will be the official thread for everyone to post their suggestions for the next version of OpenCL. We have moved the most recent suggestions into this thread already. We look forward to seeing more suggestions.
