Thread: Why no async_work_group_copy for halfs in core specification

  1. #1

    The half data type is in the core OpenCL 1.0 specification as a "data storage format". However, async_work_group_copy is only defined for halfs in the cl_khr_fp16 extension. Isn't async_work_group_copy just a bit-wise copy? Why does it need special hardware support to work on halfs?

    Is it dangerous for me to just cast my half pointers to short pointers so I can use async_work_group_copy?

    Thanks,
    Brian

  2. #2
    Senior Member
    Join Date
    Nov 2009
    Posts
    118

    Re: Why no async_work_group_copy for halfs in core specification

    It seems OpenCL devices don't really work on half if they don't support the cl_khr_fp16 extension:

    Quote Originally Posted by 6.1.1.1 The half data type
    The half data type can only be used to declare a pointer to a buffer that contains half values
    In that case you have to use vload_half to work on a real float from a half.
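    To make concrete what "a real float from a half" involves, here is a minimal host-side sketch in plain C of the kind of decoding vload_half performs on a 16-bit value. It is not the OpenCL built-in and not any vendor's implementation; the name half_bits_to_float is hypothetical, and the only assumption is that the buffer holds IEEE 754 binary16 bit patterns.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an OpenCL built-in): decode an IEEE 754
 * binary16 bit pattern into a 32-bit float. */
static float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;  /* sign bit moved to bit 31 */
    uint32_t exp  = (h >> 10) & 0x1Fu;              /* 5-bit exponent */
    uint32_t man  = h & 0x3FFu;                     /* 10-bit mantissa */
    uint32_t bits;
    float f;

    if (exp == 0) {
        if (man == 0) {
            bits = sign;                            /* signed zero */
        } else {
            /* Subnormal half: shift until the implicit leading 1 appears. */
            int e = -1;
            do {
                man <<= 1;
                e++;
            } while ((man & 0x400u) == 0);
            man &= 0x3FFu;
            bits = sign | ((uint32_t)(127 - 15 - e) << 23) | (man << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (man << 13);    /* infinity or NaN */
    } else {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + (127 - 15)) << 23) | (man << 13);
    }

    memcpy(&f, &bits, sizeof f);                    /* reinterpret bits as float */
    return f;
}
```

    For example, the pattern 0x3C00 decodes to 1.0f and 0xC000 to -2.0f. Inside a kernel you would simply call vload_half on the half pointer rather than write this yourself.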

  3. #3

    Re: Why no async_work_group_copy for halfs in core specification

    Quote Originally Posted by matrem
    In that case you have to use vload_half to work on a real float from a half.
    Yup, which I'm doing in the inner loop of my code to load floats out of shared memory. It was just odd that I couldn't bitwise copy the half values from global to shared memory. Anyway, I just cast the half pointer to a short pointer so that I can use async_work_group_copy on the data, and it appears to work fine on my Tesla card, though it does feel hackish.

    Halfs are really nice since my application is bounded by shared memory size: I can effectively double my occupancy by using halfs in shared memory.
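    The arithmetic behind that, as a rough sketch in plain C: if local (shared) memory is the only limit, the number of work-groups that fit on a compute unit is the local memory size divided by the bytes each group needs, so going from 4-byte floats to 2-byte halfs doubles it. The helper name and the figures used below (16 KB of local memory, 2048 elements per group) are assumptions for illustration, not numbers from my application, and real occupancy also depends on registers and other limits.

```c
#include <stddef.h>

/* Hypothetical helper: work-groups that fit in one compute unit's local
 * memory, considering local memory as the only constraint. */
static size_t groups_per_compute_unit(size_t local_mem_bytes,
                                      size_t elems_per_group,
                                      size_t bytes_per_elem)
{
    size_t bytes_per_group = elems_per_group * bytes_per_elem;

    /* Nothing fits if a single group already exceeds the budget. */
    if (bytes_per_group == 0 || bytes_per_group > local_mem_bytes)
        return 0;

    return local_mem_bytes / bytes_per_group;
}
```

    With the assumed figures, groups_per_compute_unit(16384, 2048, 4) gives 2 groups for floats and groups_per_compute_unit(16384, 2048, 2) gives 4 for halfs.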

  4. #4
    Senior Member
    Join Date
    Sep 2002
    Location
    Santa Clara
    Posts
    105

    Re: Why no async_work_group_copy for halfs in core specification

    You are correct that async_work_group_copy should have been allowed for the half type in the core specification instead of being tied to the cl_khr_fp16 extension. This is most likely an oversight. Thanks for catching it. In any case, you can make this work by casting the pointers to short pointers. It is not an ideal or clean solution, but it does work.
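    As a host-side analogy in plain C (not device code), the sketch below shows why the cast is harmless: a copy through a short pointer moves 16-bit words without interpreting them, so every half bit pattern, including NaNs and subnormals, arrives unchanged. The function names are hypothetical, copy_shorts merely stands in for the short overload of async_work_group_copy, and the sketch assumes short is 16 bits wide. In a kernel the corresponding casts would be to `__local short *` and `__global const short *`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the short overload of async_work_group_copy:
 * it moves 16-bit words and never interprets them as numbers. */
static void copy_shorts(short *dst, const short *src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

/* Returns 1 if a buffer of half bit patterns is bit-identical after being
 * copied through short pointers, 0 otherwise. */
static int half_bits_survive_copy(void)
{
    /* binary16 patterns: 1.0, -2.0, +infinity, a NaN,
     * the smallest subnormal, and -0.0 */
    const uint16_t src[6] = {0x3C00, 0xC000, 0x7C00, 0x7E00, 0x0001, 0x8000};
    uint16_t dst[6] = {0};

    /* Same reinterpretation as the workaround: half storage viewed as short. */
    copy_shorts((short *)dst, (const short *)src, 6);

    return memcmp(src, dst, sizeof src) == 0;
}
```

    Since the element size is the same, the element count passed to the copy does not change when you switch from half to short.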

  5. #5
    Senior Member
    Join Date
    Sep 2002
    Location
    Santa Clara
    Posts
    105

    Re: Why no async_work_group_copy for halfs in core specification

    Also note that only the scalar half type is in the core spec, so in core async_work_group_copy could be supported only for the scalar half type and not for the vector variants of half. The vector variants are enabled by the cl_khr_fp16 extension.
