Thread: Required atomic built-in functions

  1. #1
    Senior Member
    Join Date
    Mar 2011
    Location
    Seoul
    Posts
    118

    Required atomic built-in functions

    I know a research group that built an abstraction layer on top of OpenCL to treat multiple identical GPU devices as a single device (assuming they all also have the same PCIe bandwidth). However, I can't imagine how they could make it work if any atomic functions are required for compliance. Are some atomics required? If they were all optional, I can think of how to do it without too much pain, and it could be very useful.

  2. #2
    Senior Member
    Join Date
    Mar 2011
    Location
    Seoul
    Posts
    118

    Re: Required atomic built-in functions

    I think I just found part of my answer. In OpenCL 1.0 basic atomic functions were optional, but as of OpenCL 1.1 they are required. If that is wrong please correct me.

    Well, atomic functions pretty much ruin my hope of making an abstract device that integrates several similar devices. I could only think of how to do it without any atomic functions.

    Any reasoning behind making it obligatory in OpenCL 1.1?

  3. #3
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: Required atomic built-in functions

    Quote Originally Posted by sean.settle
    I think I just found part of my answer. In OpenCL 1.0 basic atomic functions were optional, but as of OpenCL 1.1 they are required. If that is wrong please correct me.

    Well, atomic functions pretty much ruin my hope of making an abstract device that integrates several similar devices. I could only think of how to do it without any atomic functions.

    Any reasoning behind making it obligatory in OpenCL 1.1?
    OpenCL is a software abstraction: you can implement atomics however you want, they just have to honour the contract.

    e.g. for many devices you could break the kernels up into atomically bounded sections and run the kernel parts separately and then synchronise on the host.
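
    As a rough illustration of that splitting idea, here is a minimal host-side simulation in Python rather than real OpenCL code. It assumes a kernel with exactly one atomic add per work-item; phase_a, phase_b and run_split_kernel are hypothetical names, and the "devices" are just slices of the index space. The host applies the queued atomic requests one at a time, which is a valid outcome because atomics only promise indivisibility, not a particular ordering between work-items.

```python
def phase_a(gid):
    """Kernel part before the atomic: decide how much to add to the shared counter."""
    return gid % 3  # hypothetical per-work-item increment


def phase_b(gid, old_value):
    """Kernel part after the atomic: consume the value the atomic add returned."""
    return (gid, old_value)


def run_split_kernel(global_size, num_devices):
    # Partition the index space across the simulated devices (round-robin).
    chunks = [range(d, global_size, num_devices) for d in range(num_devices)]

    # Phase A: every device runs independently and only records its request.
    requests = {}
    for chunk in chunks:
        for gid in chunk:
            requests[gid] = phase_a(gid)

    # Host synchronisation point: apply the atomic adds serially.
    # Any serial order honours the contract; sorted order keeps the demo deterministic.
    counter = 0
    returned = {}
    for gid in sorted(requests):
        returned[gid] = counter      # atomic add returns the old value
        counter += requests[gid]

    # Phase B: every device runs independently again, using the returned values.
    results = []
    for chunk in chunks:
        for gid in chunk:
            results.append(phase_b(gid, returned[gid]))
    return counter, sorted(results)
```

    The result does not depend on how many devices the work is spread over, which is the point of resolving the atomics on the host. The price is a full round trip to the host for every atomically bounded section.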

    I'm not saying it would be efficient, but I mean what do you really expect to be able to do anyway? The atomic operations require very specific specialised hardware in order to run fast, and without that you will have no choice but to resort to *host-based* software.

    Global atomics are so slow on AMD hardware, for example, that I wouldn't use them except in very rarely executed code (i.e. code that could plausibly be calling the host already), so a high overhead is already expected. But they have global counters implemented in hardware to get around that ...

    Re your earlier query: there's nothing to say the research project implemented the full specification in the first place. It is possible to do without atomics entirely, at a cost of memory and extra processing steps.
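
    To make the "memory and extra processing steps" trade-off concrete, here is a small sketch in Python (again a simulation, not OpenCL) of a histogram that would normally use an atomic increment per element. It assumes each device gets its own private copy of the histogram (the extra memory) and that the host sums the copies afterwards (the extra step). The function names and the contiguous slicing scheme are made up for illustration.

```python
def private_histogram(data, num_bins):
    """Histogram of one device's slice; nothing is shared, so no atomics are needed."""
    hist = [0] * num_bins
    for value in data:
        hist[value % num_bins] += 1
    return hist


def histogram_without_atomics(data, num_bins, num_devices):
    # Give each simulated device one contiguous slice of the input.
    step = max(1, -(-len(data) // num_devices))  # ceiling division, at least 1
    slices = [data[i * step:(i + 1) * step] for i in range(num_devices)]

    # Extra memory: one private histogram per device.
    partials = [private_histogram(s, num_bins) for s in slices]

    # Extra processing step: the host reduces the private copies bin by bin.
    return [sum(column) for column in zip(*partials)]
```

    For example, histogram_without_atomics([0, 1, 1, 2, 5, 5, 5], 4, 3) gives the same bins as running the whole input on a single device.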

    Amalgamating different hardware with different performance characteristics will be a challenge! Different hardware often requires a different coding approach, runs at a different speed, and so on. Managing all that scheduling and keeping the memory close to the right kernels will be difficult.
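
    The simplest piece of that scheduling problem is deciding how much of the index space each device gets. Below is a minimal sketch in Python of a static split proportional to relative speed. It assumes you already have a positive integer throughput figure per device (e.g. from a benchmark run); split_by_throughput is a hypothetical helper, and a real scheduler would also have to deal with memory placement and load that changes at run time.

```python
def split_by_throughput(global_size, throughputs):
    """Return one (offset, size) range per device, sized by relative throughput.

    throughputs: positive integers, one per device (assumed to be measured elsewhere).
    """
    total = sum(throughputs)
    sizes = [global_size * t // total for t in throughputs]

    # Integer division can leave a few work-items over; hand them to the fastest device.
    fastest = throughputs.index(max(throughputs))
    sizes[fastest] += global_size - sum(sizes)

    # Turn the sizes into contiguous, non-overlapping ranges.
    ranges = []
    offset = 0
    for size in sizes:
        ranges.append((offset, size))
        offset += size
    return ranges
```

    With identical devices, as in the original question, this reduces to an even split.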
