Today, the Khronos® OpenCL™ Working Group is happy to announce the release of the finalized OpenCL 3.0 specifications, including a new unified OpenCL C 3.0 language specification, together with an early release of a Khronos OpenCL SDK to help developers quickly get up to speed using OpenCL.
In April 2020, the Working Group announced the release of the provisional OpenCL 3.0 spec at a presentation and panel session at IWOCL. Khronos placed as much information as possible about the specification and conformance tests onto the public GitHub to enable the developer community to provide input and feedback before the specifications and conformance tests were finalized. Since then, the Working Group has been diligently integrating internal and external input into the new unified OpenCL 3.0 specification, improving overall coverage of the conformance test suite, and ensuring reliable operation of OpenCL 3.0’s new flexibility.
The Heart and Soul of OpenCL 3.0
OpenCL 3.0 makes the OpenCL ecosystem significantly more flexible by enabling hardware vendors to focus their resources on functionality that their customers need. This is achieved by slicing all functionality beyond OpenCL 1.2 into optional features that can be queried in the API, with macros to indicate whether optional OpenCL C language features are present. This flexibility sets the stage for new extensions that become widely useful to be incrementally integrated into new OpenCL core specifications for pervasive adoption.
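As a rough illustration of the host-side query, the sketch below checks whether a named optional feature appears in a device's reported feature list. In a real program the list would come from `clGetDeviceInfo` with `CL_DEVICE_OPENCL_C_FEATURES`, which returns an array of `cl_name_version` entries. To keep the sketch compilable without the OpenCL headers, `name_version`, `make_version`, `version_major`, `version_minor`, and `has_feature` are hypothetical stand-ins; the version packing mirrors the 10/10/12-bit layout of `CL_MAKE_VERSION`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for cl_name_version from CL/cl.h:
   a packed version number plus a feature name. */
#define SKETCH_NAME_SIZE 64
typedef struct {
    uint32_t version;
    char name[SKETCH_NAME_SIZE];
} name_version;

/* Same bit layout as CL_MAKE_VERSION: 10 bits major, 10 bits minor,
   12 bits patch. */
uint32_t make_version(uint32_t major, uint32_t minor, uint32_t patch) {
    return ((major & 0x3ffu) << 22) | ((minor & 0x3ffu) << 12) | (patch & 0xfffu);
}

uint32_t version_major(uint32_t v) { return v >> 22; }
uint32_t version_minor(uint32_t v) { return (v >> 12) & 0x3ffu; }

/* Returns 1 if `feature` appears in the reported list, 0 otherwise. */
int has_feature(const name_version *features, size_t count, const char *feature) {
    assert(features != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(features[i].name, feature) == 0) {
            return 1;
        }
    }
    return 0;
}
```

An application might call `has_feature(list, n, "__opencl_c_fp64")` once at startup and pick a code path accordingly, rather than assuming that every OpenCL 3.0 device offers the feature.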
Developers will find OpenCL 3.0 much easier to use through a unified specification that describes all versions of OpenCL in one document rather than separate specifications per version, making it simpler for developers to navigate as well as streamlining specification fixes and clarifications. The unified OpenCL 3.0 specification also describes the rationale behind the specification's evolution.
The source of the OpenCL 3.0 specification is hosted on the Khronos GitHub for easy access, and the OpenCL Working Group welcomes community bug reports and pull requests to help improve the text of the specification to benefit everyone in the OpenCL community. Because the OpenCL specification is now unified across all versions, developers continuing to target older versions such as OpenCL 1.2 may also file specification bug reports and pull requests.
Last, but not least, the OpenCL 3.0 specification includes two new extensions:
- A query to return a universally unique identifier (UUID) for an OpenCL driver and device, which may be used to identify drivers and devices across processes or APIs.
- An Asynchronous DMA extension enabling ordered DMA transactions as first-class citizens, ideal for Scratch Pad Memory-based devices, which require fine-grained control over buffer allocation. This extension is the first of several significant upcoming advances in OpenCL to enhance support for embedded processors.
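To make the UUID use case above concrete, here is a minimal sketch of matching a device across APIs by comparing 16-byte identifiers. It assumes the UUIDs have already been retrieved, for example through `clGetDeviceInfo` with `CL_DEVICE_UUID_KHR` on the OpenCL side and the device UUID reported by another API such as Vulkan. The helper `find_device_by_uuid` and the flat byte-array layout are hypothetical choices for illustration, not part of the extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 16 bytes, matching CL_UUID_SIZE_KHR (and VK_UUID_SIZE in Vulkan). */
#define UUID_SIZE 16

/* `uuids` holds `count` UUIDs back to back (count * UUID_SIZE bytes).
   Returns the index of the first UUID equal to `target`, or -1 if none. */
int find_device_by_uuid(const uint8_t *uuids, size_t count, const uint8_t *target) {
    assert(target != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (memcmp(uuids + i * UUID_SIZE, target, UUID_SIZE) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

Comparing raw bytes with `memcmp` treats the UUID as an opaque identifier, which is all that is needed to decide whether two handles refer to the same physical device.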
The OpenCL Working Group has already started work to add support for OpenCL 3.0 to upstream Clang/LLVM.
Moving Applications to OpenCL 3.0
Now that OpenCL 3.0 is finalized, OpenCL 3.0 implementations will soon start shipping. It is straightforward to move existing applications running on any older version of OpenCL to OpenCL 3.0.
Applications using OpenCL 1.2 will run unchanged on any OpenCL 3.0 device as all OpenCL 1.2 functionality will work on any OpenCL 3.0 driver with no code changes.
OpenCL 2.X applications will also continue to work on OpenCL 3.0 with no code changes if the OpenCL 3.0 driver supports all the functionality used by the application. If you are running on a device that upgrades its drivers from 2.X to 3.0, all functionality is expected to remain supported, so no application changes will be needed.
Applications that wish to run portably across multiple OpenCL 3.0 devices, and use OpenCL 2.X-level features, are strongly encouraged to query to ensure that functionality is available. All OpenCL 2.X API features can be queried, and OpenCL C 3.0 macros indicate whether optional language features are present.
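On the kernel side, the portability advice above might look like the following sketch: a kernel that uses a work-group reduction when the corresponding OpenCL C 3.0 feature macro is defined and otherwise falls back to OpenCL 1.2 functionality. The kernel source is held in a C string, as it would be when passed to `clCreateProgramWithSource`. The names `group_sum_src`, `parse_device_version`, and `choose_build_options` are hypothetical, and the policy of requesting `-cl-std=CL3.0` only on 3.x devices is an assumption for illustration. The fallback path also assumes a zero global offset and uniform work-group sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* OpenCL C kernel source guarded by an OpenCL C 3.0 feature macro. */
const char *const group_sum_src =
    "__kernel void group_sum(__global const int *in, __global int *out) {\n"
    "#if defined(__opencl_c_work_group_collective_functions)\n"
    "    /* Optional feature present: use the built-in reduction. */\n"
    "    int total = work_group_reduce_add(in[get_global_id(0)]);\n"
    "    if (get_local_id(0) == 0) out[get_group_id(0)] = total;\n"
    "#else\n"
    "    /* Fallback that needs only OpenCL 1.2 functionality. */\n"
    "    if (get_local_id(0) == 0) {\n"
    "        int total = 0;\n"
    "        size_t base = get_group_id(0) * get_local_size(0);\n"
    "        for (size_t i = 0; i < get_local_size(0); ++i)\n"
    "            total += in[base + i];\n"
    "        out[get_group_id(0)] = total;\n"
    "    }\n"
    "#endif\n"
    "}\n";

/* Parses the "OpenCL <major>.<minor> <vendor info>" string reported for
   CL_DEVICE_VERSION. Returns 1 on success, 0 otherwise. */
int parse_device_version(const char *s, int *major, int *minor) {
    assert(s != NULL && major != NULL && minor != NULL);
    return sscanf(s, "OpenCL %d.%d", major, minor) == 2;
}

/* Assumed policy: compile as OpenCL C 3.0 on 3.x devices so that feature
   macros are available; otherwise pass no -cl-std option and let the
   kernel take its fallback path. */
const char *choose_build_options(const char *device_version) {
    int major = 0, minor = 0;
    if (parse_device_version(device_version, &major, &minor) && major >= 3) {
        return "-cl-std=CL3.0";
    }
    return "";
}
```

The string returned by `choose_build_options` would be passed as the options argument of `clBuildProgram`. Keeping the fallback inside the same kernel means one source file can serve both devices that report the feature and devices that do not.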
Together with the OpenCL 3.0 specification, the Working Group has released an early Khronos OpenCL SDK that developers can use to easily begin OpenCL coding. The SDK is open sourced on the Khronos GitHub under the Apache 2.0 license and will be continuously updated and expanded.
This initial SDK release includes a new OpenCL Guide, Headers including vendor extensions, some small sample programs to illustrate how to use the SDK build system (with CI), and an ICD Loader that will soon support installable development layers.
C++ Kernels with OpenCL 3.0
With OpenCL 3.0, the OpenCL Working Group has transitioned away from the original OpenCL C++ kernel language first defined in OpenCL 2.1. For developers who wish to use C++17 to write kernel programs, it now recommends the community-developed C++ for OpenCL open source front-end compiler, which provides improved features and compatibility with OpenCL C.
C++ for OpenCL is supported by Clang and uses the LLVM compiler infrastructure. Its implementation in Clang can be tracked via the OpenCL Support Page. It enables developers to use most C++17 features in OpenCL kernels and generates code, through offline compilation, in the SPIR-V intermediate representation that is ingested by an increasing number of OpenCL implementations.
OpenCL C code is valid and fully compatible with the C++ for OpenCL compiler. This enables developers to use a consistent front-end compiler as they incrementally transition from pure OpenCL C to using C++17 features in their applications.
The OpenCL 3.0 cl_ext_cxx_for_opencl extension adds support for building programs written using C++ for OpenCL. It also enables applications to query the version of the language supported by the device compiler.
C++ for OpenCL generates SPIR-V 1.0 plus SPIR-V 1.2 where necessary. Experimental support was added in Clang 9 with bug fixes and improvements in Clang 10. You can check out C++ for OpenCL in Compiler Explorer.
The release of the OpenCL 3.0 specification and conformance tests sets the stage for conformant OpenCL 3.0 implementations to begin shipping soon. The Working Group can now also deliver regular maintenance updates to the specifications and drive the development of an active OpenCL roadmap where extensions will be used to prove new functionality before being added to future core specifications.
Some of the extensions already in the development pipeline include:
- Extended Subgroups
- Extended Debug Info
- External Memory Sharing
- Vulkan/OpenCL Interop
Some of the longer-term design directions being considered by the OpenCL Working Group are:
- Recordable Command buffers
- Machine Learning Primitives
- Indirect Dispatch
- Device Topology
- Unified Shared Memory
- Global Barriers
All of this is happening in parallel with the work to continuously increase deployment flexibility for OpenCL applications, for example through Google's open source clspv compiler, which can generate Vulkan SPIR-V shaders from OpenCL C kernel source code.
The OpenCL Working Group welcomes your continued feedback on GitHub for new feature requests and use cases, and which current optional features and extensions you would like to see become mandatory in the future – please let us know how we can make OpenCL more useful to you!