The OpenCL Tooling Task Sub Group (TSG) is actively contributing to the LLVM compiler infrastructure project and is determined to bring first-class support for OpenCL and SPIR-V to LLVM. The latest release of Clang brought long-awaited support for the OpenCL 3.0 standard, the C++ for OpenCL 2021 kernel language, and a SPIR-V generation interface that uses the external llvm-spirv tool from the SPIRV-LLVM-Translator repository. Meanwhile, work on the native GlobalISel-based SPIR-V backend continues at full speed. SPIR-V updates and many other exciting changes in the SPIR-V and OpenCL world will be discussed in depth at the upcoming 2022 LLVM Developers’ Meeting.
The OpenCL 3.0 specification and SDK for heterogeneous parallel computation are regularly updated with bug fixes, improved documentation, and functional enhancements. The OpenCL 3.0.12 maintenance release on 15 September 2022 added significant new functionality, including command buffer enhancements, system layer support, and maintenance updates.
In this EE Times Europe article, Neil Trevett describes how the need for graphics and compute acceleration in embedded markets is growing. Cameras and sensor arrays are increasingly central to many use cases in diverse industries, ranging from automotive to industrial, and are generating increasingly rich data streams that require sophisticated processing. At the same time, advanced user interfaces are being developed using high-quality 3D graphics and even augmented-reality technology. However, the need to deploy accelerated processing, combined with the complexities of safety-critical certification, has created a confusing landscape of processors, accelerators, compilers, APIs, and libraries. That has driven up integration costs for embedded accelerators, which in turn has constrained innovation and time-to-market efficiencies.
Open standards have an important role in helping hardware and software vendors navigate this complex technology environment. Acceleration standards for the embedded market can enable cross-platform software reusability, decouple software and hardware development for easier deployment and integration of new components, provide cross-generation reusability, and facilitate field upgradability. Such standards reduce costs, shorten time to market, and lower the barriers to using advanced techniques such as inferencing and vision acceleration in compelling real-world products.
Khronos Group President Neil Trevett shares how open standards play an important role in helping vendors navigate the complexities of safety-critical certification and a confusing landscape of processors, accelerators, compilers, APIs, and libraries, which has driven up integration costs for embedded accelerators and, in turn, constrained innovation and time-to-market efficiencies.
PoCL is a portable open source (MIT-licensed) implementation of the OpenCL standard. It likely supports the minimal v3.0 feature set (official conformance stamp not yet applied for). In addition to being an easily portable multi-device (truly heterogeneous) open-source OpenCL implementation, a major goal of this project is improving interoperability for a diversity of OpenCL-capable devices by integrating them into a single centrally orchestrated platform. Another key goal is to enhance performance portability of OpenCL programs across device types using runtime and compiler techniques.
Upstream PoCL currently supports various CPUs, NVIDIA GPUs via libcuda, and ASIPs (experimental, see: http://openasip.org). It is also known to have multiple (private) adaptations in active production use.
With the release of Portable Computing Language (PoCL) 3.0-RC1, there is now initial support for OpenCL 3.0 running on CPUs with LLVM 14+. Other major new features include LLVM/Clang 14 support, improved tracing, and scripts for converting traces into the Chromium trace visualizer format.
OpenCL 3.0.11 adds two new extensions and continues the regular release cadence for specification bug fixes and clarifications. The cl_khr_subgroup_rotate extension enables an OpenCL kernel to rotate values among work-items in a subgroup for increased data exchange efficiency in many algorithms. The cl_khr_work_group_uniform_arithmetic extension enables an OpenCL kernel to use new work-group scan and reduction operators which can boost the performance of many use cases—and is ideal to accelerate C++ scan and reduction functions in SYCL 2020 implementations targeting OpenCL as a backend.
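To make the two operations above concrete, here is a minimal host-side sketch in Python that emulates their data movement on plain lists. It is not OpenCL code and not taken from the specification: the helper names (subgroup_rotate, work_group_scan_inclusive, work_group_reduce) are hypothetical, and the rotate direction assumes each work-item reads the value held by the work-item at (local ID + delta) modulo the subgroup size. Multiplication is used as the example operator on the assumption that it is among those the extension adds.

```python
from functools import reduce
from itertools import accumulate
import operator


def subgroup_rotate(values, delta):
    """Emulate a subgroup rotate: lane i receives the value held by
    lane (i + delta) mod subgroup size (assumed direction)."""
    n = len(values)
    return [values[(i + delta) % n] for i in range(n)]


def work_group_scan_inclusive(values, op):
    """Emulate an inclusive work-group scan: item i receives
    op applied across items 0..i."""
    return list(accumulate(values, op))


def work_group_reduce(values, op):
    """Emulate a work-group reduction: every item would receive
    the same combined result."""
    return reduce(op, values)


# One entry per work-item (lane); the values are illustrative only.
lanes = [10, 11, 12, 13]
rotated = subgroup_rotate(lanes, 1)

factors = [1, 2, 3, 4]
scanned = work_group_scan_inclusive(factors, operator.mul)
product = work_group_reduce(factors, operator.mul)

print(rotated, scanned, product)
```

On a real device these exchanges happen between work-items without a round trip through local memory, which is where the efficiency gain described above would come from.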
Join us to help drive the evolution of Machine Learning acceleration standards. ML developers lament the growing fragmentation in the ML ecosystem. Khronos knows that open and royalty-free standards can play an essential role in reducing fragmentation, reducing costs, and giving industry participants the opportunity to grow. Based on feedback from the previous summit and discussions, Khronos is creating a coalition of interested parties to meet the needs of the ML community for hardware acceleration.
The release of the OpenCL 3.0 specification was a significant milestone for this open standard for low-level heterogeneous parallel programming, creating a pervasive baseline that can be cleanly extended with new functionality requested by developers. But a strong open standard ecosystem is much more than just an API document, and Khronos is making significant investments to improve the OpenCL developer experience. Read on to discover the latest updates to the OpenCL SDK and what is coming on the SDK roadmap!
LLVM recently released Clang 14. New OpenCL features include the ability to generate a SPIR-V binary, support for OpenCL 3.0 and more…
OpenCL developers can try out new provisional OpenCL/Vulkan Interop functionality today with NVIDIA’s latest drivers and downloadable sample code.
Basis Universal’s v1.16 release focuses on smaller code size, fewer third-party dependencies (just Zstd), OpenCL support, faster ETC1S encoding, and fully multithreaded/parallel processing.
- ETC1S encoding is now approximately 30% faster. We added more optimizations to the encoder’s backend and more SSE optimizations to the frontend.
- Optional OpenCL support has been added to the ETC1S encoder.
GPUinfo.org enables the community to build extensive databases of Khronos API driver capabilities by uploading reports from diverse end-user devices and platforms. With more than 20,000 device reports available for Vulkan, OpenGL, and OpenGL ES across Windows, Linux, Android, Mac OSX, and iOS, GPUinfo.org has become a widely used resource for developers to gain detailed insights into deployed hardware support for features they wish to use, including on devices to which they don’t have direct access. As a brand new addition, GPUinfo.org now offers a client application and server-side database for the OpenCL™ standard for cross-platform, heterogeneous parallel programming at opencl.gpuinfo.org.
C++ for OpenCL 2021 Kernel Language Documentation Released for Developer Feedback
C++ for OpenCL is a community-based, open-source C++ kernel language for OpenCL that combines full OpenCL C with most features of C++17, implemented in Clang and LLVM. Using the new ‘year of release’ versioning scheme, the draft documentation for C++ for OpenCL 2021 language is now released on GitHub for developer review and feedback. C++ for OpenCL 2021 is fully compatible with OpenCL 3.0 as the same features are made optional in both.