OpenCL tagged news

IWOCL & SYCLcon is the premier workshop for leading academic and industrial experts to present, discuss and learn about applying OpenCL and SYCL to the issues faced in High Performance Computing across a wide range of application domains. This is an excellent opportunity to contribute and participate in the workshop through a paper, talk, special session / tutorial, or poster. The workshop will include invited presentations from academia and industry, and a panel discussion among leading experts in the field.

Deadline for submissions is January 15th, so don’t delay. Submit your proposed content today.

OpenCL Rolls Out Maintenance Release and C++ for OpenCL Documentation

Today Khronos released v3.0.6 of the OpenCL Specifications. This is a regular maintenance release with bug fixes and clarifications, an updated address spaces section, new extensions for additional subgroup functions, and an extension for enhanced platform and device version queries. Also, documentation for the C++ for OpenCL v1.0 kernel language is now downloadable from an OpenCL-Docs GitHub repository tag, describing how the language combines C++17 functionality with familiar OpenCL kernel language paradigms. An extension for online compilation of C++ for OpenCL kernels was published earlier this year, and offline compilation of C++ for OpenCL kernels has been supported since Clang release 9.0.
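As a rough, hypothetical sketch (not taken from the Khronos documentation), the kernel below shows the flavour of C++ for OpenCL: a C++17 function template reused from a familiar OpenCL-style kernel entry point. Offline compilation with Clang 9.0 or later is assumed.

```cpp
// Hypothetical C++ for OpenCL kernel: C++17 templates combined with the usual
// OpenCL kernel paradigms (address spaces, kernel entry points, work-item IDs).
template <typename T>
T axpy(T a, T x, T y) {
    return a * x + y;                    // generic helper, instantiated at compile time
}

kernel void saxpy(global float* out, global const float* x,
                  global const float* y, float a) {
    const size_t i = get_global_id(0);   // familiar OpenCL work-item indexing
    out[i] = axpy(a, x[i], y[i]);        // calls the C++ template helper
}
```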

PoCL is a portable open-source (MIT-licensed) implementation of the OpenCL standard (1.2, with some 2.0 features supported). In addition to being an easily portable, multi-device (truly heterogeneous) open-source OpenCL implementation, a major goal of this project is improving interoperability across the diversity of OpenCL-capable devices by integrating them into a single centrally orchestrated platform. A key longer-term goal is to enhance the performance portability of OpenCL programs across device types using runtime and compiler techniques.

Version 1.6 release highlights: support for Clang/LLVM 11.0, improved CUDA performance and features, improved PowerPC support, and enhanced OpenCL debugging support.
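As a minimal sketch of the multi-device idea described above (plain OpenCL API calls, nothing PoCL-specific), a host program can enumerate every installed platform and its devices and pick out the PoCL platform among them:

```cpp
// Minimal sketch: list all OpenCL platforms (e.g. PoCL alongside vendor ICDs)
// and the devices each one exposes. Link with -lOpenCL.
#include <CL/cl.h>
#include <cstdio>
#include <vector>

int main() {
    cl_uint num_platforms = 0;
    clGetPlatformIDs(0, nullptr, &num_platforms);
    std::vector<cl_platform_id> platforms(num_platforms);
    clGetPlatformIDs(num_platforms, platforms.data(), nullptr);

    for (cl_platform_id p : platforms) {
        char name[256] = {0};
        clGetPlatformInfo(p, CL_PLATFORM_NAME, sizeof(name), name, nullptr);

        cl_uint num_devices = 0;
        clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, 0, nullptr, &num_devices);
        std::printf("Platform: %s (%u devices)\n", name, num_devices);
        if (num_devices == 0) continue;

        std::vector<cl_device_id> devices(num_devices);
        clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, num_devices, devices.data(), nullptr);
        for (cl_device_id d : devices) {
            char dev_name[256] = {0};
            clGetDeviceInfo(d, CL_DEVICE_NAME, sizeof(dev_name), dev_name, nullptr);
            std::printf("  Device: %s\n", dev_name);
        }
    }
    return 0;
}
```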

OpenMP 5.1 Released With Better Interoperability For CUDA / AMD HIP / OpenCL

It’s been two years since the release of the OpenMP 5.0 specification, and the update released on Friday is a worthy one:

OpenMP 5.1 introduces a new interop construct for improving interoperability with non-OpenMP device execution contexts. This aims to improve the portability of OpenMP 5.1+ code to non-native interfaces and accelerators, and the construct is designed with NVIDIA CUDA, AMD ROCm/HIP, and OpenCL in mind. It is used to obtain and manage interoperability properties for one or more “foreign runtime environments”.
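A minimal sketch of how the construct is intended to be used, based on the identifiers defined in the OpenMP 5.1 specification (actual foreign-runtime support depends on the compiler and its runtime):

```cpp
// Sketch of the OpenMP 5.1 interop construct: request a CUDA-style
// synchronization object and retrieve the underlying handle for use by a
// non-OpenMP library. Property and routine names are from the 5.1 spec.
#include <omp.h>
#include <cstdio>

int main() {
    omp_interop_t obj = omp_interop_none;

    // Create an interop object tied to a foreign synchronization context
    // (e.g. a CUDA stream, HIP stream, or OpenCL command queue).
    #pragma omp interop init(prefer_type("cuda"), targetsync : obj)

    int rc = 0;
    // Which foreign runtime backs the object: "cuda", "hip", "opencl", ...
    const char* fr = omp_get_interop_str(obj, omp_ipr_fr_name, &rc);
    // The foreign synchronization handle; for CUDA this would be the stream
    // that can be handed to cuBLAS, Thrust, etc. (assumed use case).
    void* handle = omp_get_interop_ptr(obj, omp_ipr_targetsync, &rc);
    std::printf("foreign runtime: %s, handle: %p\n", fr ? fr : "none", handle);

    // Release the foreign resources once the external library work is done.
    #pragma omp interop destroy(obj)
    return 0;
}
```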

NSITEXE, Kyoto Microcomputer and Codeplay Software are bringing open standards programming to RISC-V Vector processor for HPC and AI systems

Codeplay Software Ltd, pioneers in enabling acceleration technologies, announced today that software developers working on HPC and AI for embedded systems will be able to take advantage of industry defined open standards from The Khronos Group on RISC-V architectures, thanks to Japan’s New Energy and Industrial Technology Development Organisation (“NEDO”) project in which NSITEXE and Kyoto Microcomputer Co., Ltd. (“KMC”) are participating.

NSITEXE and KMC have ordered an implementation of LLVM for the RISC-V Vector Extension processor (“RVV”), along with Codeplay’s ComputeAorta™ and ComputeCpp™, efficient and high-performance implementations of the OpenCL and SYCL open standards. As part of the NEDO research project, NSITEXE is developing OpenCL and SYCL compilers on top of LLVM to utilize RVV, and KMC is implementing vector syntax based on LLVM and Clang to utilize RVV efficiently. These research developments will contribute to the RISC-V community’s support for open-standard technologies.

Tutorial: Making the most of Arm NN for GPU inference: OpenCL Tuner

Arm NN is an open-source inference engine for CPUs, GPUs and NPUs that bridges the gap between existing neural network frameworks and the underlying IP. It is built on top of the Arm Compute Library (ACL), a collection of highly optimized low-level functions that accelerate inference on the Arm Cortex-A family of CPUs and the Arm Mali family of GPUs. For GPUs, ACL uses OpenCL as its compute API. The OpenCL memory model maps closely to the GPU architecture, making it possible to implement optimizations that significantly reduce accesses to global memory. Read on to learn how.
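For illustration only (not code from Arm NN or ACL), the kind of optimization the article refers to looks like the kernel below: a work-group stages data in local memory once, then reuses it, so repeated reads no longer touch global memory.

```c
// Illustrative OpenCL kernel: cooperative tiling into local memory so each
// work-group reads its tile from global memory once and then reuses it.
kernel void tile_sum(global const float* in,
                     global float* out,
                     local float* tile)      // local buffer sized by the host
{
    const size_t lid   = get_local_id(0);
    const size_t gid   = get_global_id(0);
    const size_t lsize = get_local_size(0);

    tile[lid] = in[gid];                     // one global read per work-item
    barrier(CLK_LOCAL_MEM_FENCE);            // make the tile visible to the group

    float acc = 0.0f;
    for (size_t j = 0; j < lsize; ++j)       // repeated accesses now hit fast
        acc += tile[j];                      // local memory, not global memory
    out[gid] = acc;
}
```

The host sizes the local buffer when setting kernel arguments, e.g. clSetKernelArg(kernel, 2, local_size * sizeof(float), NULL).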

Intel Releases OpenCL Intercept Layer 3.0

The Intel OpenCL Intercept Layer is one of the company’s efforts to improve debugging and performance analysis of OpenCL applications. This cross-platform layer intercepts OpenCL API calls through the OpenCL ICD loader to analyze and debug CL applications. The 3.0 release has full support for tracing all OpenCL 3.0 APIs. The update also allows tracing of more vendor-specific CL extensions, proper handling of extension APIs from multiple platforms, emulated support for unified shared memory via shared virtual memory, and a number of other enhancements including bug fixes and performance improvements.

On GitHub: https://github.com/intel/opencl-intercept-layer/releases/tag/v3.0.0

We would like to hear from you. If you have feedback, please leave a comment on the OpenCL 3.0 blog post: https://khr.io/us

Faster Mobile GPU Inference with OpenCL

The TensorFlow team announces the official launch of its OpenCL-based mobile GPU inference engine for Android, which offers up to ~2x speedup over the existing OpenGL backend on reasonably sized neural networks that have enough workload for the GPU.
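As a hedged sketch of how an Android application opts in from C++ (header paths and option names as found in current TensorFlow Lite sources; treat the details as assumptions), the GPU delegate is attached to the interpreter and selects the OpenCL backend at runtime when it is available, falling back to OpenGL otherwise:

```cpp
// Sketch: run a TensorFlow Lite model through the GPU delegate, which prefers
// the OpenCL backend on devices where it is available.
#include <memory>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"
#include "tensorflow/lite/delegates/gpu/delegate.h"

bool RunOnGpu(const char* model_path) {
    auto model = tflite::FlatBufferModel::BuildFromFile(model_path);
    if (!model) return false;

    tflite::ops::builtin::BuiltinOpResolver resolver;
    std::unique_ptr<tflite::Interpreter> interpreter;
    tflite::InterpreterBuilder(*model, resolver)(&interpreter);
    if (!interpreter) return false;

    // Create the GPU delegate with default options and hand the graph to it.
    TfLiteGpuDelegateOptionsV2 options = TfLiteGpuDelegateOptionsV2Default();
    TfLiteDelegate* delegate = TfLiteGpuDelegateV2Create(&options);
    const bool ok = interpreter->ModifyGraphWithDelegate(delegate) == kTfLiteOk &&
                    interpreter->AllocateTensors() == kTfLiteOk &&
                    interpreter->Invoke() == kTfLiteOk;

    TfLiteGpuDelegateV2Delete(delegate);     // release GPU resources
    return ok;
}
```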
