Khronos in Machine Learning

Neural networks are a class of machine learning algorithms, originally inspired by the brain, which are helping drive some of the most innovative technologies for face recognition, speech-to-text, language understanding, and much more. The strength of neural networks comes from their ability to learn from data rather than a human expert having to program their behavior. This ability is made possible by machine learning.

Machine Learning with Khronos standards

With machine learning, a system extracts high-dimensional data from the real world: it can take video or other data, process the images, analyze the resulting data, draw conclusions, then make decisions on actions to take based on the analysis. One immediate application for this is in the area of self-driving cars, but it is also having an impact in the medical, industrial, security, safety, and other domains.

Systems built for neural networks and machine learning often rely on accelerators, GPUs, FPGAs, and SoCs for their performance. This is an area in which Khronos is a recognized expert, with the world’s leading standards for heterogeneous computing and graphics for embedded devices, desktops, and large HPC servers.

Machine learning standards from Khronos

Khronos™ offers a suite of open standards for key machine learning processes, designed collaboratively by industry and academic experts: including computer vision for data acquisition, graphics and heterogeneous dispatch, an efficient neural network transfer format, and more.

  • OpenVX™ and SYCL™ for efficient and highly performant vision and AI applications
  • Vulkan®, SYCL, OpenCL, and SPIR™ for neural network training frameworks and inferencing
  • NNEF™ for connecting the output from training algorithms to different inference engines

NNEF: Neural Network Exchange Format

NNEF (Neural Network Exchange Format), Khronos' newest standard for machine learning, is now available in its official form for the first time. NNEF reduces machine learning deployment fragmentation by enabling applications to use a rich mix of neural network training tools and inference engines across a diverse range of devices and platforms.

Khronos has initiated a series of open-source projects, including an NNEF syntax parser/validator and example exporters from a selection of frameworks including TensorFlow, Caffe, and Caffe2. These tools are freely available, and Khronos welcomes the participation of the machine learning community in making NNEF useful for their own workflows. In addition, the NNEF working group is working closely with the Khronos OpenVX working group to enable OpenVX to ingest NNEF files.

See the Khronos NNEF landing page for links to the specification, tools, and more.

OpenVX: Portable, Power-efficient Vision Processing

OpenVX is a mature computer vision API with dozens of implementations available on a wide range of computing hardware, including CPUs, GPUs, DSPs, and dedicated hardware blocks. OpenVX pioneered the use of computation graphs in embedded processing APIs for data-intensive applications such as computer vision and machine learning. Today, computation graphs are at the heart of popular machine learning frameworks such as TensorFlow and PyTorch.

The base OpenVX API, currently at version 1.2.1, includes a rich set of classical computer vision operators and a framework for creating, compiling, and executing computation graphs. The neural network extension to OpenVX includes operations for many common neural network layer functions, enabling acceleration of neural network inference on a wide range of specialized hardware. The OpenVX API enables classical and neural network computations to be integrated seamlessly in a single graph. Implementations can then efficiently map and execute these graphs on diverse hardware that may include a heterogeneous collection of processors. OpenVX implementations automatically execute each portion of the graph on the most appropriate processor and efficiently manage data communication between the processors. Conformance tests for the base API and neural network extension ensure that the behavior of every implementation is consistent, regardless of differences in the underlying hardware.
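
The create, compile, and execute cycle described above can be sketched in plain C++. The following is a minimal illustration of the computation-graph idea only, not the OpenVX API: the `sketch::Graph` class, its methods, and `run_demo` are hypothetical names invented for this example. In OpenVX itself, the corresponding steps are performed through C functions such as `vxCreateGraph`, `vxVerifyGraph`, and `vxProcessGraph`.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical names; this is not the OpenVX API

using Image = std::vector<float>;
using Op = std::function<Image(const Image&)>;

class Graph {
public:
    // Reserve a data slot that nodes can read from or write to.
    std::size_t add_data() {
        data_.emplace_back();
        verified_ = false;
        return data_.size() - 1;
    }

    // Add a node that reads slot `in` and writes slot `out`.
    void add_node(std::size_t in, std::size_t out, Op op) {
        if (in >= data_.size() || out >= data_.size())
            throw std::out_of_range("unknown data slot");
        nodes_.push_back({in, out, std::move(op)});
        verified_ = false;
    }

    // "Compile" step: check the structure and compute an execution order.
    void verify() {
        // A slot that no node writes is treated as an external input.
        std::vector<bool> available(data_.size(), true);
        for (const Node& n : nodes_) {
            if (!available[n.out]) throw std::logic_error("slot written twice");
            available[n.out] = false;
        }
        order_.clear();
        std::vector<bool> scheduled(nodes_.size(), false);
        bool progress = true;
        while (order_.size() < nodes_.size() && progress) {
            progress = false;
            for (std::size_t i = 0; i < nodes_.size(); ++i) {
                if (!scheduled[i] && available[nodes_[i].in]) {
                    scheduled[i] = true;
                    available[nodes_[i].out] = true;
                    order_.push_back(i);
                    progress = true;
                }
            }
        }
        if (order_.size() != nodes_.size())
            throw std::logic_error("graph contains a cycle");
        verified_ = true;
    }

    // Execute step: run every node in the order found by verify().
    void process() {
        if (!verified_) throw std::logic_error("call verify() first");
        for (std::size_t i : order_) {
            const Node& n = nodes_[i];
            data_[n.out] = n.op(data_[n.in]);
        }
    }

    Image& data(std::size_t slot) { return data_.at(slot); }

private:
    struct Node { std::size_t in; std::size_t out; Op op; };
    std::vector<Image> data_;
    std::vector<Node> nodes_;
    std::vector<std::size_t> order_;
    bool verified_ = false;
};

// Builds a two-node graph: a classical pre-processing step feeding a
// ReLU, standing in for a neural network layer.
inline Image run_demo() {
    Graph g;
    std::size_t in = g.add_data(), mid = g.add_data(), out = g.add_data();
    Op shift = [](const Image& x) { Image y(x); for (float& v : y) v -= 1.0f; return y; };
    Op relu = [](const Image& x) { Image y(x); for (float& v : y) v = v < 0.0f ? 0.0f : v; return y; };
    // Nodes are added out of order on purpose; verify() sorts them.
    g.add_node(mid, out, relu);
    g.add_node(in, mid, shift);
    g.verify();
    g.data(in) = {0.0f, 1.0f, 3.0f};
    g.process();
    return g.data(out);
}

}  // namespace sketch
```

Separating `verify()` from `process()` mirrors the reason graph APIs have a compile step: the structure is checked and scheduled once, and the graph can then be executed repeatedly on new input data.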

Finally, the recently introduced kernel import extension enables entire neural networks to be imported into an OpenVX graph as a single node. The imported neural network can be described in any format the implementation supports, including the new Khronos NNEF standard. The imported neural network nodes can be easily connected to classical computer vision nodes for pre- and post-processing. As with all other OpenVX features, the implementation automatically handles all the communication between the neural network and other nodes.

See the Khronos OpenVX landing page for links to the specification, implementations, extensions, and more.

SYCL: C++ Single-source Heterogeneous Programming for neural network training and inference applications


SYCL (pronounced ‘sickle’) is a higher-level, single-source programming model for OpenCL. It is designed as a domain-specific embedded language (DSEL) based on pure C++11 for SYCL 1.2.1, with C++14/17 planned for future versions. Single-source development lets C++ template functions contain both host and device code, so developers can construct complex algorithms that use OpenCL acceleration and then reuse them throughout their source code on different types of data.

Royalty-free, cross-platform SYCL is an important open heterogeneous programming model that provides an alternative to CUDA. It can dispatch to graphics processors from Intel, AMD, ARM, Nvidia, Imagination Technologies, or any number of embedded devices or specialized “Tensor Processing Units” appearing in self-driving cars. SYCL combines the power of modern C++ with dispatch to accelerators, enabling key performance improvements for machine learning algorithms that apply kernel fusion through C++ template metaprogramming and lazy expression evaluation.
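
Lazy expression evaluation can be illustrated without any accelerator. The sketch below uses standard C++ expression templates and assumes a hypothetical `lazy::Vec` type that is not part of SYCL: `a * x + y` builds a lightweight expression object, and the arithmetic runs in a single loop only when the result is assigned. In a SYCL program the same pattern could let one fused kernel replace several separate ones, but this example runs on the host only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace lazy {  // hypothetical names; not part of SYCL

// Lazy node for element-wise addition. It stores references to its
// operands and computes element i only when operator[] is called.
template <typename L, typename R>
struct AddExpr {
    const L& lhs;
    const R& rhs;
    float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Lazy node for element-wise multiplication.
template <typename L, typename R>
struct MulExpr {
    const L& lhs;
    const R& rhs;
    float operator[](std::size_t i) const { return lhs[i] * rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

struct Vec {
    std::vector<float> data;

    explicit Vec(std::size_t n, float value = 0.0f) : data(n, value) {}

    float operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning an expression evaluates the whole expression tree in one
    // loop, with no temporary vectors for intermediate results.
    template <typename E>
    Vec& operator=(const E& expr) {
        assert(expr.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data[i] = expr[i];
        return *this;
    }
};

// The operators only build expression nodes; no arithmetic happens here.
// The nodes hold references, so an expression must be assigned within the
// same statement that builds it (do not store it in an `auto` variable).
template <typename L, typename R>
AddExpr<L, R> operator+(const L& l, const R& r) { return AddExpr<L, R>{l, r}; }

template <typename L, typename R>
MulExpr<L, R> operator*(const L& l, const R& r) { return MulExpr<L, R>{l, r}; }

// Computes out[i] = a[i] * x[i] + y[i] in a single fused loop.
inline std::vector<float> fused_demo() {
    Vec a(4, 2.0f), x(4, 3.0f), y(4, 1.0f), out(4);
    out = a * x + y;
    return out.data;
}

}  // namespace lazy
```

Without the expression nodes, `a * x` would first produce a temporary vector that is then read again by the addition; deferring evaluation removes that extra pass over memory, which is the effect kernel fusion aims for on an accelerator.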

The SYCL ecosystem is rapidly expanding to address the needs of machine learning applications, delivering the ability to effectively accelerate performance, manage data, adapt algorithms to the hardware, and make use of fixed-function hardware. SYCL brings efficient computation to the core operations of machine learning workloads: algorithm-sensitive convolutions, matrix multiplication, component-wise operations, reductions, and CPU synchronization.

See the Khronos SYCL landing page for links to the specification, implementations, tutorials, and more.

Khronos members in the machine learning ecosystem

Khronos members are advancing the field of machine learning by working together to help define, develop, and refine industry-important standards for the benefit and enablement of the wider machine learning/neural network community. Members enjoy early access to draft specifications and network with industry leaders.

Become a Khronos member