
SYCL (pronounced ‘sickle’) is a royalty-free, cross-platform abstraction layer that:

  • Enables code for heterogeneous and offload processors to be written using modern ISO C++ (at least C++17).
  • Provides APIs and abstractions to find devices (e.g. CPUs, GPUs, FPGAs) on which code can be executed, and to manage data resources and code execution on those devices.

SYCL defines abstractions to enable heterogeneous device programming, an important capability in the modern world which has not yet been solved directly in ISO C++. SYCL has evolved with the intent of influencing C++ direction around heterogeneous compute by creating productized proof points that can be considered in the context of C++ evolution.

A major goal of SYCL is to enable different heterogeneous devices to be used in a single application — for example simultaneous use of CPUs, GPUs, and FPGAs. Although optimized kernel code may differ across the architectures (since SYCL does not guarantee automatic and perfect performance portability across architectures), it provides a consistent language, APIs, and ecosystem in which to write and tune code for accelerator architectures. An application can coherently define variants of code optimized for architectures of interest, and can find and dispatch code to those architectures.

SYCL uses generic programming with templates and generic lambda functions to enable higher-level application software to be cleanly coded with optimized acceleration of kernel code across an extensive range of acceleration backend APIs, such as OpenCL and CUDA.

Latest Specification: SYCL 2020

The SYCL 2020 Specification was launched on Feb 9th, 2021. The specification is publicly available so that developers and implementers can provide feedback ahead of the release of the SYCL 2020 Adopters Program, which will allow implementers to be officially conformant. Multiple toolchains now implement major parts of SYCL 2020. SYCL 2020 Revision 7 was released in April 2023.

SYCL 2020 represents a major step forward, featuring over 40 new additions and improvements, including:

  • Unified Shared Memory (USM), enabling code with pointers to work naturally without buffers or accessors
  • Parallel reductions, adding a built-in reduction operation that helps avoid boilerplate code and provides maximum performance on hardware with built-in reduction operations
  • Work group and sub-group algorithms, enabling efficient operations between work items
  • Class template argument deduction (CTAD) and deduction guides to enable simpler class template instantiation
  • Simplification of accessors, reducing the burden of boilerplate code and enabling simplified C++ patterns
  • Expanded interoperability with different backends, enabling support for backends other than OpenCL
  • Improvements to atomic operations to be closer to C++ atomics to enable more parallel programming freedom

“SYCL 2020’s primary goal is to achieve closer convergence with ISO C++, furthering our work to bring parallel heterogeneous programming to modern C++ through open standards. SYCL can leverage diverse processors to accelerate problems in many application domains including HPC, automotive, and machine learning,” said Michael Wong, Codeplay distinguished engineer, ISO C++ Directions Group and SYCL working group chair. “SYCL has a growing number of implementers and researchers working on real-world applications in markets ranging from supercomputing to embedded processing. The insights from that work, along with the feedback we collected from the SYCL 2020 provisional specification, have enabled the SYCL Working Group to deliver a feature-rich final specification that balances enhanced performance with backwards compatibility. I am excited by the simplicity and higher expressiveness offered by SYCL 2020 and we will continue to evolve SYCL to meet market needs.”

Industry Support for SYCL 2020

“ Our users will benefit from features in the SYCL 2020 specification. New features, such as support for unified memory (USM) and reductions, are important capabilities for programming high-performance-computing hardware. In addition, support for C++17 will allow our users to write better C++ code, with both language features (such as deduction guides) and library features (such as std::optional). Other new features (such as softening the requirements on kernel functions and sharing data between host and devices) are an important step for implementing backend support for SYCL in the Kokkos and RAJA performance portability ecosystems. ”

“ At Cineca, based on our experience, we confirm the value that SYCL is bringing to the development of high-performance computing in a hybrid environment. In fact, through SYCL, it is possible to build a common and portable environment for the development of computing-intensive applications to be executed on HPC architectures configured with floating point accelerators, which allows industries and scientific communities to share development tools, libraries of algorithms, and accumulated experience. Cineca is already running the distributed Celerity runtime on top of several SYCL implementations on the new Marconi100 cluster, ranked no. 11 in the Top500, providing users with a unified API for both about 4000 NVIDIA Volta V100 GPUs and IBM Power9 host processors. SYCL 2020 is a big step towards a much leaner API that unlocks all the potential provided by modern C++ standards for accelerated data-parallel kernels, making the development of large-scale scientific software easier and more sustainable, both for industry-oriented applications and for scientific domain-oriented applications. ”

“ Codeplay has been deeply involved in SYCL from its original definition and we are now enabling the standard on a range of systems with our ComputeCpp product. We strongly believe SYCL is the only software standard to link all the high performance processors to a unified programming solution. Developers will find that SYCL 2020 refines the standard to streamline their development and adds some crucial new enhancements to improve productivity. ”

“ Imagination recognises the benefit of SYCL across multiple markets. Our software stacks have been designed to improve SYCL performance, enabling a straightforward path to exploit the teraflops of compute performance in our latest IP. The ability to quickly port workloads from other proprietary APIs is a huge benefit, easing the transition from development on desktop to deployment on embedded systems. SYCL 2020 is a positive step forward for this API, enabling higher levels of performance, which will benefit developers and platform creators. ”

“ The SYCL 2020 final specification brings significant features to the industry that enable C++ developers to more productively build high-performance heterogeneous applications with unified programming across XPU architectures. Several capabilities pioneered in the open source oneAPI C++/DPC++ compiler, such as unified shared memory, group algorithms, and sub-groups, contributed to this community effort. Open, cross-architecture programming is required for accelerated distributed computing; we look forward to continuing our collaboration to address the needs of the developer ecosystem. ”

“ With thousands of users and a wide range of applications using NERSC’s resources, we must support a wide range of programming models. In addition to directive-based approaches, we see modern C++ language-based approaches to accelerator programming, such as SYCL, as an important component of our programming environment offering for users of Perlmutter. Further, this work supports the productivity of scientific application developers and users through performance portability of applications between Aurora and Perlmutter. ”

“ NSITEXE supports the SYCL 2020 technology, which is gaining attention in embedded applications. SYCL is very important to increase productivity by hiding complexities from users. We are considering adopting this technology in our next generation of IP platforms. ”

Hideki Sugimoto

“ For Renesas, SYCL is a key enabler for automotive ADAS/AD software developers that allows them to easily use the highly-efficient, heterogeneous accelerators of the R-Car SoC Series through the open Khronos standard. ”

“ We are excited about the extensive list of features and improvements released with the new SYCL 2020 specification. The API becomes terser and more developer friendly, while also introducing new ways for expert users to exercise fine-grained control over state-of-the-art hardware features. The move to a generalized backend model opens up new possibilities to integrate with existing legacy solutions, which is especially important in scientific research environments. As co-developers of the Celerity project, together with the University of Salerno, we are welcoming these changes and look forward to applying them within distributed-memory research and industry applications, for example as part of the recently launched EuroHPC LIGATE project. ”

“ Xilinx is excited about the progress achieved with SYCL 2020. This single-source C++ framework unifies host and device code for various kinds of accelerators in the same C++ program. With host-fallback device execution, developers can emulate device code on a CPU, exploring hardware-software co-design for adaptable computing devices. SYCL is now extensible via customizable back-ends, enabling device plug-ins for FPGAs and ACAPs. ”

Ralph Wittig
Fellow, Xilinx

SYCL Implementations in Development

SYCL Academy

The SYCL Academy repository provides materials that can be used for teaching SYCL. The materials are provided using the "Creative Commons Attribution Share Alike 4.0 International" license.

Visit SYCL Academy