SC20
November 17-18, 2020

The International Conference for High Performance Computing, Networking, Storage, and Analysis
Everywhere We Are | #MoreThanHPC

Khronos Sessions

HPC Application Development Using C++ and SYCL

Date and Time: November 9, 2020 | 10AM - 2:00PM EST
Presenters: Tim Mattson (Intel), Michael Wong (Codeplay), Ronan Keryell (Xilinx), Rod Burns (Codeplay), Aksel Alpay (Heidelberg University)

SYCL is a programming model that targets a wide variety of devices (CPUs, GPUs, FPGAs and more) from a single code base. SYCL supports a single-source style of programming from completely standard C++. With increasingly heterogeneous processor roadmaps, a platform-independent model such as SYCL is essential for software developers.

In this tutorial, we introduce SYCL. We start by building a solid foundation to help programmers gain mastery of this language. We then explore how SYCL can be used to write serious applications, covering intermediate to advanced features as well as some of the tools and libraries that support SYCL application development. The tutorial is constructed around mini-applications that represent key design patterns encountered by people who program heterogeneous systems. This keeps the tutorial grounded in practical knowledge that students can immediately apply to their own programming problems.

Performance Evaluation of the Vectorizable Binary Search Algorithms on an FPGA Platform

Date and Time: November 11, 2020 | 3:30PM - 3:40PM EST
Presenters: Zheming Jin (Argonne National Laboratory), Hal Finkel (Argonne National Laboratory)

Field-programmable gate arrays (FPGAs) are becoming promising heterogeneous computing components. Meanwhile, high-level synthesis (HLS) tools are moving FPGA-based development from the register-transfer level to a high-level-language design flow using Open Computing Language (OpenCL), C, and C++. The performance of binary search applications is often associated with irregular memory access patterns to off-chip memory. In this paper, we implement binary search algorithms using OpenCL and evaluate their performance on an Intel Arria 10-based FPGA platform. Based on the evaluation results, we implement the grid search in XSBench by vectorizing and replicating the binary search kernel. In addition, we overcome the overhead of kernel vectorization by grouping work-items into work-groups. Our optimizations improve the performance of the grid search using the classic binary search by a factor of 1.75 on the FPGA.
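
To make the idea of a vectorizable binary search concrete, here is a minimal sketch in plain C++. It is not the paper's OpenCL kernel and does not model the FPGA: the function names grid_search and grid_search_batch, the use of double values, and the clamping of out-of-range keys are all assumptions made for illustration. The first function is a classic scalar grid search; the second runs many lookups in lockstep with a fixed number of steps and no data-dependent branches, which is the property that makes the inner loop a candidate for vectorization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Classic binary search over a sorted grid (hypothetical helper).
// Returns the index i of the interval with grid[i] <= key < grid[i + 1],
// clamped to [0, n - 2] for keys outside the grid.
std::size_t grid_search(const std::vector<double>& grid, double key) {
    assert(grid.size() >= 2);
    std::size_t lo = 0;
    std::size_t hi = grid.size() - 1;
    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (grid[mid] > key) {
            hi = mid;
        } else {
            lo = mid;
        }
    }
    return lo;
}

// Lockstep variant (hypothetical helper): every key performs the same number
// of steps, and each step uses selects instead of data-dependent branches.
std::vector<std::size_t> grid_search_batch(const std::vector<double>& grid,
                                           const std::vector<double>& keys) {
    assert(grid.size() >= 2);
    std::vector<std::size_t> lo(keys.size(), 0);
    std::vector<std::size_t> hi(keys.size(), grid.size() - 1);

    // 'span' is an upper bound on every lane's interval width; halving it
    // (rounding up) gives a step count that is the same for all keys.
    for (std::size_t span = grid.size() - 1; span > 1; span = (span + 1) / 2) {
        // This inner loop over independent keys is the vectorizable part.
        for (std::size_t l = 0; l < keys.size(); ++l) {
            const std::size_t mid = lo[l] + (hi[l] - lo[l]) / 2;
            const bool go_left = grid[mid] > keys[l];
            hi[l] = go_left ? mid : hi[l];
            lo[l] = go_left ? lo[l] : mid;
        }
    }
    return lo;
}
```

Calling grid_search_batch(grid, keys) returns one interval index per key, matching what grid_search returns for each key individually. Lanes that converge early simply repeat a step that leaves their result unchanged, which trades a little redundant work for uniform control flow.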

Evaluating FPGA Accelerator Performance with a Parameterized OpenCL Adaptation of Selected Benchmarks of the HPCChallenge Benchmark Suite

Date and Time: November 13, 2020 | 10:35AM - 11:05AM EST
Speakers: Marius Meyer, Tobias Kenter, Christian Plessl

We have developed an OpenCL-based open-source implementation of the HPCC benchmark suite for Xilinx and Intel FPGAs. This benchmark can serve to analyze the current capabilities of FPGA devices, cards, and development tool flows, track progress over time, and point out specific difficulties for FPGA acceleration in the HPC domain. Additionally, the benchmark documents proven performance optimization patterns. We will continue optimizing and porting the benchmark for new generations of FPGAs and design tools and encourage active participation to create a valuable tool for the community.

OpenCL-enabled Parallel Raytracing for Astrophysical Application on Multiple FPGAs with Optical Links

Date and Time: November 13, 2020 | 6PM - 6:25PM EST
Speakers: Norihisa Fujita, Ryohei Kobayashi, Yoshiki Yamaguchi, Taisuke Boku, Kohji Yoshikawa, Makito Abe, Masayuki Umemura

In earlier work, we optimized the Authentic Radiative Transfer (ART) method, which solves space radiative transfer problems in early-universe astrophysical simulation, on Intel Arria 10 FPGAs. In this paper, we optimize it for the latest FPGA, the Intel Stratix 10, and evaluate its performance against a GPU implementation on multiple nodes. For multi-FPGA computing and communication, we apply our original framework, the Communication Integrated Reconfigurable CompUting System (CIRCUS), which enables OpenCL-based programming that uses multiple optical links on the FPGAs for parallel FPGA processing. This is the first implementation of a real application over CIRCUS.

The oneAPI Software Abstraction for Heterogeneous Computing

Date and Time: November 17, 2020 | 10AM - 11:30AM EST
Moderator: Sujata Tibrewala (Intel)
Panelists: Rafael Asenjo (University of Malaga, Spain), Erik Lindahl (Stockholm University), Xiaozhu Meng (Rice University), Michael Wong (Codeplay), David Hardy (University of Illinois), Maria Garzaran (Intel)

oneAPI is a cross-industry, open, standards-based unified programming model. The oneAPI specification extends existing developer programming models to enable a diverse set of hardware through a language, a set of library APIs, and a low-level hardware interface that together support cross-architecture programming. It builds upon industry standards and provides an open, cross-platform developer stack to improve productivity and innovation. At the core of oneAPI is the DPC++ programming language, which builds on the ISO C++ and Khronos SYCL standards. DPC++ provides explicit parallel constructs and offload interfaces to support a broad range of accelerators. In addition to DPC++, oneAPI also provides libraries for compute- and data-intensive domains, e.g., deep learning, scientific computing, video analytics, and media processing. Finally, a low-level hardware interface defines a set of capabilities and services that allow a language runtime system to effectively utilize a hardware accelerator.

Khronos SYCL 2020 Release and ISO C++ 20 status and future directions

Date and Time: November 19, 2020 | 10AM - 11:15AM EST
Panelists: Michael Wong (Codeplay) and Simon McIntosh-Smith (University of Bristol)

SYCL is an open standard with a new release planned, and C++20 is also being released in 2020. Following the successful ISO C++ for HPC BoF and SYCL BoF at SC17, SC18, and SC19, and with the increasing use of C++ in HPC, there was popular demand for updates on the new SYCL 2020 and C++20 features. SYCL is a vendor-neutral way to write ISO C++ that embraces heterogeneous parallelism, especially on ECP's Aurora exascale supercomputer. In this BoF, we have combined the SYCL and C++ BoFs so that C++ and SYCL experts can explain the new features in SYCL 2020 and C++20 that are relevant to HPC.