Vice President of Research and Development at Codeplay Software
Michael Wong is the Vice President of R&D at Codeplay Software and Chair of SYCL, the heterogeneous programming language for C++. Michael has extensive experience in parallel computing and high-performance computing.
Computer versus the Human Brain in Machine Learning
Inside a human brain, there are about 10 billion neurons, each of which can connect to 10,000 others, and from these connections comes … everything. The human brain allows us to compose symphonies, create beautiful works of art, navigate our world, probe the universe, and invent technologies that can do amazing things. Now some of that technology is aimed at replicating the brain that created it: artificial intelligence. But has it even come close to what our brains can do? For ages, computers have done impressive things: they cracked codes, mastered chess, and operated spacecraft. But in the last few years, something has changed. Suddenly computers are doing things that seem much more human. Today, computers can see, understand speech, and even write poetry. How is all this possible, and how far will it go? Can we actually build a machine that is as smart as we are? One that can imagine, create, and even learn on its own? How would a machine like that change society, and, more importantly, how would it change us? In this talk, we ask the question:
Can we build a brain? And if we could, should we?
State-of-the-art machine learning systems typically depend on energetically costly gradient-descent learning over a curated, task-specific data set. Despite their successes, these methods are not well suited to building fully autonomous systems such as the energy-efficient accelerators targeted by OpenCL. By contrast, the brain uses low-energy local learning rules to discover the causal structure of an environment, forming semantically rich representations without supervision, and therefore exhibiting the required combination of efficiency and flexibility. To investigate these properties, a paradigm shift to dynamic "spike-based" computation is required. Historically, investigating spiking neural models has been a task for specialists, with software that is tailored to specific scientific projects or that trades flexibility against performance. Here, we present Neurosycl, a high-performance, portable spiking network simulator based on SYCL, with a modern and extensible C++ API. Our aim is to provide the necessary components for non-specialists to build a simulated brain, and to run the constructed models as close to real time as possible.
Programming Models for Self-Driving Cars with SYCL and Heterogeneous C++
When writing software to deploy deep neural network inferencing, developers face an overwhelming range of options, from a custom-coded implementation of a single model to a deep learning framework like TensorFlow or Caffe. If you write your own implementation, how do you balance the competing needs of performance, portability, and capability? If you use an off-the-shelf framework, how do you get good performance? Codeplay has been building and standardizing developer tools for GPUs and AI accelerators for over 15 years.
This talk will explore the approaches available for implementing deep neural networks in software, from the low-level details of mapping software onto the highly parallel processors needed for AI all the way up to the major AI frameworks. It will start with the LLVM compiler chain used to compile for most GPUs, move through the OpenCL, HSA, and SYCL programming standards (including how they compare with CUDA), and finish with TensorFlow and Caffe, examining how each layer affects key metrics such as performance.
SYCL is a heterogeneous C++ language that provides the building blocks for such C++ libraries, bridging the gap between hardware-agnostic C++ features and C++ abstractions of the hardware's features. Codeplay's SYCL implementation, ComputeCpp, is available as a free-to-download Community Edition to help you build higher-level abstractions for neural networks and machine vision, all leading to the ability to program self-driving cars.
As Chair of the C++ Standard committee's SG14, where gamers, financial traders, and embedded-device programmers have been demanding a heterogeneous programming model, I have been studying programming models whose lessons could enable a future ISO C++ to support heterogeneous devices. There are many of them: my search has taken me through SYCL, HPX, Agency, HCC, OpenMP, OpenACC, OpenCL, C++ AMP, Halide, CUDA, Kokkos, Raja, and many others. Yet as performance and power efficiency become the holy grail of modern C++ applications, the hardware solutions that deliver them differ greatly in architectural decisions and designs. The combination of CPUs, GPUs, FPGAs, and custom domain-specific hardware is gaining a lot of momentum. In view of this, C++ programming techniques and features are changing as well: modern C++ standards are enabling more and more parallelism and heterogeneity in both library and language features. This talk will compare many of the most popular models in terms of their memory model, data movement, and execution abstraction.