Heterogeneous-Compute Interface for Portability (HIP) is a runtime API and a conversion tool to help make CUDA programs more portable. It was originally contributed by AMD to the open source community with the intention to ease the effort of making CUDA applications also work on AMD’s ROCm platform.
While AMD and NVIDIA share the vast majority of the discrete GPU market, it is useful to make this “CUDA portability enhancement route” available to an even wider set of platforms. Since the Khronos OpenCL standard remains the most widely adopted cross-platform heterogeneous programming API/middleware, it is interesting to study whether HIP could be ported on top of it, potentially expanding its scope to all OpenCL-supported devices. We in the Customized Parallel Computing group at Tampere University, Finland, are happy to announce that we have worked on such a tool, known as HIPCL, for some time, and that it is now published and available on GitHub.
The first release of HIPCL is a proof-of-concept, but is already useful for end-users. It can run most of the CUDA examples in the HIP repository and the list of supported CUDA applications will grow steadily as we add new features.
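To give a feel for what the "conversion tool" half of HIP does, here is a minimal sketch of source-level renaming from CUDA runtime names to HIP runtime names. It is an illustration only: the `CUDA_TO_HIP` table and the `toy_hipify` function are hypothetical names made up for this example, the table covers a handful of calls, and the real converter handles far more of the API than a name substitution like this one.

```python
import re

# Hypothetical, deliberately tiny subset of the CUDA -> HIP name mapping.
# The HIP names on the right are real HIP runtime identifiers; the table and
# the function below are illustrative and are not the actual HIP conversion tool.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Longest names first so cudaMemcpyHostToDevice is not matched as cudaMemcpy.
_PATTERN = re.compile(
    "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
)


def toy_hipify(source: str) -> str:
    """Replace the known CUDA runtime names with their HIP counterparts."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(0)], source)


# A small fragment of CUDA host code, used only as input text here.
cuda_snippet = """#include <cuda_runtime.h>
cudaMalloc(&d_a, bytes);
cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
cudaFree(d_a);
"""

print(toy_hipify(cuda_snippet))
```

Because the HIP runtime mirrors the CUDA runtime closely, much of a port is this kind of mechanical renaming. A runtime such as HIPCL then supplies the `hip*` functions on top of OpenCL.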
Hands On OpenCL is a two-day lecture course introducing OpenCL, the API for writing heterogeneous applications. It provides slides for around twelve lectures, plus some appendices, complete with examples and solutions in C, C++ and Python. The lecture series finishes with information on porting CUDA applications to OpenCL.
HPC programmers who are tired of managing low-level details when using OpenCL or CUDA to write general-purpose applications for GPUs (GPGPU) may be interested in Harlan, a new declarative programming language designed to mask the complexity and eliminate errors common in GPGPU application development. The idea with Harlan is to keep developers focused on the high-level HPC programming challenge at hand, instead of getting bogged down in the nitty-gritty details of GPU development and optimization. Harlan’s syntax is based on Scheme, and the language compiles to Khronos Group’s OpenCL.
At the International Broadcasting Convention 2011, NVIDIA introduced NVIDIA GPUDirect for Video. This technology enables application developers to deliver higher quality, more realistic on-air graphics, or to take faster advantage of the parallel processing power of the GPU for image processing. It does this by permitting industry-standard video I/O devices to communicate directly with NVIDIA professional Quadro and Tesla graphics processing units (GPUs) at ultra-low latency. Nick Rashby, President of AJA Video Systems, says: “this will allow developers whose apps support AJA video I/O products to take better advantage of the power of NVIDIA Quadro and Tesla GPUs, resulting in low-latency access for both graphics compositing and general purpose processing using CUDA or OpenCL, with all the I/O and performance they depend on from AJA.”
Talks by Bryan Catanzaro of NVIDIA Research discuss the CUDA and OpenCL programming models, GPU architecture, and how to write high-performance code on GPUs, illustrated with case studies from application domains such as image and video processing.
Glare Technologies have announced the release of a new version of their flagship rendering product: Indigo Renderer version 3.0, which now includes support for both OpenCL and CUDA. Indigo is an unbiased, physically based and photo-realistic renderer which simulates the physics of light to achieve near-perfect image realism. With an advanced physical camera model, a super-realistic materials system and the ability to simulate complex lighting situations through Metropolis Light Transport, Indigo is capable of producing the highest levels of realism demanded by architectural and product visualisation.
Through interactive, hands-on sessions, students learn about GPU hardware and GPU languages, and discover how best to take advantage of GPUs for their computational needs. The course covers programming in both OpenCL and CUDA, pointing out the similarities and differences along the way. Topics include both the core languages and extensions, including those for double precision and for interfacing with OpenGL 3D graphics buffers.
SiSoftware has posted two OpenCL benchmarks online. One addresses GPGPU OpenCL performance, and the second CPU OpenCL performance. The conclusion: There is no reason not to port CUDA code to OpenCL now!
NVIDIA has released their CUDA Toolkit 3.2. There is lots of new goodness in this version; of special note is the new OpenCL support. This means you can now use one toolkit for both CUDA and OpenCL. Support is currently only for Linux and Windows.
VDPAU (Video Decode and Presentation API for Unix) allows Linux systems to offload portions of video decode to the GPU. The resulting video can be post-processed with OpenGL, CUDA, or both. Watch Stephen Warren from NVIDIA explain VDPAU and demonstrate OpenGL texturing of hardware-decoded video frames. Slides (PDF) are also available.
Enj appears to be enjoying the GTC 2010 Conference this week. He brings us an inside view of the conference and a feel for the different talks on OpenCL and CUDA. If you have 5 minutes, pop over to enja.org; it’ll be worth your time.
This work uses the OpenCL framework to accelerate an EMRI modeling application on two hardware accelerators: the Cell BE and a Tesla CUDA GPU. The main goal of the work is to evaluate an emerging computational platform, OpenCL, for scientific computation. Results show the OpenCL binary to be on a par with the CUDA SDK. The baseline is an AMD Phenom 2.5 GHz CPU.