The NVIDIA graphics driver for Windows, version 378.66, now offers some OpenCL 2.0 support. From the release notes: "New features in OpenCL 2.0 are available in the driver for evaluation purposes only." Known issues include the following. The current implementation is limited to 64-bit platforms. OpenCL 2.0 allows kernels to be enqueued with a global_work_size larger than the compute capability of the NVIDIA GPU, but the current implementation supports only combinations of global_work_size and local_work_size that are within the compute capability of the NVIDIA GPU. For executing kernels (whether from the host or the device), OpenCL 2.0 supports non-uniform ND-ranges, where global_work_size does not need to be divisible by local_work_size; this capability is not yet supported in the NVIDIA driver, and is therefore also not supported for device-side kernel enqueues.
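Since non-uniform ND-ranges are not yet supported, a common way to stay within the limitation is to pad global_work_size up to the next multiple of local_work_size on the host and have the kernel ignore the extra work-items with a bounds check. The following is a minimal sketch of that host-side arithmetic only, not something taken from the NVIDIA release notes; the function names `padded_global_size` and `padded_nd_range` are hypothetical, and no OpenCL API is called.

```python
# Hypothetical helpers: pad an ND-range so that each global size is a
# multiple of the corresponding local size (a uniform ND-range).

def padded_global_size(work_items: int, local_size: int) -> int:
    """Round work_items up to the next multiple of local_size."""
    if work_items <= 0 or local_size <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division, then scale back up to a multiple.
    return ((work_items + local_size - 1) // local_size) * local_size


def padded_nd_range(work_items, local_sizes):
    """Apply the padding independently to each dimension."""
    if len(work_items) != len(local_sizes):
        raise ValueError("dimension mismatch")
    return tuple(
        padded_global_size(n, l) for n, l in zip(work_items, local_sizes)
    )


if __name__ == "__main__":
    # 1000 work-items with work-groups of 64 need 16 groups.
    print(padded_global_size(1000, 64))             # 1024
    # A 1920x1080 image with 16x16 work-groups pads only the height.
    print(padded_nd_range((1920, 1080), (16, 16)))  # (1920, 1088)
```

With a padded range, the kernel is assumed to receive the true problem size as an argument and return early for any work-item whose global ID falls outside it.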
Today Intel announced record results on a new benchmark in deep learning and convolutional neural networks (CNN). The test took place in Nanjing City, China, where ZTE’s engineers used Intel’s midrange Arria 10 FPGA for a cloud inferencing application using a CNN algorithm. The benchmark was achieved on a server with 4S Intel Xeon E5-2670v3 processors running at 2.30GHz and 128GB DDR4, together with an Intel PSG Arria 10 FPGA Development Kit with one 10AGX115 FPGA, 4GB DDR4 SODIMM, Intel Quartus Prime and OpenCL SDK v16.1. Besides the impressive increase in performance, the team at the ZTE Wireless Institute shortened design time by using the OpenCL programming language.
Only 10 days left to get your OpenCL abstracts and papers in for IWOCL 2017 in Toronto.
Imagination Technologies announced PowerVR ‘Series8XE Plus’ GPUs this week. The Series8XE Plus GPUs include support for the latest APIs including OpenGL ES 3.2, OpenCL 1.2, Vulkan 1.0 and OpenVX 1.1.
FotoNation Limited and VeriSilicon Holdings Co., Ltd have entered into an agreement to jointly develop a next-generation image processing platform that offers best-in-class programmability, power, performance and area for computer vision (CV), computational imaging (CI) and deep learning. The market-ready IP platform, named IPU 2.0, will be available for customer license and design in the first quarter of 2017. IPU 2.0 offers a unified programming environment and pre-integrated imaging features for a wide range of applications across surveillance, automotive, mobile, IoT and more. IPU 2.0 will use open initiatives such as OpenVX and OpenCL.
Khronos made videos of three presentations from Codeplay at the Khronos booth. The videos cover "Heterogeneous C++ dispatch: Comparing SYCL to HPX, Kokkos, & Raja", "Khronos SYCL Parallel STL Open-source Project" and "Getting Your Hands on SYCL Using the ComputeCpp Community Edition".
Khronos issued a Request For Quote (RFQ) back in September 2016 to enhance and expand the existing OpenCL 2.1 conformance tests into an OpenCL 2.2 test suite that will define conformance for OpenCL 2.2 implementations. The contract has been awarded to StreamComputing, a software consultancy specializing in performance-tuned software development for CPUs, GPUs and FPGAs. Many of their clients hire them for their OpenCL expertise.
Amazon recently announced a developer preview of their new F1 instance. Equipped with Intel Broadwell E5 2686 v4 processors (2.3 GHz base speed, 2.7 GHz Turbo mode on all cores, and 3.0 GHz Turbo mode on one core), up to 976 GB of memory, up to 4 TB of NVMe SSD storage, and one to eight FPGAs, the F1 instances provide you with plenty of resources to complement your core, FPGA-based logic. The specs on the Xilinx FPGA are: Xilinx UltraScale+ VU9P fabricated using a 16 nm process; 64 GiB of ECC-protected memory on a 288-bit wide bus (four DDR4 channels); Dedicated PCIe x16 interface to the CPU; Approximately 2.5 million logic elements; Approximately 6,800 Digital Signal Processing (DSP) engines; Virtual JTAG interface for debugging.
Phoronix has published benchmarks of 13 Kepler/Maxwell/Pascal NVIDIA GeForce graphics cards tested with Blender 2.78's OpenCL renderer. Unfortunately, there are no AMD OpenCL benchmarks for Blender yet: the current open-source stack will not work until ROCm OpenCL support comes into play, and the AMDGPU-PRO stack did not work with Blender's OpenCL renderer, falling back to CPU rendering instead. Read the complete article.
The Mali Graphics Debugger allows developers to trace Vulkan (1.0), OpenGL ES (1.x, 2.x, and 3.x), EGL (1.4), and OpenCL (1.x) API calls in their application and understand frame-by-frame the effect on the application to help identify possible issues.