Today Intel announced record results on a new benchmark in deep learning and convolutional neural networks (CNN). The test took place in Nanjing City, China, where ZTE's engineers used Intel's midrange Arria 10 FPGA for a cloud inferencing application using a CNN algorithm. The benchmark was achieved on a server with 4S Intel Xeon E5-2670v3 processors running at 2.30GHz and 128GB DDR4, together with an Intel PSG Arria 10 FPGA Development Kit (one 10AGX115 FPGA, 4GB DDR4 SODIMM), Intel Quartus Prime, and OpenCL SDK v16.1. Besides the impressive increase in performance, the team at the ZTE Wireless Institute shortened design time by using the OpenCL programming language.
FotoNation Limited and VeriSilicon Holdings Co., Ltd have entered into an agreement to jointly develop a next-generation image processing platform that offers best-in-class programmability, power, performance and area for computer vision (CV), computational imaging (CI) and deep learning. The market-ready IP platform, named IPU 2.0, will be available for customer license and design in the first quarter of 2017. IPU 2.0 offers a unified programming environment and pre-integrated imaging features for a wide range of applications across surveillance, automotive, mobile, IoT and more. IPU 2.0 will use open initiatives such as OpenVX and OpenCL.
Khronos has published videos of three presentations given by Codeplay at the Khronos booth. The videos cover "Heterogeneous C++ dispatch: Comparing SYCL to HPX, Kokkos, & Raja", "Khronos SYCL Parallel STL Open-source Project" and "Getting Your Hands on SYCL Using the ComputeCpp Community Edition".
Khronos issued a Request For Quote (RFQ) in September 2016 to enhance and expand the existing OpenCL 2.1 conformance tests into an OpenCL 2.2 test suite, which will be used to define conformance for OpenCL 2.2 implementations. The contract has been awarded to StreamComputing, a software consultancy specializing in performance-tuned software development for CPU, GPU and FPGA. Many of their clients hire them for their OpenCL expertise.
Amazon recently announced a developer preview of their new F1 instance. Equipped with Intel Broadwell E5 2686 v4 processors (2.3 GHz base speed, 2.7 GHz Turbo mode on all cores, and 3.0 GHz Turbo mode on one core), up to 976 GB of memory, up to 4 TB of NVMe SSD storage, and one to eight FPGAs, the F1 instances provide you with plenty of resources to complement your core, FPGA-based logic. The specs on the Xilinx FPGA are: Xilinx UltraScale+ VU9P fabricated using a 16 nm process; 64 GiB of ECC-protected memory on a 288-bit wide bus (four DDR4 channels); Dedicated PCIe x16 interface to the CPU; Approximately 2.5 million logic elements; Approximately 6,800 Digital Signal Processing (DSP) engines; Virtual JTAG interface for debugging.
Phoronix has published benchmarks of 13 Kepler/Maxwell/Pascal NVIDIA GeForce graphics cards running Blender 2.78's OpenCL renderer. Unfortunately, there are no AMD OpenCL benchmarks for Blender yet: the current open-source stack will not work until ROCm OpenCL support arrives, and the AMDGPU-PRO stack did not work with Blender's OpenCL renderer, falling back to CPU rendering instead.
The Mali Graphics Debugger allows developers to trace Vulkan (1.0), OpenGL ES (1.x, 2.x, and 3.x), EGL (1.4), and OpenCL (1.x) API calls in their application and to understand, frame by frame, the effect of those calls on the application, which helps identify possible issues.
Codeplay is helping ensure that software developers are properly equipped to host their software applications on RISC-V. Codeplay is working extensively with providers of machine learning solutions, such as Google with TensorFlow, to bridge the gap on RISC-V with the OpenCL and SYCL open standards.