After the success of the first seven entries in the ShaderX book series, GPU Pro, and the soon-to-be-released GPU Pro 2, we are looking for authors for GPU Pro 3. The upcoming book will cover advanced rendering techniques that run on the DirectX or OpenGL run-times, or any other run-time with any language available. It will include topics on Geometry Manipulation; Rendering Techniques; Handheld Devices Programming; Effects in Image Space; Shadows; 3D Engine Design; Graphics Related Tools; and Environmental Effects, plus a dedicated section on General Purpose GPU Programming that will cover CUDA, DirectCompute and OpenCL examples. Proposals are due by March 17th, 2011. Contact details, an example proposal, writing guidelines and a FAQ can be downloaded from gpupro3.blogspot.com.
The nopper.tv website now contains one more OpenGL 3.3 example and one OpenGL 4.0 example. The OpenGL 4.0 example is a simple tessellation implementation. There are now 13 different OpenGL source code examples available.
Researchers from the University of Warwick’s Performance Computing and Visualization Department and Oxford University’s eResearch Centre have put together a study asking: should we install a GPGPU-based system or a more traditional IBM Blue Gene-like supercomputer? The team will present their work at the 1st International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems at the SC10 conference in New Orleans. The Khronos Group will be at Booth #1132 at SC10.
Currently there are several ways to feed data to the GPU, regardless of which API we use and what type of application we develop. In the case of OpenGL we have uniform buffers, texture buffers, texture images, etc. The same is true for OpenCL and other compute APIs, which provide even more fine-grained memory management by taking advantage of the local data store (LDS) available on today’s hardware. In this article I’ll present the memory access performance characteristics of AMD’s Evergreen-class GPUs, focusing on what this all means from an OpenGL point of view. While most of the data is about the HD5870, the general principles and relative performance characteristics are valid for other GPUs, including ones from other vendors.
Only a few days after releasing Catalyst 10.10a, AMD has released the Catalyst 10.10c Hotfix with beta support for OpenGL 4.1. NVIDIA also recently released drivers for OpenGL 4.1. See how each driver does on G-Truc's website.
What would you have if you put over 7,000 NVIDIA Tesla GPUs together? Chances are you would have the world's fastest computer. China's National University of Defense Technology has put together 7,168 NVIDIA Tesla M2050 (Fermi) GPUs, 14,336 CPUs, 262TB of memory and 2PB of storage, giving them the world's fastest supercomputer with a Linpack performance of 2.5 petaflops. Peak performance of the new Tianhe-1A supercomputer is 4.7 petaflops. Oh, did I mention it draws 4 megawatts of power? The Tianhe-1A supercomputer means the US has lost its top spot in the TOP500.
Rob Farber has an in-depth two-part tutorial on OpenCL. The first part of the tutorial will get you going using the ATI Stream software development kit (SDK). Part two covers memory spaces and the OpenCL memory hierarchy, as well as how to start thinking in terms of work items and work groups. Both parts contain lots of code examples to help the novice OpenCL programmer get started.