User Wish List
This page contains users' wish lists for features and functionality in future versions of OpenGL. It should not in any way be taken as an endorsement by the ARB, nor should it be assumed that any future version of OpenGL will have these or anything like them. The order of these features is arbitrary.
GPU-specific off-screen context creation
Description: The ability to create rendering contexts for different GPUs, particularly for off-screen rendering. GL_AMD_GPU_association might be a good example for this.
Benefit: Improved parallelism.
Offline GLSL shader compilation.
Description: A tool or library for offline compilation of GLSL shaders. This would compile into a standardized bytecode-like format (like D3D does). A standardized cross-platform shader binary format would be needed, related to GL_ARB_get_program_binary.
Benefit: A quality offline compiler that does a known set of optimizations would remove a big failure point in existing OpenGL implementations (you never know if the GLSL compiler in the driver has parsing bugs, or whether it performs common cross-platform optimizations at all). On some platforms (mostly mobile), having a full, robust compiler in the driver could even be prohibitive (a megabyte or so of code).
Description: This is a synthesis of several proposals, but the basic idea is to be able to do things that don't directly involve rendering in a separate thread. Compiling/linking programs, uploading textures, etc.
Benefit: Better performance.
Description: A comprehensive OpenGL conformance test. One that can be run on various OpenGL implementations to test for driver bugs and the like.
Benefit: A benchmark for IHVs to work towards in making sure their drivers do not have certain bugs.
Description: A profile of OpenGL that exposes debugging information, such as logs of calls and so forth.
Benefit: Makes OpenGL easier to debug.
Update: The GL_ARB_debug_output extension implements debug information.
Description: Exposing various performance metrics through the API, possibly as a profiling... em, profile. Specifically, providing at least the information that D3D10's profiling functionality provides.
Benefit: Profiling for improved performance.
Direct State Access
Description: Something not entirely unlike the EXT_direct_state_access extension. It would allow users to use objects without binding them to the context.
Benefit: The biggest benefit is getting past the 32 texture limitation imposed by the way texture objects are attached to programs. It would also make Vertex Array Objects a lot more intuitive and less confusing to understand.
Promote EXT_texture_filter_anisotropic and EXT_texture_compression_s3tc to core.
Description: See above.
Benefit: I guess it makes some people feel better about using these enums to not write "EXT" at the end.
D3D11-style command buffers.
Description: The ability to queue up a sequence of rendering commands and execute them in a single go. Sort of like display lists, but limited to actually rendering things rather than setting state.
Benefit: Presumably faster rendering of set pieces of geometry.
Bindless graphics stuff.
Description: Implement some of the concepts in the various bindless graphics extensions. Or if not that, then improve/fix whatever it is in VAOs and/or VBOs that makes their cache performance so much worse than bindless graphics.
Benefit: Performance, it would seem.
Description: The ability to write shaders for blend operations.
Benefit: Non-linear color encoding with proper blending, shadow mapping improvements, etc.
Description: The ability to designate that the storage for some objects does not need to be preserved (per APPLE_object_purgeable). This may combine with bindless/VAR fixes (i.e. locking buffers), the totality of which gives the application significant control over how memory gets used.
Benefit: Improved performance.
Opaque object types
Description: ARB_sync uses a pointer type, the first OpenGL object to do so. If the direct state access thing is going to go through, that will create a lot of new APIs in and of itself. You may as well add the ability to convert a GLuint into an object pointer while you're at it. The DSA functions would operate only on pointers, while the binding functions would need analogs that also take pointers.
Benefit: Better 64-bit support. Possibly faster binding and rendering.
Program state separation
Description: Some mechanism to fully separate a program object's data from the state for that object. Right now, UBOs allow you to separate a program from most of its state, but texture state remains fixed.
Benefit: Faster state changes when using the same program data.
on device image copies
Benefit: Orthogonality and efficiency
write to specific samples within a shader
Description: ARB_texture_multisample (core in 3.2) allows fetching a specific sample from a multisampled buffer. It would be nice if there was a way to write to specific samples from within the shader.
Benefit: Potentially fewer passes when anti-aliasing in a deferred renderer.
glGetString( GL_DRIVER_VERSION )
Description: Returns a string which, for the vendor returned by glGetString( GL_VENDOR ), uniquely identifies the exact driver version in use, e.g. "18.104.22.16838".
Benefit: When an application fails on an end user's machine, the graphics driver version can be written to the log file. The application can also detect a particular driver version that contains a known bug, and either work around it or ask the user to upgrade.
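As an illustration of the second benefit, here is a minimal C sketch of how an application might compare such a string against a known-bad version. It assumes the string is a well-formed, dot-separated sequence of decimal fields, as in the example above. GL_DRIVER_VERSION does not exist in OpenGL, and compare_driver_versions is a hypothetical helper, not part of any API.

```c
#include <stdlib.h>

/* Hypothetical helper: compares two dotted decimal version strings
 * field by field. Returns -1, 0 or 1 when a is older than, equal to,
 * or newer than b. Missing trailing fields are treated as 0.
 * Assumes well-formed input such as "8.17.10". */
int compare_driver_versions(const char *a, const char *b)
{
    while (*a != '\0' || *b != '\0') {
        char *end_a;
        char *end_b;
        unsigned long field_a = strtoul(a, &end_a, 10);
        unsigned long field_b = strtoul(b, &end_b, 10);

        if (field_a != field_b)
            return field_a < field_b ? -1 : 1;

        /* Neither string advanced: unparsable input, stop comparing. */
        if (end_a == a && end_b == b)
            return 0;

        /* Skip the '.' separator if one follows the field. */
        a = (*end_a == '.') ? end_a + 1 : end_a;
        b = (*end_b == '.') ? end_b + 1 : end_b;
    }
    return 0;
}
```

An application could call this with the string from the proposed query and the first driver version known to contain a fix, and fall back to a workaround when the result is negative.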
Vertical Sync event
Description: An event object that is set by VSYNC and cleared when read.
Benefit: SwapBuffers currently does 3 things: it flushes the command queue to the GPU, adds a command to the command queue that swaps the front and back buffers, and optionally waits for VSync. A VSync event allows OpenCL or FBO rendering tasks to be performed between the SwapBuffers call and the VSync, i.e. when the GPU would otherwise just be waiting.
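The "set by VSYNC and cleared when read" behaviour can be sketched in C11 with an atomic flag. This only illustrates the requested semantics; vsync_event and its functions are hypothetical names, and nothing here corresponds to an existing OpenGL or window-system API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical event object, not part of any OpenGL API. */
typedef struct {
    atomic_bool signaled;
} vsync_event;

void vsync_event_init(vsync_event *ev)
{
    atomic_init(&ev->signaled, false);
}

/* Would be called by the driver or display system when VSync occurs. */
void vsync_event_signal(vsync_event *ev)
{
    atomic_store(&ev->signaled, true);
}

/* Returns true if a VSync occurred since the last read, and clears
 * the event in the same atomic step. */
bool vsync_event_read(vsync_event *ev)
{
    return atomic_exchange(&ev->signaled, false);
}
```

Using a single atomic exchange for the read means a VSync that arrives between "check" and "clear" cannot be lost.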
Dropped frame counter.
Description: A query object that returns the number of VSyncs that occurred between the last two SwapBuffers calls.
Benefit: Allows detection of dropped frames so that the application can reduce the workload on the GPU.
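A minimal C sketch of how an application might interpret such a counter, assuming that with VSync enabled exactly one VSync between consecutive swaps means no frame was dropped. The query object itself is hypothetical, as are the function names and the idea of an integer "quality level".

```c
/* vsyncs_between_swaps: the value the hypothetical query object
 * would return. Assumes one VSync per swap means no drop. */
unsigned dropped_frames(unsigned vsyncs_between_swaps)
{
    return vsyncs_between_swaps > 1 ? vsyncs_between_swaps - 1 : 0;
}

/* Lowers an application-defined quality level by one step when a
 * frame was dropped, never going below zero. */
int adjust_quality_level(int level, unsigned vsyncs_between_swaps)
{
    if (dropped_frames(vsyncs_between_swaps) > 0 && level > 0)
        return level - 1;
    return level;
}
```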
Prioritised command queues.
Description: The frame-buffer rendering context is given priority over FBO rendering and OpenCL tasks. FBO and OpenCL command queues are executed when the GPU has executed SwapBuffers in the primary context and is waiting for VSync. When VSync occurs and the front/back buffers are swapped, the currently executing command queue is suspended as soon as possible, and commands waiting in the primary context are executed.
Benefit: Prevents dropped frames from causing jittery animation. Using an old shadow map for one frame is much less noticeable than the entire screen freezing. This also ensures that the GPU does not stall due to lack of work, which would be likely if using a VSync event object to try to control the GPU workload with the CPU.
Description: Some kind of immutable objects to replace display lists for things like macro objects holding a set of states. Something like a "blend object", "test object", or "rasterizer object", that kind of thing, which is really efficient to do!
It could be either a single generic immutable state object or dedicated objects ... It replaces display lists in the cases where they are used as immutable macro objects. I'm pretty sure that, driver-wise, it's a good idea too.
Benefit: Simplifies state management by the OpenGL software. Replaces deprecated display lists when used for such purposes. Simplifies drivers?
Depth range equation
Description: glDepthRangeEquation(GLclampf a, GLclampf b), where the final Zw = a + b * Zd and a + b = 1. For DX, a = 0 and b = 1. For GL it's business as usual, with a = (n+f)/2 and b = (f-n)/2 (usually a = 1/2 and b = 1/2). It now seems somewhat inappropriate to make assumptions about quantities derived from them (i.e. that post-projective z is in the range [-1, 1]).
Benefit: In keeping with the recent effort to ease transitions from DX
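The arithmetic in the description can be checked with a small C sketch. glDepthRangeEquation is only a proposal, so nothing below calls OpenGL; depth_range_eq and the function names are hypothetical, and n and f stand for the near and far values passed to glDepthRange.

```c
/* Coefficients of the proposed mapping Zw = a + b * Zd. */
typedef struct {
    double a;
    double b;
} depth_range_eq;

/* OpenGL convention: Zd lies in [-1, 1] and is mapped to [n, f]. */
depth_range_eq depth_eq_from_gl_range(double n, double f)
{
    depth_range_eq eq = { (n + f) / 2.0, (f - n) / 2.0 };
    return eq;
}

/* Direct3D convention: Zd already lies in [0, 1], so a = 0, b = 1. */
depth_range_eq depth_eq_d3d(void)
{
    depth_range_eq eq = { 0.0, 1.0 };
    return eq;
}

/* Applies the mapping to a post-projective depth value. */
double window_depth(depth_range_eq eq, double zd)
{
    return eq.a + eq.b * zd;
}
```

With the default glDepthRange(0, 1), depth_eq_from_gl_range gives a = b = 1/2, so Zd = -1 maps to 0 and Zd = 1 maps to 1.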
GLSL diagonal matrices
Description: Replace code like: vec4 v; mat4 m; ... m[0][0] = v.x; m[1][1] = v.y; m[2][2] = v.z; m[3][3] = v.w;
by mat4 m(v);
Benefit: Simplify the code
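The intent of the proposed constructor is a matrix whose diagonal holds the components of v and whose other entries are zero. A minimal C sketch of those semantics follows; the mat4 struct and mat4_from_diagonal are hypothetical stand-ins for the GLSL types, not an existing API.

```c
/* Stand-in for GLSL's mat4, indexed as m[column][row]. */
typedef struct {
    float m[4][4];
} mat4;

/* Builds a matrix with v on the diagonal and zeros elsewhere. */
mat4 mat4_from_diagonal(const float v[4])
{
    mat4 result = { { { 0.0f } } };  /* all 16 entries start at zero */
    for (int i = 0; i < 4; ++i)
        result.m[i][i] = v[i];
    return result;
}
```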
NV_texture_barrier in core
Description: Move NV_texture_barrier into core.
Benefit: Allows a limited form of programmable blending.
Real-time rendering mode
Description: Guarantees predictable and well-defined performance metrics for shaders, memory management, etc. For instance, in this mode, it would be prohibited for shaders to recompile during a draw call when some of their inputs take on particular values. Well-defined memory management would mean that the application is able to control when buffers (textures, VBOs) are actually sent to VRAM.
Benefit: This would avoid large and unpredictable performance drops which are at best very difficult to avoid. It would also provide better application control of the data transferred to the GPU to make predictable subbanded streaming of data more reliable.
These are things that used to be on this list that have been resolved by the ARB, either in ARB extensions or GL core:
- ARB_instanced_arrays brought into the 3.3 core.
- ARB_texture_swizzling brought into the 3.3 core.
- ARB_tessellation_shader created and brought into 4.0 core. Provides access to tessellation shaders.
- ARB_blend_func_extended created and brought into 3.3 core. Allows a special blend mode where the blend equation takes 2 colors from the fragment program instead of one.
- ARB_explicit_attrib_location created and brought into 3.3 core. Allows specifying attribute locations and fragment output locations in the shader.
- ARB_sampler_objects created and brought into 3.3 core. Allows separation between sampler state and texture object state.
- ARB_shader_bit_encoding created and brought into GLSL 3.3. Allows converting floating point numbers to integers representing their bit encoding.
- ARB_separate_shader_objects created and brought into 4.1 core. Allows separation of programs based on shader stage. Allows user-defined varyings.
- ARB_get_program_binary created and brought into 4.1 core. Allows extraction of binary compiled shaders, and the later reuse of those shaders without having to re-compile and link.
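For readers unfamiliar with what the bit-encoding conversion in ARB_shader_bit_encoding does, the GLSL functions floatBitsToInt and intBitsToFloat behave roughly like the following C sketch. It assumes float is a 32-bit IEEE 754 type; the C function names are illustrative only.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(int32_t),
               "this sketch assumes a 32-bit float");

/* Returns the bit pattern of value as a signed integer, without
 * changing any bits (analogous to GLSL's floatBitsToInt). */
int32_t float_bits_to_int(float value)
{
    int32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Reinterprets an integer bit pattern as a float
 * (analogous to GLSL's intBitsToFloat). */
float int_bits_to_float(int32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

memcpy is used instead of a pointer cast so that the reinterpretation stays within defined behaviour in C.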