Difference between revisions of "User Wish List"

From OpenGL Wiki
(77 intermediate revisions by 6 users not shown)
Latest revision as of 20:37, 4 January 2013

This page contains users' wish lists for features and functionality in future versions of OpenGL. It should not in any way be taken as an endorsement by the ARB, nor should it be assumed that any future version of OpenGL will have these or anything like them. The order of these features is arbitrary.

Ability to select texture origin

Description: The ability to select the origin of textures, either top-left or bottom-left. Essentially this is ARB_fragment_coord_conventions for textures.

Benefit: Improved interoperability with DirectX.
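
Without a selectable origin, applications that load top-left-origin image data usually reverse the row order on the CPU before uploading it. The following is a minimal sketch of that workaround, assuming tightly packed rows; the name flip_rows_in_place is hypothetical and not part of any OpenGL API.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reverse the row order of a tightly packed image so
 * that top-left-origin data becomes bottom-left-origin (or vice versa).
 * Returns 0 on success, -1 if the temporary row buffer cannot be allocated. */
int flip_rows_in_place(unsigned char *pixels, size_t width, size_t height,
                       size_t bytes_per_pixel)
{
    size_t row_bytes = width * bytes_per_pixel;
    if (row_bytes == 0 || height < 2)
        return 0; /* nothing to swap */

    unsigned char *tmp = malloc(row_bytes);
    if (tmp == NULL)
        return -1;

    /* Swap rows pairwise, walking inward from the top and the bottom. */
    for (size_t top = 0, bottom = height - 1; top < bottom; ++top, --bottom) {
        unsigned char *a = pixels + top * row_bytes;
        unsigned char *b = pixels + bottom * row_bytes;
        memcpy(tmp, a, row_bytes);
        memcpy(a, b, row_bytes);
        memcpy(b, tmp, row_bytes);
    }
    free(tmp);
    return 0;
}
```

A selectable texture origin would make this extra pass over the pixel data unnecessary.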

GPU-specific off-screen context creation

Description: The ability to create rendering contexts for different GPUs, particularly for off-screen rendering. AMD's GPU association extensions (WGL_AMD_gpu_association, GLX_AMD_gpu_association) might be a good model for this.

Benefit: Improved parallelism.

Offline GLSL shader compilation

Description: A tool or library for offline compilation of GLSL shaders. This would compile into a standardized bytecode-like format (like D3D does). A standardized cross-platform shader binary format would be needed, related to GL_ARB_get_program_binary.

Benefit: A quality offline compiler that performs a known set of optimizations would remove a big failure point in existing OpenGL implementations (you never know whether the GLSL compiler in the driver has parsing bugs, or whether it performs common cross-platform optimizations at all). On some platforms (mostly mobile), having a full, robust compiler in the driver could even be prohibitive (a megabyte or so of code).

OpenGL.org Topic: http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=282235

Multithreading

Description: This is a synthesis of several proposals, but the basic idea is to be able to do things that don't directly involve rendering, such as compiling/linking programs or uploading textures, in a separate thread.

Benefit: Better performance.

Conformance test

Description: A comprehensive OpenGL conformance test that can be run on various OpenGL implementations to test for driver bugs and the like.

Benefit: A benchmark for IHVs to work towards in making sure their drivers do not have certain bugs.

piglit is an open-source OpenGL test suite: http://cgit.freedesktop.org/piglit/

Debug Profile

Description: A profile of OpenGL that exposes debugging information, such as logs of calls and so forth.

Benefit: Makes OpenGL easier to debug.

Update: The GL_ARB_debug_output extension implements debug information.

Performance metrics

Description: Exposing various performance metrics through the API, possibly as a dedicated profiling profile. Specifically, providing at least the information that D3D10's profiling functionality provides.

Benefit: Profiling for improved performance.

Direct State Access

Description: Something not entirely unlike the EXT_direct_state_access extension. It would allow users to use objects without binding them to the context.

Benefit: The biggest benefit is getting past the 32-texture limitation imposed by the way texture objects are attached to programs. It would also make Vertex Array Objects a lot more intuitive.

Promote EXT_texture_filter_anisotropic and EXT_texture_compression_s3tc to core

Description: Bring both extensions into core OpenGL, as the title says.

Benefit: I guess it makes some people feel better not to have to write "EXT" at the end of these enums.

D3D11-style command buffers

Description: The ability to queue up a sequence of rendering commands and execute them in a single go. Sort of like display lists, but limited to actually rendering things rather than setting state.

Benefit: Presumably faster rendering of set pieces of geometry.

Bindless graphics stuff

Description: Implement some of the concepts in the various bindless graphics extensions. Or, if not that, then improve/fix whatever it is in VAOs and/or VBOs that makes their cache performance so much worse than bindless graphics.

Benefit: Performance, it would seem.

OpenGL.org topic: http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=256729

Blend shaders

Description: The ability to write shaders for blend operations.

Benefit: Non-linear color encoding with proper blending, shadow mapping improvements, etc.

OpenGL.org topic: http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=259496

Object purgeability

Description: The ability to designate that the storage for some objects does not need to be preserved (per APPLE_object_purgeable). This may combine with bindless/VAR fixes (i.e., locking buffers), the totality of which gives the application significant control over how memory gets used.

Benefit: Improved performance.

Opaque object types

Description: ARB_sync uses a pointer type, the first OpenGL object to do so. If the direct state access thing is going to go through, that will create a lot of new APIs in and of itself. You may as well add the ability to convert a GLuint into an object pointer while you're at it. The DSA functions would operate only on pointers, while the binding functions would need analogs that also take pointers.

Benefit: Better 64-bit support. Possibly faster binding and rendering.

Program state separation

Description: Some mechanism to fully separate a program object's data from the state for that object. Right now, UBOs allow you to separate a program from most of its state, but texture state remains fixed.

Benefit: Faster state changes when using the same program data, plus orthogonality and efficiency.

glGetString( GL_DRIVER_VERSION )

Description: Returns a string which, for the vendor returned by glGetString( GL_VENDOR ), uniquely identifies the exact driver version in use, e.g. "6.14.11.9038".

Benefit: When an application fails on an end user's machine, the graphics driver version can be written to the log file. The application can also detect a particular driver version that contains a known bug, and either work around it or ask the user to upgrade.
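
The known-bug check described above could look roughly like the sketch below. It assumes the returned string has the four-part dotted shape of the "6.14.11.9038" example, which no specification guarantees; DriverVersion, parse_driver_version and compare_driver_version are hypothetical names.

```c
#include <stdio.h>

/* Hypothetical four-part version, matching the "6.14.11.9038" example. */
typedef struct {
    unsigned part[4];
} DriverVersion;

/* Parses "a.b.c.d". Returns 1 on success, 0 if the string has another
 * shape (too few fields, or extra characters after the fourth field). */
int parse_driver_version(const char *text, DriverVersion *out)
{
    char trailing;
    int n = sscanf(text, "%u.%u.%u.%u%c", &out->part[0], &out->part[1],
                   &out->part[2], &out->part[3], &trailing);
    return n == 4;
}

/* Returns a negative value, zero, or a positive value, like strcmp,
 * comparing the parts from left to right. */
int compare_driver_version(const DriverVersion *a, const DriverVersion *b)
{
    for (int i = 0; i < 4; ++i) {
        if (a->part[i] != b->part[i])
            return a->part[i] < b->part[i] ? -1 : 1;
    }
    return 0;
}
```

An application would parse the string once at startup, log it, and compare it against the versions it knows to be problematic for that vendor.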

Vertical Sync event

Description: An event object that is set by VSYNC and cleared when read.

Benefit: SwapBuffers currently does three things: it flushes the command queue to the GPU, adds a command to the command queue that swaps the front and back buffers, and optionally waits for VSync. A VSync event allows OpenCL or FBO rendering tasks to be performed between the SwapBuffers call and the VSync, i.e. when the GPU would otherwise just be waiting.
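
The "set by VSYNC and cleared when read" semantics can be modeled with a single atomic flag. This is only a sketch of the intended behavior using C11 atomics; VSyncEvent and its functions are hypothetical, and the signal call stands in for whatever the display system would do at each vertical sync.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical event object: set on every VSync, cleared by reading it. */
typedef struct {
    atomic_bool signaled;
} VSyncEvent;

void vsync_event_init(VSyncEvent *ev)
{
    atomic_init(&ev->signaled, false);
}

/* Would be called by the display system (simulated here) at each VSync. */
void vsync_event_signal(VSyncEvent *ev)
{
    atomic_store(&ev->signaled, true);
}

/* Returns true if a VSync happened since the last read, and clears the
 * flag in the same atomic step so no signal is observed twice. */
bool vsync_event_read(VSyncEvent *ev)
{
    return atomic_exchange(&ev->signaled, false);
}
```

An application could poll vsync_event_read between background tasks and stop submitting extra work once it returns true.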

Dropped frame counter

Description: A query object that returns the number of VSyncs that occurred between the last two SwapBuffers calls.

Benefit: Allows detection of dropped frames so that the application can reduce the workload on the GPU.
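
As a rough illustration of how an application might use such a counter, the sketch below treats one VSync per swap as the normal case and anything above that as dropped frames. Both function names and the detail-level policy are hypothetical.

```c
/* Hypothetical: vsyncs_between_swaps is the value the proposed query
 * would return. One VSync per swap means no frame was dropped. */
unsigned dropped_frames(unsigned vsyncs_between_swaps)
{
    return vsyncs_between_swaps > 1 ? vsyncs_between_swaps - 1 : 0;
}

/* Illustrative policy: lower the detail level (0 is the lowest) by one
 * step whenever at least one frame was dropped. */
unsigned adjust_detail_level(unsigned level, unsigned vsyncs_between_swaps)
{
    if (dropped_frames(vsyncs_between_swaps) > 0 && level > 0)
        return level - 1;
    return level;
}
```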

Prioritised command queues

Description: The frame-buffer rendering context is given priority over FBO rendering and OpenCL tasks. FBO and OpenCL command queues are executed when the GPU has executed SwapBuffers in the primary context and is waiting for VSync. When VSync occurs and the front/back buffers are swapped, the currently executing command queue is suspended as soon as possible, and commands waiting in the primary context are executed.

Benefit: Prevents dropped frames from causing jittery animation. Using an old shadow map for one frame is much less noticeable than the entire screen freezing. This also ensures that the GPU does not stall due to lack of work, which would be likely if using a VSync event object to try to control the GPU workload with the CPU.

Immutable objects

Description: Some kind of immutable object to replace display lists where they are used as a macro for a set of states: a "blend object", "test object", "rasterizer object", that kind of thing, which should be really efficient to implement.

It could be either a single generic immutable state object or several dedicated objects. It replaces display lists in the cases where they are used as immutable macro objects. I'm pretty sure that, driver-wise, it's a good idea too.

Benefit: Simplifies state management in OpenGL software. Replaces deprecated display lists when they are used for this purpose. Might simplify drivers?

Depth range equation

Description: glDepthRangeEquation(GLclampf a, GLclampf b), where the final window depth is Zw = a + b * Zd and a + b = 1. For DX, a = 0 and b = 1. For GL it's business as usual, with a = (n+f)/2 and b = (f-n)/2 (usually a = 1/2 and b = 1/2). It now seems somewhat inappropriate to make assumptions about quantities derived from these values (i.e., that post-projective z is in the range [-1, 1]).

Benefit: In keeping with the recent effort to ease transitions from DX.

OpenGL.org topic: http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=263688
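
The proposed mapping is simple enough to check numerically. The sketch below evaluates Zw = a + b * Zd for the GL and DX coefficient choices given above; the function names are hypothetical and glDepthRangeEquation itself does not exist.

```c
/* Proposed mapping from normalized device depth zd to window depth zw. */
double depth_range_equation(double a, double b, double zd)
{
    return a + b * zd;
}

/* Coefficients equivalent to glDepthRange(n, f), as given in the
 * description: a = (n+f)/2, b = (f-n)/2. */
void gl_depth_coefficients(double n, double f, double *a, double *b)
{
    *a = (n + f) / 2.0;
    *b = (f - n) / 2.0;
}
```

With the default GL range (n = 0, f = 1) the coefficients are a = b = 1/2, so Zd in [-1, 1] maps to [0, 1]; with the DX choice (a = 0, b = 1), Zd in [0, 1] passes through unchanged.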

GLSL diagonal matrices

Description: Replace code like:

    vec4 v;
    mat4 m;
    ...
    m[0][0] = v.x;
    m[1][1] = v.y;
    m[2][2] = v.z;
    m[3][3] = v.w;

by

    mat4 m(v);

Benefit: Simplifies the code.
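
For reference, here is a CPU-side C sketch of what such a constructor would presumably compute. It assumes the off-diagonal elements are zero, which the request does not state explicitly; mat4_from_diagonal is a hypothetical name, and the array is indexed m[column][row] to mirror GLSL.

```c
/* Builds a 4x4 matrix, indexed m[column][row] as in GLSL, whose diagonal
 * is v and whose other elements are zero (an assumption of this sketch). */
void mat4_from_diagonal(const float v[4], float m[4][4])
{
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            m[col][row] = (col == row) ? v[col] : 0.0f;
}
```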

NV_texture_barrier in core

Description: Move NV_texture_barrier into core.

Benefit: Allows a limited form of programmable blending.

Real-time rendering mode

Description: Guarantees predictable and well-defined performance metrics for shaders, memory management, etc. For instance, in this mode, it would be prohibited for shaders to recompile during a draw call when some of their inputs take on particular characteristics. Well-defined memory management would mean that the application is able to control when buffers (textures, VBOs) are actually sent to VRAM.

Benefit: This would prevent large and unpredictable performance drops, which are at best very difficult to avoid. It would also provide better application control of the data transferred to the GPU to make predictable subbanded streaming of data more reliable.

Fulfilled Wishes

These are things that used to be on this list that have been resolved by the ARB, either in ARB extensions or GL core:

  • ARB_instanced_arrays brought into the 3.3 core.
  • ARB_texture_swizzle brought into the 3.3 core.
  • ARB_tessellation_shader created and brought into 4.0 core. Provides access to tessellation shaders.
  • ARB_blend_func_extended created and brought into 3.3 core. Allows a special blend mode where the blend equation takes 2 colors from the fragment program instead of one.
  • ARB_explicit_attrib_location created and brought into 3.3 core. Allows specifying attribute locations and fragment output locations in the shader.
  • ARB_sampler_objects created and brought into 3.3 core. Allows separation between sampler state and texture object state.
  • ARB_shader_bit_encoding created and brought into GLSL 3.3. Allows converting floating-point numbers to integers representing their bit encoding, and back.
  • ARB_sample_shading brought into GLSL 4.00. Allows a write mask for sample writes. Also ARB_image_load_store allows arbitrary per-sample stores.
  • ARB_separate_shader_objects created and brought into 4.1 core. Allows separation of programs based on shader stage. Allows user-defined varyings.
  • ARB_get_program_binary created and brought into 4.1 core. Allows extraction of binary compiled shaders, and the later reuse of those shaders without having to re-compile and link.
  • ARB_copy_image created and brought into 4.3 core. Allows copying images on the GPU, with some format manipulation.
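
To illustrate the kind of raw bit conversion that ARB_shader_bit_encoding provides (for example floatBitsToUint and uintBitsToFloat in GLSL), here is a C analogue. It is a sketch that assumes float is a 32-bit IEEE 754 type; the C function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* C analogue of GLSL floatBitsToUint: copies the bit pattern without any
 * numeric conversion. Assumes a 32-bit IEEE 754 float. */
uint32_t float_bits_to_uint(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* C analogue of GLSL uintBitsToFloat: the inverse reinterpretation. */
float uint_bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Using memcpy rather than a pointer cast keeps the reinterpretation well-defined in C.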