User Wish List: Difference between revisions
Revision as of 08:39, 18 September 2009
Compiled Shader Caching
Description: The ability to store compiled shaders in some format, so that subsequent executions of the programs will not require a full compile/link step. Or at least, will not require it unless drivers have changed.
Benefit: Improve program initialization/level loading (if shaders are part of level data).
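As an illustration of the "unless drivers have changed" condition, an application-side cache could key each stored binary on both the shader source and a driver-identifying string, so that a driver update simply misses the cache. The following is a minimal sketch in C under that assumption; `shader_cache_key` and the separator string are hypothetical names, and the hash is plain 64-bit FNV-1a rather than anything an implementation is known to use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a 64-bit hash over a NUL-terminated string, continuing from h. */
static uint64_t fnv1a_64(uint64_t h, const char *s)
{
    for (; *s != '\0'; ++s) {
        h ^= (uint64_t)(unsigned char)*s;
        h *= 0x100000001b3ULL; /* FNV prime */
    }
    return h;
}

/* Hypothetical helper: hashes the source, a separator, then the driver
   string, so a change to either one produces a different key. */
uint64_t shader_cache_key(const char *source, const char *driver_id)
{
    assert(source != NULL && driver_id != NULL);
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV offset basis */
    h = fnv1a_64(h, source);
    h = fnv1a_64(h, "\n--driver--\n");
    h = fnv1a_64(h, driver_id);
    return h;
}
```

The driver string would be whatever the application can obtain that changes with the driver; the key only decides whether a cached entry may be reused, not how the binary itself is stored.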
GPU-specific off-screen context creation
Description: The ability to create rendering contexts for different GPUs, particularly for off-screen rendering. GL_AMD_GPU_association might be a good example for this.
Benefit: Improved parallelism.
Decouple texture filtering state from texture objects.
Description: This involves being able to separate texture objects (the images themselves) from the state that deals with how shaders sample this. This could be done by creating a new object type, allowing sampler objects in GLSL to specify filtering state, or some other mechanism.
Benefit: Allows using texture objects in different ways in different places.
GLSL shader precompilation.
Description: There was no clear proposal for what this means, but it was mentioned several times in the thread. This would appear to mean having an off-line tool that compiles GLSL into something that you can feed into any implementation to create programs. This something would likely be a form of the ARB assembly that supports modern functionality.
Benefit: Presumably, the compile/link time of this precompiled format would be lower than that of GLSL itself.
Multithreading.
Description: This is a synthesis of several proposals, but the basic idea is to be able to do things that don't directly involve rendering in a separate thread. Compiling/linking programs, uploading textures, etc.
Benefit: Better performance.
Conformance test.
Description: A comprehensive OpenGL conformance test. One that can be run on various OpenGL implementations to test for driver bugs and the like.
Benefit: A benchmark for IHVs to work towards in making sure their drivers do not have certain bugs.
Debug Profile.
Description: A profile of OpenGL that exposes debugging information, such as logs of calls and so forth.
Benefit: Makes OpenGL easier to debug.
Use of "semantics" in GLSL.
Description: The ability to specify in GLSL what vertex attribute index or fragment output index a particular vertex shader input or fragment shader output uses.
Benefit: API cleanup. No need for the user to externally provide mappings for these anymore.
Performance metrics
Description: Exposing various performance metrics through the API, possibly as a profiling... em, profile. Specifically, providing at least the information that D3D10's profiling functionality provides.
Benefit: Profiling for improved performance.
Raw bit conversion in GLSL
Description: The ability to convert the bit pattern of an integer into a float and vice versa. Basically "reinterpret_cast" for GLSL.
Benefit: Data compression, mainly.
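For readers unfamiliar with the idea, this is what the same reinterpretation looks like on the CPU side. It is a sketch in C, not proposed GLSL syntax: the function names are hypothetical, and it assumes `float` is a 32-bit IEEE 754 value, which is true on common platforms but not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the raw bits of f without any numeric conversion. */
uint32_t float_bits_to_uint(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f); /* assumption: 32-bit float */
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Builds a float whose bit pattern is exactly u. */
float uint_bits_to_float(uint32_t u)
{
    float f;
    assert(sizeof f == sizeof u); /* assumption: 32-bit float */
    memcpy(&f, &u, sizeof f);
    return f;
}
```

Using `memcpy` rather than a pointer cast keeps the conversion well defined in C; a shader-side equivalent would let packed integer data be unpacked into floats and back without loss.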
Separate shader objects.
Description: The ability to attach multiple programs together, à la EXT_separate_shader_objects. Unlike that extension, though, it should actually work with user-defined in/out variables, instead of being restricted to just pre-defined ones.
Benefit: Everything that EXT_separate_shader_objects says about why the extension exists.
Direct State Access
Description: Something not entirely unlike the EXT_direct_state_access extension. It would allow users to use objects without binding them to the context.
Benefit: The biggest benefit is getting past the 32 texture limitation imposed by the way texture objects are attached to programs. It would also make Vertex Array Objects a lot more intuitive and less confusing to understand.
Promote EXT_texture_filter_anisotropic and EXT_texture_compression_s3tc to core.
Description: See above.
Benefit: I guess it makes some people feel better about using these enums to not write "EXT" at the end.
Write to the blend color inside the shader.
Description: There is a constant blend color that can be used as part of the blend equations. This feature would allow a fragment shader to modify this color on a per-fragment basis.
Benefit: Not stated, though it basically allows you to have 2 input colors in your blend equations. This makes it possible to modify the blend equation without using the fragment's alpha value as the modification factor. The alpha channel is then free for other uses, rather than being wasted just to vary the blend equation on a per-fragment basis from within the shader.
D3D11-style command buffers.
Description: The ability to queue up a sequence of rendering commands and execute them in a single go. Sort of like display lists, but limited to actually rendering things rather than setting state.
Benefit: Presumably faster rendering of set pieces of geometry.
Bindless graphics stuff.
Description: Implement some of the concepts in the various bindless graphics extensions. Or if not that, then improve/fix whatever it is in VAOs and/or VBOs that makes their cache performance so much worse than bindless graphics.
Benefit: Performance, it would seem.
Blend shaders
Description: The ability to write shaders for blend operations.
Benefit: Adding lots of new behaviors.
Includes in shaders
Description: The ability to have #include-type behavior in shader programs. Presumably the application would provide a callback or some such during compilation that would return a string for the "file" requested in the #include.
Benefit: More shader file diversity, without having to "pre-parse" the shader.
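The callback idea in the description can be sketched on the application side. The C code below is only an illustration: `include_fn`, `expand_includes` and `demo_resolve` are hypothetical names, it handles a single level of `#include "name"` lines that start in column one, and it says nothing about how a driver would actually expose such a hook.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns the text for name, or NULL if unknown. */
typedef const char *(*include_fn)(const char *name, void *user);

/* Appends n bytes of s to out; returns 0 if the buffer would overflow. */
static int append(char *out, size_t cap, size_t *len, const char *s, size_t n)
{
    if (*len + n + 1 > cap) return 0;
    memcpy(out + *len, s, n);
    *len += n;
    out[*len] = '\0';
    return 1;
}

/* Copies src to out, replacing each `#include "name"` line with the text
   returned by resolve. Returns 1 on success, 0 on overflow or when an
   include cannot be resolved. */
int expand_includes(const char *src, include_fn resolve, void *user,
                    char *out, size_t cap)
{
    static const char kw[] = "#include \"";
    const size_t kw_len = sizeof kw - 1;
    size_t len = 0;

    assert(src != NULL && resolve != NULL && out != NULL);
    if (cap == 0) return 0;
    out[0] = '\0';

    while (*src != '\0') {
        const char *eol = strchr(src, '\n');
        size_t line_len = eol ? (size_t)(eol - src) + 1 : strlen(src);
        const char *close = NULL;

        /* Look for the closing quote only within the current line. */
        if (line_len > kw_len && strncmp(src, kw, kw_len) == 0)
            close = memchr(src + kw_len, '"', line_len - kw_len);

        if (close != NULL) {
            char name[64];
            size_t n = (size_t)(close - (src + kw_len));
            const char *text;
            if (n >= sizeof name) return 0;
            memcpy(name, src + kw_len, n);
            name[n] = '\0';
            text = resolve(name, user);
            if (text == NULL) return 0;
            if (!append(out, cap, &len, text, strlen(text))) return 0;
            if (!append(out, cap, &len, "\n", 1)) return 0;
        } else if (!append(out, cap, &len, src, line_len)) {
            return 0;
        }
        src += line_len;
    }
    return 1;
}

/* Example resolver that knows a single made-up "file". */
const char *demo_resolve(const char *name, void *user)
{
    (void)user;
    return strcmp(name, "common.glsl") == 0 ? "float pi = 3.14;" : NULL;
}
```

This is exactly the "pre-parse" step the benefit line wants to avoid; with compiler support, only the resolver would remain in application code.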
Object purgeability.
Description: The ability to designate that the storage for some objects does not need to be preserved (per APPLE_object_purgeable). This may combine with bindless/VAR fixes (i.e. locking buffers); the totality of which gives the application significant control over how memory gets used.
Benefit: Improved performance.
Opaque object types
Description: ARB_sync uses a pointer type, the first OpenGL object to do so. If the direct state access thing is going to go through, that will create a lot of new APIs in and of itself. You may as well add the ability to convert a GLuint into an object pointer while you're at it. The DSA functions would operate only on pointers, while the binding functions would need analogs that also take pointers.
Benefit: Better 64-bit support. Possibly faster binding and rendering.
Program state separation
Description: Some mechanism to fully separate a program object's data from the state for that object. Right now, UBOs allow you to separate a program from most of its state, but texture state remains fixed.
Benefit: Faster state changes when using the same program data.
on device image copies
http://www.opengl.org/registry/specs/NV/copy_image.txt
Benefit: Orthogonality and efficiency
write to specific samples within a shader
Description: ARB_texture_multisample (core in 3.2) allows fetching a specific sample from a multisampled buffer. It would be nice if there was a way to write to specific samples from within the shader.
Benefit: Potentially fewer passes when anti-aliasing in a deferred renderer.
glGetString( GL_DRIVER_VERSION )
Description: Returns a string which, for the vendor returned by glGetString( GL_VENDOR ), uniquely identifies the exact driver version in use, e.g. "6.14.11.9038".
Benefit: When an application fails on an end user's machine, the graphics driver version can be written to the log file. The application can also detect a particular driver version that contains a known bug, and either work around it or ask the user to upgrade.
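To show how an application might act on such a string, here is a small C sketch that parses a four-part version and compares it with a known-bad one. `GL_DRIVER_VERSION` is only the proposed enum, so the code works on a plain string; the `driver_version` type and both function names are hypothetical, and the four-part dotted layout is an assumption taken from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical four-part driver version, as in "6.14.11.9038". */
typedef struct { unsigned part[4]; } driver_version;

/* Parses "a.b.c.d"; returns 1 on success, 0 on too few parts or
   trailing characters. */
int parse_driver_version(const char *s, driver_version *v)
{
    char extra;
    assert(s != NULL && v != NULL);
    return sscanf(s, "%u.%u.%u.%u%c", &v->part[0], &v->part[1],
                  &v->part[2], &v->part[3], &extra) == 4;
}

/* Compares parts left to right; returns -1, 0 or 1. */
int compare_driver_version(const driver_version *a, const driver_version *b)
{
    int i;
    for (i = 0; i < 4; ++i) {
        if (a->part[i] != b->part[i])
            return a->part[i] < b->part[i] ? -1 : 1;
    }
    return 0;
}
```

An application could log the raw string unconditionally and fall back to a workaround only when the parsed version compares equal to, or lower than, a release it knows to be faulty.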
Vertical Sync event
Description: An event object that is set by VSync and cleared when read.
Benefit: SwapBuffers currently does three things: it flushes the command queue to the GPU, adds a command to the command queue that swaps the front and back buffers, and optionally waits for VSync. A VSync event allows OpenCL or FBO rendering tasks to be performed between the SwapBuffers call and the VSync, i.e. when the GPU would otherwise just be waiting.
Dropped frame counter.
Description: Query object that returns the number of VSyncs that occurred between the last two SwapBuffers calls.
Benefit: Allows detection of dropped frames so that the application can reduce the workload on the GPU.
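The arithmetic behind this is simple, and a sketch in C may make the intended use clearer. It assumes that exactly one VSync between two swaps means no frame was dropped; the function names, the detail-level scale and the adjustment policy are all hypothetical and deliberately naive.

```c
#include <assert.h>

/* Frames missed between two swaps, given the VSync count the proposed
   query would report. One VSync per swap means nothing was dropped. */
unsigned dropped_frames(unsigned vsyncs_between_swaps)
{
    return vsyncs_between_swaps > 1 ? vsyncs_between_swaps - 1 : 0;
}

/* Hypothetical policy on a 0..max_level detail scale: step down by one
   level when frames were dropped, otherwise step back up by one. */
unsigned next_detail_level(unsigned level, unsigned max_level,
                           unsigned vsyncs_between_swaps)
{
    assert(level <= max_level);
    if (dropped_frames(vsyncs_between_swaps) > 0)
        return level > 0 ? level - 1 : 0;
    return level < max_level ? level + 1 : max_level;
}
```

A real application would probably smooth the count over several frames before changing quality, to avoid oscillating between two levels.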
Prioritised command queues.
Description: The frame-buffer rendering context is given priority over FBO rendering and OpenCL tasks. FBO and OpenCL command queues are executed when the GPU has executed SwapBuffers in the primary context and is waiting for VSync. When VSync occurs and the front/back buffers are swapped, the currently executing command queue is suspended as soon as possible, and commands waiting in the primary context are executed.
Benefit: Prevents dropped frames from causing jittery animation. Using an old shadow-map for one frame is much less noticeable than the entire screen freezing. This also ensures that the GPU does not stall due to lack of work, which would be likely if using a VSync event object to try to control the GPU workload with the CPU.
GL_ARB_Tesselation extension
Description: Support for tessellation hardware on next-generation GPUs.
Benefit: Reduces required bandwidth between the CPU and GPU. Allows smoothly curved silhouettes to be generated and more complex surface detail than can be done with bump mapping.
Program state objects.
Description: Each material has a state object that defines which shaders, textures, samplers and uniforms are used to render it. Changing materials is done by selecting a new state object instead of having to issue multiple commands that have high driver overhead for verification.
Benefit: Display lists are no longer required for fast material changes.
EXT_texture_swizzle as core.
Description: Move texture_swizzle into core.