Difference between revisions of "User Wish List"
Revision as of 19:48, 11 March 2010
This page contains users' wish lists for features and functionality in future versions of OpenGL. It should not in any way be taken as an endorsement by the ARB, nor should it be assumed that any future version of OpenGL will have these or anything like them. The order of these features is arbitrary.
Compiled Shader Caching
Description: The ability to store compiled shaders in some format, so that subsequent executions of the programs will not require a full compile/link step. Or at least, will not require it unless drivers have changed.
Benefit: Improve program initialization/level loading (if shaders are part of level data).
GPU-specific off-screen context creation
Description: The ability to create rendering contexts for different GPUs, particularly for off-screen rendering. GL_AMD_GPU_association might be a good example for this.
Benefit: Improved parallelism.
Decouple texture filtering state from texture objects
Description: This involves being able to separate texture objects (the images themselves) from the state that controls how shaders sample them. This could be done by creating a new object type, allowing sampler objects in GLSL to specify filtering state, or some other mechanism.
Benefit: Allows using texture objects in different ways in different places.
Update: This has been partially addressed in the GL_ARB_sampler_objects extension, but fails to include any interaction with GLSL.
GLSL shader precompilation.
Description: There was no clear proposal for what this means, but it was mentioned several times in the thread. This would appear to mean having an off-line tool that compiles GLSL into something that you can feed into any implementation to create programs. This something would likely be a form of the ARB assembly that supports modern functionality.
Benefit: Presumably, the compile/link time of this precompiled format would be lower than that of GLSL itself.
Description: This is a synthesis of several proposals, but the basic idea is to be able to do things that don't directly involve rendering in a separate thread. Compiling/linking programs, uploading textures, etc.
Benefit: Better performance.
Description: A comprehensive OpenGL conformance test. One that can be run on various OpenGL implementations to test for driver bugs and the like.
Benefit: A benchmark for IHVs to work towards in making sure their drivers do not have certain bugs.
Description: A profile of OpenGL that exposes debugging information, such as logs of calls and so forth.
Benefit: Makes OpenGL easier to debug.
Use of "semantics" in GLSL.
Description: The ability to specify in GLSL what vertex attribute index or fragment output index a particular vertex shader input or fragment shader output uses.
Benefit: API cleanup. No need for the user to externally provide mappings for these anymore.
Update: GL_ARB_explicit_attrib_location addresses this issue.
Description: Exposing various performance metrics through the API, possibly as a profiling... em, profile. Specifically, providing at least the information that D3D10's profiling functionality provides.
Benefit: Profiling for improved performance.
Raw bit conversion in GLSL
Description: The ability to convert the bitpattern of an integer into a float and vice-versa. Basically "reinterpret_cast" for GLSL.
Benefit: Data compression, mainly.
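As a rough illustration of the requested behavior, here is a minimal C sketch of the same reinterpretation on the CPU side. It assumes a 32-bit IEEE 754 float, and the function names are hypothetical, chosen only for this example.

```c
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of a 32-bit float as an unsigned integer.
   memcpy is used instead of a pointer cast to avoid undefined behaviour. */
uint32_t float_bits_to_uint(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Build a float from a raw 32-bit pattern (the inverse of the above). */
float uint_bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

With this kind of conversion, two 16-bit values could for example be packed into one integer, carried through a float channel, and unpacked again without any numeric conversion touching the bits.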
Separate shader objects.
Description: The ability to attach multiple programs together, à la EXT_separate_shader_objects. Unlike that extension, however, it should actually work with user-defined in/out variables, instead of being restricted to just pre-defined ones.
Benefit: Everything that EXT_separate_shader_objects says about why the extension exists.
Direct State Access
Description: Something not entirely unlike the EXT_direct_state_access extension. It would allow users to use objects without binding them to the context.
Benefit: The biggest benefit is getting past the 32 texture limitation imposed by the way texture objects are attached to programs. It would also make Vertex Array Objects a lot more intuitive and less confusing to understand.
Promote EXT_texture_filter_anisotropic and EXT_texture_compression_s3tc to core.
Description: See above.
Benefit: I guess it makes some people feel better about using these enums to not write "EXT" at the end.
Write to the blend color inside the shader.
Description: There is a constant blend color that can be used as part of the blend equations. This feature would allow a fragment shader to modify this color on a per-fragment basis.
Benefit: Not stated, though it basically gives you two input colors in your blend equations. The blend equation can then be modified on a per-fragment basis from within the shader without using the fragment's alpha value as the modification factor, which frees the alpha channel for other uses.
Update: GL_ARB_blend_func_extended addresses this.
D3D11-style command buffers.
Description: The ability to queue up a sequence of rendering commands and execute them in a single go. Sort of like display lists, but limited to actually rendering things rather than setting state.
Benefit: Presumably faster rendering of set pieces of geometry.
Bindless graphics stuff.
Description: Implement some of the concepts in the various bindless graphics extensions. Or if not that, then improve/fix whatever it is in VAOs and/or VBOs that makes their cache performance so much worse than bindless graphics.
Benefit: Performance, it would seem.
Description: The ability to write shaders for blend operations.
Benefit: Non-linear color encoding with proper blending, shadow mapping improvements, etc.
Includes in shaders
Description: The ability to have #include-type behavior in shader programs. Presumably the application would provide a callback or some such during compilation that would return a string for the "file" requested in the #include.
Benefit: More shader file diversity, without having to "pre-parse" the shader.
Update: GL_ARB_shading_language_include addresses this.
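The callback idea in the description can be sketched on the application side in C. This is only an illustration under stated assumptions: `include_resolver`, `expand_includes` and `demo_resolver` are hypothetical names, only the quoted form `#include "name"` is handled, and nested includes are not expanded.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback: returns the text for an include name, or NULL. */
typedef const char *(*include_resolver)(const char *name, void *user);

/* Append n bytes of s to a growable, NUL-terminated buffer. 0 on failure. */
static int append(char **buf, size_t *len, size_t *cap, const char *s, size_t n)
{
    if (*len + n + 1 > *cap) {
        size_t new_cap = (*len + n + 1) * 2;
        char *p = realloc(*buf, new_cap);
        if (!p) return 0;
        *buf = p;
        *cap = new_cap;
    }
    memcpy(*buf + *len, s, n);
    *len += n;
    (*buf)[*len] = '\0';
    return 1;
}

/* Replace each  #include "name"  line in src with the resolver's text.
   Returns a malloc'd string (caller frees), or NULL if allocation fails
   or the resolver does not know a requested name. */
char *expand_includes(const char *src, include_resolver resolve, void *user)
{
    char *out = NULL;
    size_t len = 0, cap = 0;
    char name[256];

    if (!append(&out, &len, &cap, "", 0)) return NULL;

    while (*src) {
        const char *eol = strchr(src, '\n');
        size_t line_len = eol ? (size_t)(eol - src) + 1 : strlen(src);
        const char *end = src + line_len;
        const char *p = src, *open = NULL, *close = NULL;
        int ok;

        while (*p == ' ' || *p == '\t') p++;
        if (strncmp(p, "#include", 8) == 0) {
            open = memchr(p, '"', (size_t)(end - p));
            if (open) close = memchr(open + 1, '"', (size_t)(end - (open + 1)));
        }

        if (close && (size_t)(close - open - 1) < sizeof name) {
            size_t n = (size_t)(close - open - 1);
            const char *text;
            memcpy(name, open + 1, n);
            name[n] = '\0';
            text = resolve(name, user);
            /* Insert the included text, keeping the line's own newline. */
            ok = text && append(&out, &len, &cap, text, strlen(text))
                      && append(&out, &len, &cap, "\n", eol ? 1 : 0);
        } else {
            ok = append(&out, &len, &cap, src, line_len);  /* ordinary line */
        }
        if (!ok) { free(out); return NULL; }
        src += line_len;
    }
    return out;
}

/* Example resolver that knows a single made-up "file". */
const char *demo_resolver(const char *name, void *user)
{
    (void)user;
    if (strcmp(name, "common.glsl") == 0) return "float helper(float x);";
    return NULL;
}
```

This is exactly the kind of "pre-parse" step the wish wants to make unnecessary: the expanded string is what the application would otherwise have to hand to the shader compiler itself.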
Object purgeability.
Description: The ability to designate that the storage for some objects does not need to be preserved (per APPLE_object_purgeable). This may combine with bindless/VAR fixes (i.e. locking buffers); the totality of which gives the application significant control over how memory gets used.
Benefit: Improved performance.
Opaque object types
Description: ARB_sync uses a pointer type, the first OpenGL object to do so. If the direct state access thing is going to go through, that will create a lot of new APIs in and of itself. You may as well add the ability to convert a GLuint into an object pointer while you're at it. The DSA functions would operate only on pointers, while the binding functions would need analogs that also take pointers.
Benefit: Better 64-bit support. Possibly faster binding and rendering.
Program state separation
Description: Some mechanism to fully separate a program object data from the state for that object. Right now, UBOs allow you to separate a program from most of its state, but texture state remains fixed.
Benefit: Faster state changes when using the same program data.
on device image copies
Benefit: Orthogonality and efficiency
write to specific samples within a shader
Description: ARB_texture_multisample (core in 3.2) allows fetching a specific sample from a multisampled buffer. It would be nice if there was a way to write to specific samples from within the shader.
Benefit: Potentially fewer passes when anti-aliasing in a deferred renderer.
glGetString( GL_DRIVER_VERSION )
Description: Returns a string which, for the vendor returned by glGetString( GL_VENDOR ), uniquely identifies the exact driver version in use. eg. "184.108.40.20638"
Benefit: When an application fails on an end user's machine, the graphics driver version can be written to the log file. The application can also detect a particular driver version that contains a known bug, and either work around it or ask the user to upgrade.
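The known-bug check described in the benefit could look roughly like the following C sketch. The strings are passed in as parameters because GL_DRIVER_VERSION is the wished-for query, not an existing one; `driver_id` and `is_known_bad_driver` are hypothetical names, and an exact string match is an assumption about how versions would be compared.

```c
#include <stddef.h>
#include <string.h>

/* One (vendor, driver version) pair known to the application. */
typedef struct {
    const char *vendor;
    const char *version;
} driver_id;

/* Returns 1 if the vendor/version pair exactly matches an entry in the
   application's table of drivers with known bugs, otherwise 0. */
int is_known_bad_driver(const char *vendor, const char *version,
                        const driver_id *bad, size_t bad_count)
{
    size_t i;
    for (i = 0; i < bad_count; i++) {
        if (strcmp(bad[i].vendor, vendor) == 0 &&
            strcmp(bad[i].version, version) == 0)
            return 1;
    }
    return 0;
}
```

The vendor has to be part of the key because the description only promises that the version string is unique for a given vendor.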
Vertical Sync event
Description: An event object that is set by VSYNC and cleared when read.
Benefit: SwapBuffers currently does three things: it flushes the command queue to the GPU, adds a command to the command queue that swaps the front and back buffers, and optionally waits for VSync. A VSync event allows OpenCL or FBO rendering tasks to be performed between the SwapBuffers call and the VSync, i.e. when the GPU would otherwise just be waiting.
Dropped frame counter.
Description: A query object that returns the number of VSyncs that occurred between the last two SwapBuffers calls.
Benefit: Allows detection of dropped frames so that the application can reduce the workload on the GPU.
Prioritised command queues.
Description: The frame-buffer rendering context is given priority over FBO rendering and OpenCL tasks. FBO and OpenCL command queues are executed when the GPU has executed SwapBuffers in the primary context and is waiting for VSync. When VSync occurs and the front/back buffers are swapped, the currently executing command queue is suspended as soon as possible, and commands waiting in the primary context are executed.
Benefit: Prevents dropped frames from causing jittery animation. Using an old shadow-map for one frame is much less noticeable than the entire screen freezing. This also ensures that the GPU does not stall due to lack of work, which would be likely if using a VSync event object to try to control the GPU workload with the CPU.
Description: Support for tessellation hardware on next generation GPUs.
Benefit: Reduces required bandwidth between the CPU and GPU. Allows smoothly curved silhouettes to be generated and more complex surface detail than can be done with bump mapping.
EXT_texture_swizzle as core.
Description: Move texture_swizzle into core
Benefit: Allows the effective recreation of GL_LUMINANCE and GL_INTENSITY.
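To make the benefit concrete, here is a small C model of what a swizzle does to one sampled texel. It assumes the luminance or intensity value is stored in the red channel of a single-channel texture; the type and constant names are hypothetical and exist only for this sketch.

```c
/* Where each output channel takes its value from. */
typedef enum { SWZ_RED, SWZ_GREEN, SWZ_BLUE, SWZ_ALPHA, SWZ_ZERO, SWZ_ONE } swizzle_src;

typedef struct { float r, g, b, a; } rgba;

static float select_channel(rgba t, swizzle_src s)
{
    switch (s) {
    case SWZ_RED:   return t.r;
    case SWZ_GREEN: return t.g;
    case SWZ_BLUE:  return t.b;
    case SWZ_ALPHA: return t.a;
    case SWZ_ONE:   return 1.0f;
    default:        return 0.0f;  /* SWZ_ZERO */
    }
}

/* Apply a four-entry swizzle (for R, G, B, A in that order) to a texel. */
rgba apply_swizzle(rgba t, const swizzle_src sw[4])
{
    rgba out;
    out.r = select_channel(t, sw[0]);
    out.g = select_channel(t, sw[1]);
    out.b = select_channel(t, sw[2]);
    out.a = select_channel(t, sw[3]);
    return out;
}

/* Luminance behaves like (L, L, L, 1); intensity like (I, I, I, I). */
static const swizzle_src LUMINANCE_SWIZZLE[4] = { SWZ_RED, SWZ_RED, SWZ_RED, SWZ_ONE };
static const swizzle_src INTENSITY_SWIZZLE[4] = { SWZ_RED, SWZ_RED, SWZ_RED, SWZ_RED };
```

A red-only texel of 0.25 therefore reads back as (0.25, 0.25, 0.25, 1) under the luminance swizzle and as (0.25, 0.25, 0.25, 0.25) under the intensity swizzle.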
Per-instance vertex attributes
Description: D3D10 is capable of specifying a vertex attribute that changes per-instance, rather than per-vertex.
Benefit: Closer feature parity with D3D10
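One way to picture the difference between per-vertex and per-instance attributes is the index used to fetch each attribute element. The C sketch below assumes a divisor-style rule (0 means per-vertex, N means advance once every N instances); the function name is hypothetical and the rule is an assumption about how such a feature would be specified.

```c
#include <stddef.h>

/* Which element of an attribute array is read for a given vertex of a
   given instance. divisor == 0: the attribute changes per vertex.
   divisor  > 0: the attribute changes once every `divisor` instances. */
size_t attribute_element(size_t vertex, size_t instance, unsigned divisor)
{
    if (divisor == 0)
        return vertex;
    return instance / divisor;
}
```

With a divisor of 1, every vertex of instance 3 reads element 3, which is how a per-instance transform or color would typically be supplied.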
Description: Some kind of immutable object to replace display lists for things like a macro object holding a set of states. A kind of "blend object", "test object", "rasterizer object", that kind of thing, which should be really efficient to use.
It could be either a single generic immutable state object or dedicated objects. It replaces display lists in the cases where they are used as immutable macro objects. It is probably a good idea driver-wise, too.
Benefit: Simplifies state management in OpenGL software. Replaces deprecated display lists when they are used for this purpose. May also simplify drivers.
Depth range equation
Description: glDepthRangeEquation(GLclampf a, GLclampf b), where the final Zw = a + b Zd and a + b = 1. For DX, a = 0 and b = 1. For GL it's business as usual with a=(n+f)/2 and b=(f-n)/2 (usually a=1/2 and b=1/2). It now seems somewhat inappropriate to make assumptions about quantities derived from them (i.e. that post projective z is in the range [-1, 1])
Benefit: In keeping with the recent effort to ease transitions from DX
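The mapping in the description can be checked with a few lines of C. This is a sketch only: glDepthRangeEquation is the proposed call, not an existing one, and the struct and helper names below are hypothetical.

```c
/* Coefficients of the proposed mapping Zw = a + b * Zd. */
typedef struct { double a, b; } depth_range_eq;

/* Coefficients equivalent to a conventional depth range [n, f]:
   a = (n + f) / 2 and b = (f - n) / 2. */
depth_range_eq from_depth_range(double n, double f)
{
    depth_range_eq e;
    e.a = (n + f) / 2.0;
    e.b = (f - n) / 2.0;
    return e;
}

/* Window-space depth for a normalized device depth zd. */
double window_depth(depth_range_eq e, double zd)
{
    return e.a + e.b * zd;
}
```

With the default range n = 0, f = 1 this gives a = b = 1/2, so Zd in [-1, 1] maps to Zw in [0, 1]. With the DX-style choice a = 0, b = 1, Zd in [0, 1] maps to the same window range unchanged.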
GLSL diagonal matrices
Description: Replace code like: vec4 v; mat4 m; ... m[0][0] = v.x; m[1][1] = v.y; m[2][2] = v.z; m[3][3] = v.w;
by mat4 m(v);
Benefit: Simplify the code
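For reference, the requested constructor amounts to the following, sketched here in C rather than GLSL. The `mat4` struct and the function name are hypothetical, and off-diagonal entries are assumed to be zero.

```c
typedef struct { float m[4][4]; } mat4;

/* Build a 4x4 matrix with v on the diagonal and zeros elsewhere. */
mat4 mat4_from_diagonal(const float v[4])
{
    mat4 out;
    int i, j;
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            out.m[i][j] = (i == j) ? v[i] : 0.0f;
    return out;
}
```

Passing (sx, sy, sz, 1) produces an ordinary scale matrix, which is probably the most common use of such a constructor.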
NV_texture_barrier in core
Description: Move NV_texture_barrier into core.
Benefit: Allows a limited form of programmable blending.
Real-time rendering mode
Description: Guarantees predictable and well-defined performance metrics for shaders, memory management, etc. For instance, in this mode, it would be prohibited for shaders to recompile during a draw call when some of their inputs take on specific characteristics. Well-defined memory management would mean that the application is able to control when buffers (textures, VBOs) are actually sent to VRAM.
Benefit: This would avoid large and unpredictable performance drops, which are at best very difficult to avoid. It would also provide better application control of the data transferred to the GPU to make predictable subbanded streaming of data more reliable.