Rendering Pipeline Overview
The Rendering Pipeline is the sequence of steps that OpenGL takes when rendering objects. This overview will provide a high-level description of the steps in the pipeline.
The OpenGL rendering pipeline is initiated when you perform a rendering operation. Rendering operations require the presence of a properly-defined vertex array object and a linked Program Object or Program Pipeline Object which provides the shaders for the programmable pipeline stages.
Once initiated, the pipeline operates in the following order:
- Vertex Processing:
- Vertex Post-Processing, the outputs of the last stage are adjusted or shipped to different locations.
- Scan conversion and primitive parameter interpolation, which generates a number of Fragments.
- A Fragment Shader processes each fragment. Each fragment generates a number of outputs.
- Per-Sample_Processing, including but not limited to:
The process of vertex specification is where the application sets up an ordered list of vertices to send to the pipeline. These vertices define the boundaries of a primitive.
Primitives are basic drawing shapes, like triangles, lines, and points. Exactly how the list of vertices is interpreted as primitives is handled via a later stage.
This part of the pipeline deals with a number of objects like Vertex Array Objects and Vertex Buffer Objects. Vertex Array Objects define what data each vertex has, while Vertex Buffer Objects store the actual vertex data itself.
A vertex's data is a series of attributes. Each attribute is a small set of data that the next stage will do computations on. While a set of attributes do specify a vertex, there is nothing that says that part of a vertex's attribute set needs to be a position or normal. Attribute data is entirely arbitrary; the only meaning assigned to any of it happens in the vertex processing stage.
Once the vertex data is properly specified, it is then rendered as a Primitive via a drawing command.
Vertices fetched due to the prior vertex rendering stage begin their processing here. The vertex processing stages are almost all programmable operations. This allows user code to customize the way vertices are processed. Each stage represents a different kind of shader operation.
Many of these stages are optional.
Vertex shaders perform basic processing of each individual vertex. Vertex shaders receive the attribute inputs from the vertex rendering and converts each incoming vertex into a single outgoing vertex based on an arbitrary, user-defined program.
Vertex shaders can have user-defined outputs, but there is also a special output that represents the final position of the vertex. If there are no subsequent vertex processing stages, vertex shaders are expected to fill in this position with the clip-space position of the vertex, for rendering purposes.
One limitation on vertex processing is that each input vertex must map to a specific output vertex. And because vertex shader invocations cannot share state between them, the input attributes to output vertex data mapping is 1:1. That is, if you feed the exact same attributes to the same vertex shader in the same primitive, you will get the same output vertex data. This gives implementations the right to optimize vertex processing; if they can detect that they're about to process a previously processed vertex, they can use the previously processed data stored in a post-transform cache. Thus they do not have to run the vertex processing on that data again.
Vertex shaders are not optional.
|Core in version||4.6|
|Core since version||4.0|
|Core ARB extension||ARB_tessellation_shader|
Primitives can be tessellated using two shader stages and a fixed-function tessellator between them. The Tessellation Control Shader (TCS) stage comes first, and it determines the amount of tessellation to apply to a primitive, as well as ensuring connectivity between adjacent tessellated primitives. The Tessellation Evaluation Shader (TES) stage comes last, and it applies the interpolation or other operations used to compute user-defined data values for primitives generated by the fixed-function tessellation process.
Tessellation as a process is optional. Tessellation is considered active if a TES is active. The TCS is optional, but a TCS can only be used alongside a TES.
Geometry shaders are user-defined programs that process each incoming primitive, returning zero or more output primitives.
The input primitives for geometry shaders are the output primitives from a subset of the Primitive Assembly process. So if you send a triangle strip as a single primitive, what the geometry shader will see is a series of triangles.
However, there are a number of input primitive types that are defined specifically for geometry shaders. These adjacency primitives give GS's a larger view of the primitives; they provide access to vertices of primitives adjacent to the current one.
The output of a GS is zero or more simple primitives, much like the output of primitive assembly. The GS is able to remove primitives, or tessellate them by outputting many primitives for a single input. The GS can also tinker with the vertex values themselves, either doing some of the work for the vertex shader, or just to interpolate the values when tessellating them. Geometry shaders can even convert primitives to different types; input point primitives can become triangles, or lines can become points.
Geometry shaders are optional.
After the shader-based vertex processing, vertices undergo a number of fixed-function processing steps.
The outputs of the geometry shader or primitive assembly can be written to a series of buffer objects that have been setup for this purpose. This is called transform feedback mode; it allows the user to transform data via vertex and geometry shaders, then hold on to that data for use later.
The data output into the transform feedback buffer is the data from each primitive emitted by this step.
Primitive assembly is the process of collecting a run of vertex data output from the prior stages and composing it into a sequence of primitives. The type of primitive the user rendered with determines how this process works.
The output of this process is an ordered sequence of simple primitives (lines, points, or triangles). If the input is a triangle strip primitive containing 12 vertices, for example, the output of this process will be 10 triangles.
If tessellation or geometry shaders are active, then a limited form of primitive assembly is executed before these Vertex Processing stages. This is used to feed those particular shader stages with individual primitives, rather than a sequence of vertices.
The rendering pipeline can also be aborted at this stage. This allows the use of Transform Feedback operations, without having to actually render something.
The primitives are then clipped. Clipping means that primitives that lie on the boundary between the inside of the viewing volume and the outside are split into several primitives, such that the entire primitive lies in the volume. Also, the last Vertex Processing shader stage can specify user-defined clipping operations, on a per-vertex basis.
Triangle primitives can be culled (ie: discarded without rendering) based on the triangle's facing in window space. This allows you to avoid rendering triangles facing away from the viewer. For closed surfaces, such triangles would naturally be covered up by triangles facing the user, so there is never any need to render them. Face culling is a way to avoid rendering such primitives.
Primitives that reach this stage are then rasterized in the order in which they were given. The result of rasterizing a primitive is a sequence of Fragments.
A fragment is a set of state that is used to compute the final data for a pixel (or sample if multisampling is enabled) in the output framebuffer. The state for a fragment includes its position in screen-space, the sample coverage if multisampling is enabled, and a list of arbitrary data that was output from the previous vertex or geometry shader.
This last set of data is computed by interpolating between the data values in the vertices for the fragment. The style of interpolation is defined by the shader that outputed those values.
The data for each fragment from the rasterization stage is processed by a fragment shader. The output from a fragment shader is a list of colors for each of the color buffers being written to, a depth value, and a stencil value. Fragment shaders are not able to set the stencil data for a fragment, but they do have control over the color and depth values.
Fragment shaders are optional. If you render without a fragment shader, the depth (and stencil) values of the fragment get their usual values. But the value of all of the colors that a fragment could have are undefined. Rendering without a fragment shader is useful when rendering only a primitive's default depth information to the depth buffer, such as when performing Occlusion Query tests.
The fragment data output from the fragment processor is then passed through a sequence of steps.
The first step is a sequence of culling tests; if a test is active and the fragment fails the test, the underlying pixels/samples are not updated (usually). Many of these tests are only active if the user activates them. The tests are:
- Pixel ownership test: Fails if the fragment's pixel is not "owned" by OpenGL (if another window is overlapping with the GL window). Always passes when using a Framebuffer Object. Failure means that the pixel contains undefined values.
- Scissor Test: When enabled, the test fails if the fragment's pixel lies outside of a specified rectangle of the screen.
- Stencil Test: When enabled, the test fails if the stencil value provided by the test does not compare as the user specifies against the stencil value from the underlying sample in the stencil buffer. Note that the stencil value in the framebuffer can still be modified even if the stencil test fails (and even if the depth test fails).
- Depth Test: When enabled, the test fails if the fragment's depth does not compare as the user specifies against the depth value from the underlying sample in the depth buffer.
After this, color blending happens. For each fragment color value, there is a specific blending operation between it and the color already in the framebuffer at that location. Logical Operations may also take place in lieu of blending, which perform bitwise operations between the fragment colors and framebuffer colors.
Lastly, the fragment data is written to the framebuffer. Masking operations allow the user to prevent writes to certain values. Color, depth, and stencil writes can be masked on and off; individual color channels can be masked as well.