Difference between revisions of "Geometry Shader"

From OpenGL Wiki

Revision as of 09:13, 31 January 2013

Geometry Shader
Core in version 4.6
Core since version 3.2
ARB extension ARB_geometry_shader4

A Geometry Shader (GS) is a Shader program written in GLSL that governs the processing of primitives. It happens after primitive assembly, as an additional optional step in that part of the pipeline. A GS can create new primitives, unlike vertex shaders, which are limited to a 1:1 input to output ratio. A GS can also do layered rendering; this means that the GS can specifically say that a primitive is to be rendered to a particular layer of the framebuffer.

A geometry shader is optional and does not have to be used.

Note: While geometry shaders have had previous extensions like GL_EXT_geometry_shader4 and GL_ARB_geometry_shader4, these extensions expose the API and GLSL functionality in very different ways from the core feature. This page describes only the core feature.


Geometry shaders sit between Vertex Shaders (or the optional Tessellation stage) and the fixed-function Vertex Post-Processing stage. Vertex shaders have a 1:1 ratio of vertices input to vertices output. Each vertex shader invocation gets one vertex in and writes one vertex out.

Geometry shader invocations take a single Primitive as input and may output zero or more primitives. There are implementation-defined limits on how many primitives can be generated from a single GS invocation.

While the GS can be used to amplify geometry, implementing a form of tessellation, this is not the primary use for the feature. The general uses for GS's are:

  • Layered rendering: taking one primitive and rendering it to multiple images without having to change bound rendertargets and so forth.
  • Transform Feedback: This is often employed for doing computational tasks on the GPU (obviously pre-Compute Shader).

In OpenGL 4.0, GS's gained two new features. The first is the ability to write to multiple output streams. This is used exclusively with transform feedback, such that different feedback buffer sets can get different transform feedback data.

The other feature was GS instancing, which allows multiple invocations to operate over the same input primitive. This makes layered rendering easier to implement and possibly faster performing, as each layer's primitive(s) can be computed by a separate GS instance.

Primitive in/out specification

Each geometry shader is designed to accept a specific Primitive type as input and to output a specific primitive type. The accepted input primitive type is defined in the shader:

layout(input_primitive​) in;

The input_primitive​ type must match the primitive type used with the rendering command that renders with this shader program. The valid values for input_primitive​, along with the valid OpenGL primitive types, are:

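GS input             OpenGL primitives                                    vertex count
points               GL_POINTS                                            1
lines                GL_LINES, GL_LINE_STRIP, GL_LINE_LOOP                2
lines_adjacency      GL_LINES_ADJACENCY, GL_LINE_STRIP_ADJACENCY          4
triangles            GL_TRIANGLES, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN     3
triangles_adjacency  GL_TRIANGLES_ADJACENCY, GL_TRIANGLE_STRIP_ADJACENCY  6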

The vertex count is the number of vertices that the GS receives per-input primitive.

The output primitive type is defined as follows:

layout(output_primitive​, max_vertices = vert_count​) out;

The output_primitive​ may be one of the following:

  • points
  • line_strip
  • triangle_strip

These work exactly the same way their counterpart OpenGL rendering modes do. To output individual triangles or lines, simply use EndPrimitive (see below) after emitting each set of 3 or 2 vertices.

There must be a max_vertices declaration for the output. The number must be a compile-time constant, and it defines the maximum number of vertices that will be written by a single invocation of the GS. It may be no larger than the implementation-defined limit of GL_MAX_GEOMETRY_OUTPUT_VERTICES. Implementations must support a limit of at least 256. See the limitations below.
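As a minimal sketch of how the two layout declarations fit together, the following pass-through shader accepts triangles and re-emits each one unchanged. The choice of triangles and of max_vertices = 3 is an assumption made for illustration only.

```glsl
#version 150 core

// Input: one triangle per invocation. This matches draws made with
// GL_TRIANGLES, GL_TRIANGLE_STRIP or GL_TRIANGLE_FAN.
layout(triangles) in;

// Output: a triangle strip of at most 3 vertices, i.e. a single triangle.
layout(triangle_strip, max_vertices = 3) out;

void main()
{
    // Copy each input vertex position to the output and emit it.
    for (int i = 0; i < 3; ++i)
    {
        gl_Position = gl_in[i].gl_Position;
        EmitVertex();
    }

    // Finish the strip so the three vertices form one triangle.
    EndPrimitive();
}
```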


GS Instancing
Core in version 4.0
Core ARB extension ARB_gpu_shader5

The GS can also be instanced (note that this is different from instanced rendering). This causes the GS to execute multiple times for the same primitive. This is useful for layered rendering and outputs to multiple streams (see below).

To use instancing, there must be an input layout qualifier:

layout(invocations = num_instances​) in;

The value of num_instances is a compile-time constant, and must not be larger than GL_MAX_GEOMETRY_SHADER_INVOCATIONS (implementations must support at least 32). The built-in value gl_InvocationID specifies the particular instance of this shader.

The output primitives from instances are ordered by gl_InvocationID. So 2 input primitives, each processed by 3 instances, will create a primitive stream of: (prim0, inst0), (prim0, inst1), (prim0, inst2), (prim1, inst0), ...
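A small sketch of GS instancing, assuming three instances that each emit a shifted copy of the input triangle. The offset scheme is hypothetical and only serves to make each instance's output distinguishable.

```glsl
#version 400 core

// Run this shader 3 times for every input triangle.
layout(triangles, invocations = 3) in;
layout(triangle_strip, max_vertices = 3) out;

void main()
{
    // Hypothetical per-instance shift along X: 0.0, 0.5, 1.0.
    float shift = float(gl_InvocationID) * 0.5;

    for (int i = 0; i < 3; ++i)
    {
        gl_Position = gl_in[i].gl_Position + vec4(shift, 0.0, 0.0, 0.0);
        EmitVertex();
    }
    EndPrimitive();
}
```

Note that max_vertices applies to each invocation separately, so each instance here may emit up to 3 vertices.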


Geometry shaders take a primitive as input; each primitive is composed of some number of vertices.

The outputs of the vertex shader (or Tessellation Stage, as appropriate) are thus fed to the GS as arrays of variables. These can be organized as individual values or as part of an interface block.

Geometry shader inputs may have interpolation qualifiers on them.

The predefined outputs from the prior pipeline stage are already arranged in an interface block.

in gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
} gl_in[];

The length of gl_in[] is the number of vertices in the input primitive, as given by the input layout qualifier. All input arrays have this same length.

The order of vertices in input arrays corresponds to the order of the vertices specified by prior shader stages.
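The following sketch shows how inputs might be declared and read as arrays. The names vColor, VertexData, texCoord, gColor and gTexCoord are hypothetical; they assume a vertex shader that declares matching outputs.

```glsl
#version 150 core

layout(triangles) in;
layout(triangle_strip, max_vertices = 3) out;

// Hypothetical single-value input: the vertex shader's "out vec3 vColor;"
// arrives here as an array with one element per input vertex.
in vec3 vColor[];

// Hypothetical interface block input, also arrayed per vertex.
in VertexData
{
    vec2 texCoord;
} vsOut[];

// Outputs are not arrays; they hold the values of the vertex being built.
out vec3 gColor;
out vec2 gTexCoord;

void main()
{
    // gl_in.length() is 3 here because the input layout is triangles.
    for (int i = 0; i < gl_in.length(); ++i)
    {
        gl_Position = gl_in[i].gl_Position;
        gColor      = vColor[i];
        gTexCoord   = vsOut[i].texCoord;
        EmitVertex();
    }
    EndPrimitive();
}
```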

Primitive inputs

There are some GS input values that are based on primitives, not vertices. These are:

in int gl_PrimitiveIDIn;
in int gl_InvocationID;  //Requires GLSL 4.0 or ARB_gpu_shader5

gl_PrimitiveIDIn is the current input primitive's ID, based on the number of primitives processed by this shader since the current rendering command started. gl_InvocationID is the current instance, as defined when instancing geometry shaders.


Geometry shaders can output as many vertices as they wish (up to the maximum specified by the max_vertices layout specification). To provide this, output values in geometry shaders are not arrays. Instead, a function-based interface is used.

GS code writes all of the output values for a vertex, then calls EmitVertex(). This tells the system to write those output values to wherever output vertices get written. After calling this function, all output variables contain undefined values.

The GS defines what kind of primitive these vertex outputs represent. The GS can also end a primitive and start a new one, by calling the EndPrimitive() function. This does not emit a vertex.

In order to write two independent triangles from a GS, you must write three separate vertices with EmitVertex() for the first three vertices, then call EndPrimitive() to end the strip and start a new one. Then you write three more vertices with EmitVertex().
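A minimal sketch of that sequence, assuming a points input and two small triangles placed to the left and right of the input point. The helper emitTriangle and the offsets are hypothetical.

```glsl
#version 150 core

layout(points) in;
// Two triangles of 3 vertices each.
layout(triangle_strip, max_vertices = 6) out;

// Hypothetical helper: emits one small triangle around "base" and
// then ends the strip so the next triangle is independent.
void emitTriangle(vec4 base)
{
    gl_Position = base + vec4(-0.1, -0.1, 0.0, 0.0);
    EmitVertex();
    gl_Position = base + vec4( 0.1, -0.1, 0.0, 0.0);
    EmitVertex();
    gl_Position = base + vec4( 0.0,  0.1, 0.0, 0.0);
    EmitVertex();

    // Without this call, the next vertex would extend the same strip.
    EndPrimitive();
}

void main()
{
    vec4 p = gl_in[0].gl_Position;
    emitTriangle(p + vec4(-0.2, 0.0, 0.0, 0.0)); // first triangle
    emitTriangle(p + vec4( 0.2, 0.0, 0.0, 0.0)); // second triangle
}
```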

Output variables are defined as normal for GLSL. They can be grouped into interface blocks or be single values, as appropriate.

Many of the predefined outputs are grouped into an interface block called gl_PerVertex:

out gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
};

Output variables can be defined with interpolation qualifiers. The Fragment Shader equivalent interface variables should define the same variables with the same qualifiers.

Certain predefined outputs have special meaning and semantics.

out int gl_PrimitiveID;

The primitive ID will be passed to the fragment shader. The primitive ID for a particular line/triangle will be taken from the provoking vertex of that line/triangle, so make sure that you are writing the correct value for the right provoking vertex.

The meaning for this value is whatever you want it to be. However, if you want to match the standard OpenGL meaning (i.e., what the Fragment Shader would get if no GS were used), just do this for each vertex before emitting it:

gl_PrimitiveID = gl_PrimitiveIDIn;

Layered rendering

Layered rendering is the process of having the GS send specific primitives to different layers of a layered framebuffer. This can be useful for doing cube-based shadow mapping, or even for rendering cube environment maps without having to render the entire scene multiple times.

Layered rendering in the GS works via two special output variables:

out int gl_Layer;
out int gl_ViewportIndex; //Requires GL 4.1 or ARB_viewport_array.

The gl_Layer output defines which layer in the layered image the primitive goes to. Each vertex in the primitive must get the same layer index. Note that when rendering to cubemap arrays, the gl_Layer value represents layer-faces (the faces within a layer), not the layers of cubemaps.

gl_ViewportIndex, which requires GL 4.1 or ARB_viewport_array, specifies which viewport index to use with this primitive.
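As a sketch of layered rendering without instancing, the shader below sends one copy of each triangle to each of the six faces of a cube map. The uniform faceViewProj is hypothetical, and the code assumes the vertex shader wrote a world-space position to gl_Position.

```glsl
#version 150 core

layout(triangles) in;
// 6 faces times 3 vertices per triangle.
layout(triangle_strip, max_vertices = 18) out;

// Hypothetical uniform: one view-projection matrix per cube face.
uniform mat4 faceViewProj[6];

void main()
{
    for (int face = 0; face < 6; ++face)
    {
        for (int i = 0; i < 3; ++i)
        {
            // Outputs are undefined after EmitVertex(), so gl_Layer is
            // written again for every vertex, always with the same
            // value within one primitive.
            gl_Layer    = face;
            gl_Position = faceViewProj[face] * gl_in[i].gl_Position;
            EmitVertex();
        }
        // One independent triangle per layer.
        EndPrimitive();
    }
}
```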

Note: ARB_viewport_array, while technically a 4.1 feature, is widely available on 3.3 hardware, from both NVIDIA and AMD.

Layered rendering can be more efficient with GS instancing, as different GS invocations can each process a separate layer in parallel. However, while ARB_viewport_array is often implemented in 3.3 hardware, no 3.3 hardware provides ARB_gpu_shader5 support.
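A sketch of the instanced approach, assuming a cube map target with one GS instance per face. The uniform faceViewProj is hypothetical, and the vertex shader is assumed to pass a world-space position in gl_Position.

```glsl
#version 400 core

// One invocation per cube face; each handles the same input triangle.
layout(triangles, invocations = 6) in;
layout(triangle_strip, max_vertices = 3) out;

// Hypothetical uniform: one view-projection matrix per cube face.
uniform mat4 faceViewProj[6];

void main()
{
    for (int i = 0; i < 3; ++i)
    {
        // The invocation ID doubles as the layer index, and is
        // written for every vertex of the primitive.
        gl_Layer    = gl_InvocationID;
        gl_Position = faceViewProj[gl_InvocationID] * gl_in[i].gl_Position;
        EmitVertex();
    }
    EndPrimitive();
}
```

Each invocation only needs max_vertices = 3 here, since it emits a single triangle for its own layer.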

Which vertex

gl_Layer and gl_ViewportIndex are per-vertex parameters, but they specify a property that applies to the entire primitive. Therefore, a question arises: which vertex in a particular primitive defines that primitive's layer and viewport index?

The answer is that it is implementation-dependent. However, OpenGL does have two queries to determine which one the current implementation uses: GL_LAYER_PROVOKING_VERTEX and GL_VIEWPORT_INDEX_PROVOKING_VERTEX.

The value returned from glGetIntegerv will be one of the following enumerators:

  • GL_PROVOKING_VERTEX: The vertex used will track the current provoking vertex convention.
  • GL_LAST_VERTEX_CONVENTION​: The vertex used will be the one defined by the last-vertex provoking vertex convention.
  • GL_FIRST_VERTEX_CONVENTION​: The vertex used will be the one defined by the first-vertex provoking vertex convention.
  • GL_UNDEFINED_VERTEX: The implementation isn't saying.

For maximum portability, you will have to provide the same layer and viewport index to every vertex of a primitive. So if you wanted to output a triangle strip where different triangles had different indices, too bad. You have to split it into separate primitives.

Output streams

Core in version 4.6
Core ARB extension ARB_transform_feedback3

When using Transform Feedback to compute values, it is often useful to be able to send different sets of vertices to different buffers at different rates. For example, GS's can send vertex data to one stream, while building per-instance data in another stream. The vertex data and per-instance data will be of different lengths, written at different speeds.

Multiple stream output requires that the output primitive type be points. You can still take whatever input you prefer.

To provide this, output variables can be given a stream index with a layout qualifier:

layout(stream = stream_index​) out vec4 some_output;

The stream_index​ ranges from 0 to GL_MAX_VERTEX_STREAMS - 1.

A default value for the stream can be set with:

layout(stream = 2) out;

All following out variables will use stream 2 unless they specify a stream. The default can be changed later. The initial default is 0.

To write a vertex to a particular stream, the function EmitStreamVertex is used. This function takes a stream index; only the output variables assigned to that stream are written. Similarly, EndStreamPrimitive ends a particular stream's primitive. However, since multiple stream output requires using points primitives, the latter function is not very useful.

Only values passed to stream 0 will actually be rendered; the rest of the streams will only matter if transform feedback is being used. Calling EmitVertex or EndPrimitive is equivalent to calling their stream counterparts with stream 0.
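A minimal sketch of writing to two streams, assuming points in and points out. The output names position0 and weight1, and the value computed for stream 1, are hypothetical.

```glsl
#version 400 core

layout(points) in;
// One vertex for stream 0 plus one for stream 1.
layout(points, max_vertices = 2) out;

// Hypothetical outputs, each bound to its own stream.
layout(stream = 0) out vec4  position0;
layout(stream = 1) out float weight1;

void main()
{
    // Stream 0: the only stream that can be rasterized, so
    // gl_Position is written as well.
    gl_Position = gl_in[0].gl_Position;
    position0   = gl_in[0].gl_Position;
    EmitStreamVertex(0);

    // Stream 1: only reaches a buffer if transform feedback is active.
    weight1 = length(gl_in[0].gl_Position.xyz);
    EmitStreamVertex(1);
}
```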

Output limitations

There are two competing limitations on the output of a geometry shader:

  1. The maximum number of vertices that a single invocation of a GS can output.
  2. The total maximum number of output components that a single invocation of a GS can output.

The first limit, defined by GL_MAX_GEOMETRY_OUTPUT_VERTICES, is the maximum number that can be provided to the max_vertices​ output layout qualifier. No single geometry shader invocation can exceed this number.

The other limit, defined by GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS, is, in layman's terms, the total amount of stuff that a single GS invocation can write. It is the total number of output values that a single GS invocation can write to. (A component, in GLSL terms, is one element of a vector, so a float is 1 component and a vec3 is 3 components.) This is different from GL_MAX_GEOMETRY_OUTPUT_COMPONENTS, the maximum allowed number of components in out variables. The total output component count is the number of components per vertex multiplied by the number of vertices that can be written.

For example, if the total output component count is 1024 (the smallest maximum value from GL 4.3), and each output vertex writes 12 components, the total number of vertices that can be written is 1024/12 = 85, rounded down. This is the absolute hard limit to the number of vertices that can be written; even if GL_MAX_GEOMETRY_OUTPUT_VERTICES is larger than 85, because each vertex takes up 12 components, the true maximum that this particular geometry shader can write is 85 vertices.
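The arithmetic in this example can be written as a general bound. The symbols are shorthand introduced here for illustration: V is GL_MAX_GEOMETRY_OUTPUT_VERTICES, T is GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS, and C is the number of components written per vertex.

```latex
\[
N_{\max} = \min\!\left(V,\ \left\lfloor \frac{T}{C} \right\rfloor\right),
\qquad
T = 1024,\ C = 12
\;\Rightarrow\;
\left\lfloor \frac{1024}{12} \right\rfloor
= \lfloor 85.33\ldots \rfloor = 85
\]
```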

See also