Tessellation Control Shader

From OpenGL Wiki
Core in version 4.6
Core since version 4.0
Core ARB extension ARB_tessellation_shader

The Tessellation Control Shader (TCS) is a Shader program written in GLSL. It sits between the Vertex Shader and the Tessellation Evaluation Shader.

The TCS controls how much tessellation a particular patch gets; it also defines the size of a patch, thus allowing it to augment data. It can also filter vertex data taken from the vertex shader. The main purpose of the TCS is to feed the tessellation levels to the Tessellation primitive generator stage, as well as to feed patch data (as its output values) to the Tessellation Evaluation Shader stage.

Execution model

The TCS execution model is different from most other shader stages; it is most similar to Compute Shaders. Unlike Geometry Shaders, where each invocation can output multiple primitives, each TCS invocation is only responsible for producing a single vertex of output to the output patch.

For each patch provided during rendering, n TCS invocations will be processed, where n is the number of vertices in the output patch. So if a drawing command draws 20 patches, and each output patch has 4 vertices, there will be a total of 80 separate TCS invocations.

The different invocations that provide data to the same patch are interconnected. These invocations all share their output values. They can read output values that other invocations for the same patch have written to. But in order to do so, they must use a synchronization mechanism to ensure that all other invocations for the patch have executed at least that far.

Because of this, it is possible for TCS invocations to share data and communicate with one another.

Output patch size

The output patch size is the number of vertices in the output patch. It also determines the number of TCS invocations used to compute this patch data. The output patch size does not have to match the input patch size.

The number of vertices in the output patch is defined with an output layout qualifier:

layout(vertices = patch_size) out;

patch_size must be an integral constant expression greater than zero and no greater than the maximum patch size (see below).


Inputs

All inputs from vertex shaders to the TCS are aggregated into arrays, based on the size of the input patch. The size of these arrays is the number of input vertices provided by the patch primitive.

Every TCS invocation for an input patch has access to the same input data, save for gl_InvocationID (see below) which will be different for each invocation. So any TCS invocation can look at the input vertex data for the entire input patch.

User-defined inputs can be declared as unbounded arrays:

in vec2 texCoord[];

You should not attempt to index this array past the number of input patch vertices.

TCS inputs may have interpolation qualifiers on them, but the qualifiers have no effect.


Tessellation Control Shaders provide the following built-in input variables:

 in int gl_PatchVerticesIn;
 in int gl_PrimitiveID;
 in int gl_InvocationID;

gl_PatchVerticesIn
    the number of vertices in the input patch.
gl_PrimitiveID
    the index of the current patch within this rendering command.
gl_InvocationID
    the index of the TCS invocation within this patch. A TCS invocation writes to per-vertex output variables by using this to index them.

The TCS also takes the built-in variables output by the vertex shader:

in gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
} gl_in[gl_MaxPatchVertices];

Note that just because gl_in is defined to have gl_MaxPatchVertices entries does not mean that you can access beyond gl_PatchVerticesIn and get reasonable values. These variables have only the meaning the vertex shader that passed them gave them.
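As a rough illustration of how the input built-ins fit together, the sketch below has every invocation loop over the whole input patch and then write its own output vertex. The 3-vertex output patch, the assumption that the input patch has at least 3 vertices, the 10% shrink factor, and the constant tessellation levels are all choices made for this example only.

```glsl
#version 400 core

// Assumed for illustration: 3 output vertices, fed by input patches
// that contain at least 3 vertices.
layout(vertices = 3) out;

void main()
{
    // Every invocation sees the entire input patch. Only indices below
    // gl_PatchVerticesIn hold meaningful data, so the loop stops there.
    vec4 center = vec4(0.0);
    for (int i = 0; i < gl_PatchVerticesIn; ++i)
        center += gl_in[i].gl_Position;
    center /= float(gl_PatchVerticesIn);

    // gl_InvocationID selects the one output vertex this invocation owns.
    // Here it is moved 10% of the way toward the patch center.
    gl_out[gl_InvocationID].gl_Position =
        mix(gl_in[gl_InvocationID].gl_Position, center, 0.1);

    // Arbitrary constant levels so the shader is complete.
    gl_TessLevelOuter[0] = 1.0;
    gl_TessLevelOuter[1] = 1.0;
    gl_TessLevelOuter[2] = 1.0;
    gl_TessLevelInner[0] = 1.0;
}
```

Each invocation repeats the same averaging loop, which is redundant work but needs no synchronization because only inputs are read.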


Outputs

TCS output variables are passed directly to the Tessellation Evaluation Shader, without any form of interpolation (that's the TES's main job). These can be per-vertex outputs and per-patch outputs.

Per-vertex outputs are aggregated into arrays. Therefore, a user-defined per-vertex output variable would be defined as such:

out vec2 vertexTexCoord[];

The length of the array (vertexTexCoord.length()) will always be the size of the output patch. So you don't need to restate it in the definition.

A TCS invocation can only ever write to the per-vertex output variable that corresponds to its own invocation. So writes to per-vertex outputs must be of the form vertexTexCoord[gl_InvocationID]. Any expression that writes to a per-vertex output that doesn't index it with exactly "gl_InvocationID" results in a compile-time error. Silly things like vertexTexCoord[gl_InvocationID - 1 + 1] will also error.
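A minimal pass-through sketch of this rule follows. It reuses the texCoord and vertexTexCoord names from above and assumes, for illustration only, a 4-vertex output patch fed by input patches of at least 4 vertices, with arbitrary constant tessellation levels.

```glsl
#version 400 core

// Assumed for illustration: 4 output vertices per patch.
layout(vertices = 4) out;

in vec2 texCoord[];         // per-vertex input, sized by the input patch
out vec2 vertexTexCoord[];  // per-vertex output, sized by layout(vertices = 4)

void main()
{
    // Per-vertex outputs are written only at index gl_InvocationID.
    vertexTexCoord[gl_InvocationID] = texCoord[gl_InvocationID];
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Arbitrary constant levels so the shader is complete.
    gl_TessLevelOuter[0] = 4.0;
    gl_TessLevelOuter[1] = 4.0;
    gl_TessLevelOuter[2] = 4.0;
    gl_TessLevelOuter[3] = 4.0;
    gl_TessLevelInner[0] = 4.0;
    gl_TessLevelInner[1] = 4.0;
}
```

Reading from the input arrays is not restricted in the same way: any invocation may index texCoord or gl_in with any valid input vertex index.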

Patch variables

Per-patch output variables are not aggregated into arrays (unless you want them to be, in which case you must specify a size). All TCS invocations for this patch see the same patch variables. They are declared with the patch keyword:

patch out vec4 data;

Any TCS invocation can write to a per-patch output; indeed, all TCS invocations will generally write to a per-patch output. As long as they all write the same value, everything is fine.
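The sketch below shows one way this can look in practice. The patch variable data comes from the declaration above; the choice to fill it from the first input vertex, the 4-vertex patch size, and the constant tessellation levels are assumptions made for the example.

```glsl
#version 400 core

// Assumed for illustration: 4 output vertices, input patches of at least 4.
layout(vertices = 4) out;

patch out vec4 data;  // one value shared by the whole patch

void main()
{
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // The value depends only on data that is identical for every
    // invocation of this patch, so all four invocations write the same thing.
    data = gl_in[0].gl_Position;

    // Arbitrary constant levels so the shader is complete.
    gl_TessLevelOuter[0] = 2.0;
    gl_TessLevelOuter[1] = 2.0;
    gl_TessLevelOuter[2] = 2.0;
    gl_TessLevelOuter[3] = 2.0;
    gl_TessLevelInner[0] = 2.0;
    gl_TessLevelInner[1] = 2.0;
}
```

Another commonly seen arrangement is to wrap the per-patch writes in if (gl_InvocationID == 0) so that a single invocation performs them; either approach leaves the patch variable with one well-defined value.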

Built-in outputs


Tessellation Control Shaders have the following built-in patch output variables:

patch out float gl_TessLevelOuter[4];
patch out float gl_TessLevelInner[2];

These define the outer and inner tessellation levels used by the tessellation primitive generator. They define how much tessellation to apply to the patch. Their exact meaning depends on the type of patch (and other settings) defined in the Tessellation Evaluation Shader.

Note: If any of the outer levels used by the abstract patch type is 0 or negative (or NaN), then the patch will be discarded by the generator, and no TES invocations for this patch will result.

As with any other patch variable, multiple TCS invocations for the same patch can write to the same tessellation level variable, so long as they are all computing and writing the exact same value.
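As a hedged sketch, the shader below drives all levels from a single value. The uniform name tessLevel is hypothetical, and the example assumes a 4-vertex patch that a Tessellation Evaluation Shader treats as a quad, so that all four outer and both inner levels are used.

```glsl
#version 400 core

// Assumed for illustration: 4 output vertices, interpreted as a quad by the TES.
layout(vertices = 4) out;

// Hypothetical uniform supplied by the application.
uniform float tessLevel;

void main()
{
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Keep the level at 1.0 or above so that a zero or negative
    // uniform value does not cause the patch to be discarded.
    float level = max(tessLevel, 1.0);

    // Every invocation computes and writes exactly the same values.
    gl_TessLevelOuter[0] = level;
    gl_TessLevelOuter[1] = level;
    gl_TessLevelOuter[2] = level;
    gl_TessLevelOuter[3] = level;
    gl_TessLevelInner[0] = level;
    gl_TessLevelInner[1] = level;
}
```

Because the levels are derived from a uniform, every invocation necessarily arrives at the same result, which satisfies the requirement that all writers agree.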

TCSs also provide the following optional per-vertex output variables:

out gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
} gl_out[];

The use of any of these in a TCS is completely optional. Indeed, their semantics will generally be of no practical value to the TCS. They have the same general meaning as for vertex shaders, but since a TCS must always be followed by an evaluation shader, the TCS never has to write to any of them.


Synchronization

TCS invocations that operate on the same patch can read each other's output variables, whether per-patch or per-vertex. To do so, they must first ensure that those invocations have actually written to those variables. The value of all output variables is undefined initially.

Ensuring that invocations have written to a variable requires synchronization between invocations. This is done via the barrier() function. When executed, it will not complete until all other TCS invocations for this patch have reached that barrier. This means that all writes have occurred by this point. However, subsequent writes to those variables may have occurred, so if you want to read those variables, make sure that another barrier() is issued before writing more to them. If there are no subsequent writes to those variables, then this should be fine.

The barrier() function has significant restrictions on where it can be placed. It must be placed:

  • Directly in the main() function. It cannot be in any other functions or subroutines.
  • Outside of any flow control. This includes if, for, switch, and the like.
  • Before any use of return, even a conditional one.

This ensures that every TCS invocation hits the same sequence of barrier() calls in the same order every time. The compiler will error if any of these restrictions are violated.
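The following sketch illustrates the write, barrier, read pattern under these placement rules. The 4-vertex patch, the edge-length heuristic, the scale factor of 8.0, and the clamp range are all illustrative assumptions; the example also does not try to match outer-level indices to specific patch edges.

```glsl
#version 400 core

// Assumed for illustration: 4 output vertices, input patches of at least 4.
layout(vertices = 4) out;

void main()
{
    // Phase 1: each invocation writes its own output vertex.
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Directly in main(), outside any flow control, before any return.
    barrier();

    // Phase 2: outputs written by the other invocations can now be read.
    int next = (gl_InvocationID + 1) % 4;
    float edgeLength = distance(gl_out[gl_InvocationID].gl_Position.xyz,
                                gl_out[next].gl_Position.xyz);

    // Each invocation writes a different element, so no two writers disagree.
    gl_TessLevelOuter[gl_InvocationID] = clamp(edgeLength * 8.0, 1.0, 16.0);

    // Second barrier: all four outer levels are written past this point.
    barrier();

    // Every invocation reads the same outer levels and writes the same inner levels.
    float inner = max(max(gl_TessLevelOuter[0], gl_TessLevelOuter[1]),
                      max(gl_TessLevelOuter[2], gl_TessLevelOuter[3]));
    gl_TessLevelInner[0] = inner;
    gl_TessLevelInner[1] = inner;
}
```

Nothing is written to gl_out or gl_TessLevelOuter after the barrier that precedes the reads of each, so no further barrier() calls are needed.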

Note: This is different from the restrictions on barrier() in Compute Shaders. Also, writes to TCS output variables do not use the rules for Incoherent Memory Access, so they do not need those memory barrier calls.


Limitations

There is a maximum output patch size, defined by GL_MAX_PATCH_VERTICES; the vertices output qualifier must not exceed this value. The minimum required limit is 32.

There are other limitations on output size, however. The number of components for active per-vertex output variables may not exceed GL_MAX_TESS_CONTROL_OUTPUT_COMPONENTS. The minimum required limit is 128.

The number of components for active per-patch output variables may not exceed GL_MAX_TESS_PATCH_COMPONENTS. The minimum required limit is 120. Note that the gl_TessLevelOuter and gl_TessLevelInner outputs do not count against this limit (but other built-in outputs do if you use them).

There is a limit on the total number of components that can go into an output patch. To compute the total number of components, multiply the number of active per-vertex components by the number of output vertices, then add the number of active per-patch components. This number may not exceed GL_MAX_TESS_CONTROL_TOTAL_OUTPUT_COMPONENTS. The minimum required limit is 4096, which is not quite enough to use a 32-vertex patch with 128 per-vertex components and 120 per-patch components (32 × 128 + 120 = 4216). But it's still a lot.
