30. Device-Generated Commands

This chapter discusses the generation of command buffer content on the device, for which these principle steps are to be taken:

vkCmdPreprocessGeneratedCommandsNV executes in a separate logical pipeline from either graphics or compute. When preprocessing commands in a separate step they must be explicitly synchronized against the command execution. When not preprocessing, the preprocessing is automatically synchronized against the command execution.

30.1. Indirect Commands Layout

The device-side command generation happens through an iterative processing of an atomic sequence comprised of command tokens, which are represented by:

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkIndirectCommandsLayoutNV)

30.1.1. Creation and Deletion

Indirect command layouts are created by:

VkResult vkCreateIndirectCommandsLayoutNV(
    VkDevice                                    device,
    const VkIndirectCommandsLayoutCreateInfoNV* pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkIndirectCommandsLayoutNV*                 pIndirectCommandsLayout);
  • device is the logical device that creates the indirect command layout.

  • pCreateInfo is a pointer to an instance of the VkIndirectCommandsLayoutCreateInfoNV structure containing parameters affecting creation of the indirect command layout.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pIndirectCommandsLayout points to a VkIndirectCommandsLayoutNV handle in which the resulting indirect command layout is returned.

Valid Usage (Implicit)
Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkIndirectCommandsLayoutCreateInfoNV structure is defined as:

typedef struct VkIndirectCommandsLayoutCreateInfoNV {
    VkStructureType                           sType;
    const void*                               pNext;
    VkIndirectCommandsLayoutUsageFlagsNV      flags;
    VkPipelineBindPoint                       pipelineBindPoint;
    uint32_t                                  tokenCount;
    const VkIndirectCommandsLayoutTokenNV*    pTokens;
    uint32_t                                  streamCount;
    const uint32_t*                           pStreamStrides;
} VkIndirectCommandsLayoutCreateInfoNV;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • pipelineBindPoint is the VkPipelineBindPoint that this layout targets.

  • flags is a bitmask of VkIndirectCommandsLayoutUsageFlagBitsNV specifying usage hints of this layout.

  • tokenCount is the length of the individual command sequence.

  • pTokens is an array describing each command token in detail. See VkIndirectCommandsTokenTypeNV and VkIndirectCommandsLayoutTokenNV below for details.

  • streamCount is the number of streams used to provide the token inputs.

  • pStreamStrides is an array defining the byte stride for each input stream.

The following code illustrates some of the flags:

void cmdProcessAllSequences(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsTokens, sequencesCount, indexbuffer, indexbufferOffset)
{
  for (s = 0; s < sequencesCount; s++)
  {
    sUsed = s;

    if (indirectCommandsLayout.flags & VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV) {
      sUsed = indexbuffer.load_uint32( sUsed * sizeof(uint32_t) + indexbufferOffset);
    }

    if (indirectCommandsLayout.flags & VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NV) {
      sUsed = incoherent_implementation_dependent_permutation[ sUsed ];
    }

    cmdProcessSequence( cmd, pipeline, indirectCommandsLayout, pIndirectCommandsTokens, sUsed );
  }
}
Valid Usage
  • The pipelineBindPoint must be VK_PIPELINE_BIND_POINT_GRAPHICS

  • tokenCount must be greater than 0 and less than or equal to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::maxIndirectCommandsTokenCount

  • If pTokens contains an entry of VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV it must be the first element of the array and there must be only a single element of such token type.

  • If pTokens contains an entry of VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV there must be only a single element of such token type.

  • All state tokens in pTokens must occur prior work provoking tokens (VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV, VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV, VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_TASKS_NV).

  • The content of pTokens must include one single work provoking token that is compatible with the pipelineBindPoint.

  • streamCount must be greater than 0 and less or equal to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::maxIndirectCommandsStreamCount

  • each element of pStreamStrides must be greater than `0`and less than or equal to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::maxIndirectCommandsStreamStride. Furthermore the alignment of each token input must be ensured.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_INDIRECT_COMMANDS_LAYOUT_CREATE_INFO_NV

  • pNext must be NULL

  • flags must be a valid combination of VkIndirectCommandsLayoutUsageFlagBitsNV values

  • flags must not be 0

  • pipelineBindPoint must be a valid VkPipelineBindPoint value

  • pTokens must be a valid pointer to an array of tokenCount valid VkIndirectCommandsLayoutTokenNV structures

  • pStreamStrides must be a valid pointer to an array of streamCount uint32_t values

  • tokenCount must be greater than 0

  • streamCount must be greater than 0

Bits which can be set in VkIndirectCommandsLayoutCreateInfoNV::flags, specifying usage hints of an indirect command layout, are:

typedef enum VkIndirectCommandsLayoutUsageFlagBitsNV {
    VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EXPLICIT_PREPROCESS_BIT_NV = 0x00000001,
    VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV = 0x00000002,
    VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NV = 0x00000004,
    VK_INDIRECT_COMMANDS_LAYOUT_USAGE_FLAG_BITS_MAX_ENUM_NV = 0x7FFFFFFF
} VkIndirectCommandsLayoutUsageFlagBitsNV;
  • VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EXPLICIT_PREPROCESS_BIT_NV specifies that the layout is always used with the manual preprocessing step through calling vkCmdPreprocessGeneratedCommandsNV and executed by vkCmdExecuteGeneratedCommandsNV with isPreprocessed set to VK_TRUE.

  • VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV specifies that the input data for the sequences is not implicitly indexed from 0..sequencesUsed but a user provided VkBuffer encoding the index is provided.

  • VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NV specifies that the processing of sequences can happen at an implementation-dependent order, which is not: guaranteed to be coherent using the same input data.

typedef VkFlags VkIndirectCommandsLayoutUsageFlagsNV;

VkIndirectCommandsLayoutUsageFlagsNV is a bitmask type for setting a mask of zero or more VkIndirectCommandsLayoutUsageFlagBitsNV.

Indirect command layouts are destroyed by:

void vkDestroyIndirectCommandsLayoutNV(
    VkDevice                                    device,
    VkIndirectCommandsLayoutNV                  indirectCommandsLayout,
    const VkAllocationCallbacks*                pAllocator);
  • device is the logical device that destroys the layout.

  • indirectCommandsLayout is the layout to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All submitted commands that refer to indirectCommandsLayout must have completed execution

  • If VkAllocationCallbacks were provided when indirectCommandsLayout was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when indirectCommandsLayout was created, pAllocator must be NULL

  • The VkPhysicalDeviceDeviceGeneratedCommandsFeaturesNVdeviceGeneratedCommands feature must be enabled

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • indirectCommandsLayout must be a valid VkIndirectCommandsLayoutNV handle

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • indirectCommandsLayout must have been created, allocated, or retrieved from device

30.1.2. Token Input Streams

The VkIndirectCommandsStreamNV structure specifies the input data for one or more tokens at processing time.

typedef struct VkIndirectCommandsStreamNV {
    VkBuffer        buffer;
    VkDeviceSize    offset;
} VkIndirectCommandsStreamNV;
  • buffer specifies the VkBuffer storing the functional arguments for each sequence. These arguments can be written by the device.

  • offset specified an offset into buffer where the arguments start.

Valid Usage
  • The buffer’s usage flag must have the VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT bit set.

  • The offset must be aligned to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::minIndirectCommandsBufferOffsetAlignment.

Valid Usage (Implicit)
  • buffer must be a valid VkBuffer handle

The input streams can contain raw uint32_t values, existing indirect commands such as:

or additional commands as listed below. How the data is used is described in the next section.

The VkBindShaderGroupIndirectCommandNV structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV token.

typedef struct VkBindShaderGroupIndirectCommandNV {
    uint32_t    groupIndex;
} VkBindShaderGroupIndirectCommandNV;
  • index specifies which shader group of the current bound graphics pipeline is used.

Valid Usage
  • The current bound graphics pipeline, as well as the pipelines it may reference, must have been created with VK_PIPELINE_CREATE_INDIRECT_BINDABLE_BIT_NV

  • The index must be within range of the accessible shader groups of the current bound graphics pipeline. See vkCmdBindPipelineShaderGroupNV for further details.

The VkBindIndexBufferIndirectCommandNV structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV token.

typedef struct VkBindIndexBufferIndirectCommandNV {
    VkDeviceAddress    bufferAddress;
    uint32_t           size;
    VkIndexType        indexType;
} VkBindIndexBufferIndirectCommandNV;
  • bufferAddress specifies a physical address of the VkBuffer used as index buffer.

  • size is the byte size range which is available for this operation from the provided address.

  • indexType is a VkIndexType value specifying how indices are treated. Instead of the Vulkan enum values, a custom uint32_t value can be mapped to an VkIndexType by specifying the VkIndirectCommandsLayoutTokenNV::pIndexTypes and VkIndirectCommandsLayoutTokenNV::pIndexTypeValues arrays.

Valid Usage
  • The buffer’s usage flag from which the address was acquired must have the VK_BUFFER_USAGE_INDEX_BUFFER_BIT bit set.

  • The bufferAddress must be aligned to the indexType used.

  • Each element of the buffer from which the address was acquired and that is non-sparse must be bound completely and contiguously to a single VkDeviceMemory object

Valid Usage (Implicit)

The VkBindVertexBufferIndirectCommandNV structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV token.

typedef struct VkBindVertexBufferIndirectCommandNV {
    VkDeviceAddress    bufferAddress;
    uint32_t           size;
    uint32_t           stride;
} VkBindVertexBufferIndirectCommandNV;
  • bufferAddress specifies a physical address of the VkBuffer used as vertex input binding.

  • size is the byte size range which is available for this operation from the provided address.

  • stride is the byte size stride for this vertex input binding as in VkVertexInputBindingDescription::stride. It is only used if VkIndirectCommandsLayoutTokenNV::vertexDynamicStride was set, otherwise the stride is inherited from the current bound graphics pipeline.

Valid Usage
  • The buffer’s usage flag from which the address was acquired must have the VK_BUFFER_USAGE_VERTEX_BUFFER_BIT bit set.

  • Each element of the buffer from which the address was acquired and that is non-sparse must be bound completely and contiguously to a single VkDeviceMemory object

The VkSetStateFlagsIndirectCommandNV structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV token. Which state is changed depends on the VkIndirectStateFlagBitsNV specified at VkIndirectCommandsLayoutNV creation time.

typedef struct VkSetStateFlagsIndirectCommandNV {
    uint32_t    data;
} VkSetStateFlagsIndirectCommandNV;
  • data encodes packed state that this command alters.

    • Bit 0: If set represents VK_FRONT_FACE_CLOCKWISE, otherwise VK_FRONT_FACE_COUNTER_CLOCKWISE

A subset of the graphics pipeline state can be altered using indirect state flags:

typedef enum VkIndirectStateFlagBitsNV {
    VK_INDIRECT_STATE_FLAG_FRONTFACE_BIT_NV = 0x00000001,
    VK_INDIRECT_STATE_FLAG_BITS_MAX_ENUM_NV = 0x7FFFFFFF
} VkIndirectStateFlagBitsNV;
  • VK_INDIRECT_STATE_FLAG_FRONTFACE_BIT_NV allows to toggle the VkFrontFace rasterization state for subsequent draw operations.

typedef VkFlags VkIndirectStateFlagsNV;

VkIndirectStateFlagsNV is a bitmask type for setting a mask of zero or more VkIndirectStateFlagBitsNV.

30.1.3. Tokenized Command Processing

The processing is in principle illustrated below:

void cmdProcessSequence(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, s)
{
  for (t = 0; t < indirectCommandsLayout.tokenCount; t++)
  {
    uint32_t stream  = indirectCommandsLayout.pTokens[t].stream;
    uint32_t offset  = indirectCommandsLayout.pTokens[t].offset;
    uint32_t stride  = indirectCommandsLayout.pStreamStrides[stream];
    stream            = pIndirectCommandsStreams[stream];
    const void* input = stream.buffer.pointer( stream.offset + stride * s + offset )

    // further details later
    indirectCommandsLayout.pTokens[t].command (cmd, pipeline, input, s);
  }
}

void cmdProcessAllSequences(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, sequencesCount)
{
  for (s = 0; s < sequencesCount; s++)
  {
    cmdProcessSequence(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, s);
  }
}

The processing of each sequence is considered stateless, therefore all state changes must occur prior work provoking commands within the sequence. A single sequence is strictly targeting the VkPipelineBindPoint it was created with.

The primary input data for each token is provided through VkBuffer content at preprocessing using vkCmdPreprocessGeneratedCommandsNV or execution time using using vkCmdExecuteGeneratedCommandsNV, however some functional arguments, for example binding sets, are specified at layout creation time. The input size is different for each token.

Possible values of those elements of the VkIndirectCommandsLayoutCreateInfoNV::pTokens array which specify command tokens (other elements of the array specify command parameters) are:

typedef enum VkIndirectCommandsTokenTypeNV {
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV = 0,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV = 1,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV = 2,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV = 3,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV = 4,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV = 5,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV = 6,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_TASKS_NV = 7,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_MAX_ENUM_NV = 0x7FFFFFFF
} VkIndirectCommandsTokenTypeNV;
Table 39. Supported indirect command tokens
Token type Equivalent command

VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV

vkCmdBindPipelineShaderGroupNV

VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV

-

VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV

vkCmdBindIndexBuffer

VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV

vkCmdBindVertexBuffers

VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV

vkCmdPushConstants

VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV

vkCmdDrawIndexedIndirect

VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV

vkCmdDrawIndirect

VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_TASKS_NV

vkCmdDrawMeshTasksIndirectNV

The VkIndirectCommandsLayoutTokenNV structure specifies details to the function arguments that need to be known at layout creation time:

typedef struct VkIndirectCommandsLayoutTokenNV {
    VkStructureType                  sType;
    const void*                      pNext;
    VkIndirectCommandsTokenTypeNV    tokenType;
    uint32_t                         stream;
    uint32_t                         offset;
    uint32_t                         vertexBindingUnit;
    VkBool32                         vertexDynamicStride;
    VkPipelineLayout                 pushconstantPipelineLayout;
    VkShaderStageFlags               pushconstantShaderStageFlags;
    uint32_t                         pushconstantOffset;
    uint32_t                         pushconstantSize;
    VkIndirectStateFlagsNV           indirectStateFlags;
    uint32_t                         indexTypeCount;
    const VkIndexType*               pIndexTypes;
    const uint32_t*                  pIndexTypeValues;
} VkIndirectCommandsLayoutTokenNV;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • tokenType specifies the token command type.

  • stream is the index of the input stream that contains the token argument data.

  • offset is a relative starting offset within the input stream memory for the token argument data.

  • vertexBindingUnit is used for the vertex buffer binding command.

  • vertexDynamicStride sets if the vertex buffer stride is provided by the binding command rather than the current bound graphics pipeline state.

  • pushconstantPipelineLayout is the VkPipelineLayout used for the push constant command.

  • pushconstantShaderStageFlags are the shader stage flags used for the push constant command.

  • pushconstantOffset is the offset used for the push constant command.

  • pushconstantSize is the size used for the push constant command.

  • indirectStateFlags are the active states for the state flag command.

  • indexTypeCount is the optional size of the pIndexTypes and pIndexTypeValues array pairings. If not zero, it allows to register a custom uint32_t value to be treated as specific VkIndexType.

  • pIndexTypes is the used VkIndexType for the corresponding uint32_t value entry in pname::pIndexTypeValues.

Valid Usage
  • stream must be smaller than VkIndirectCommandsLayoutCreateInfoNV::streamCount.

  • offset must be less than or equal to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::maxIndirectCommandsTokenOffset.

  • vertexBindingUnit must stay within device supported limits for the appropriate commands.

  • pushconstantOffset must stay within device supported limits for the appropriate commands.

  • pushconstantSize must stay within device supported limits for the appropriate commands.

  • indirectStateFlags must not be ´0´.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_INDIRECT_COMMANDS_LAYOUT_TOKEN_NV

  • pNext must be NULL

  • tokenType must be a valid VkIndirectCommandsTokenTypeNV value

  • If pushconstantPipelineLayout is not VK_NULL_HANDLE, pushconstantPipelineLayout must be a valid VkPipelineLayout handle

  • pushconstantShaderStageFlags must be a valid combination of VkShaderStageFlagBits values

  • indirectStateFlags must be a valid combination of VkIndirectStateFlagBitsNV values

  • If indexTypeCount is not 0, pIndexTypes must be a valid pointer to an array of indexTypeCount valid VkIndexType values

  • If indexTypeCount is not 0, pIndexTypeValues must be a valid pointer to an array of indexTypeCount uint32_t values

The following code provides detailed information on how an individual sequence is processed. For valid usage, all restrictions from the regular commands apply.

void cmdProcessSequence(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, s)
{
  for (uint32_t t = 0; t < indirectCommandsLayout.tokenCount; t++){
    token = indirectCommandsLayout.pTokens[t];

    uint32_t stride   = indirectCommandsLayout.pStreamStrides[token.stream];
    stream            = pIndirectCommandsStreams[token.stream];
    uint32_t offset   = stream.offset + stride * s + token.offset;
    const void* input = stream.buffer.pointer( offset )

    switch(input.type){
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV:
      VkBindShaderGroupIndirectCommandNV* bind = input;

      vkCmdBindPipelineShaderGroupNV(cmd, indirectCommandsLayout.pipelineBindPoint,
        pipeline, bind->groupIndex);
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV:
      VkSetStateFlagsIndirectCommandNV* state = input;

      if (token.indirectStateFlags & VK_INDIRECT_STATE_FLAG_FRONTFACE_BIT_NV){
        if (state.data & (1 << 0)){
          set VK_FRONT_FACE_CLOCKWISE;
        } else {
          set VK_FRONT_FACE_COUNTER_CLOCKWISE;
        }
      }
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV:
      uint32_t* data = input;

      vkCmdPushConstants(cmd,
        token.pushconstantPipelineLayout
        token.pushconstantStageFlags,
        token.pushconstantOffset,
        token.pushconstantSize, data);
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV:
      VkBindIndexBufferIndirectCommandNV* data = input;

      // the indexType may optionally be remapped
      // from a custom uint32_t value, via
      // VkIndirectCommandsLayoutTokenNV::pIndexTypeValues

      vkCmdBindIndexBuffer(cmd,
        deriveBuffer(data->bufferAddress),
        deriveOffset(data->bufferAddress),
        data->indexType);
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV:
      VkBindVertexBufferIndirectCommandNV* data = input;

      // if token.vertexDynamicStride is VK_TRUE
      // then the stride for this binding is set
      // using data->stride as well

      vkCmdBindVertexBuffers(cmd,
        token.vertexBindingUnit, 1,
        &deriveBuffer(data->bufferAddress),
        &deriveOffset(data->bufferAddress));
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV:
      vkCmdDrawIndexedIndirect(cmd,
        stream.buffer, offset, 1, 0);
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV:
      vkCmdDrawIndirect(cmd,
        stream.buffer,
        offset, 1, 0);
    break;

    // only available if VK_NV_mesh_shader is supported
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DISPATCH_NV:
      vkCmdDrawMeshTasksIndirectNV(cmd,
        stream.buffer, offset, 1, 0);
    break;
    }
  }
}

30.2. Indirect Commands Generation And Execution

The generation of commands on the device requires a preprocess buffer. To retrieve the memory size and alignment requirements of a particular execution state call:

void vkGetGeneratedCommandsMemoryRequirementsNV(
    VkDevice                                    device,
    const VkGeneratedCommandsMemoryRequirementsInfoNV* pInfo,
    VkMemoryRequirements2*                      pMemoryRequirements);
  • device is the logical device that owns the buffer.

  • pInfo is a pointer to an instance of the VkGeneratedCommandsMemoryRequirementsInfoNV structure containing parameters required for the memory requirements query.

  • pMemoryRequirements points to an instance of the VkMemoryRequirements2 structure in which the memory requirements of the buffer object are returned.

Valid Usage (Implicit)
typedef struct VkGeneratedCommandsMemoryRequirementsInfoNV {
    VkStructureType               sType;
    const void*                   pNext;
    VkPipelineBindPoint           pipelineBindPoint;
    VkPipeline                    pipeline;
    VkIndirectCommandsLayoutNV    indirectCommandsLayout;
    uint32_t                      maxSequencesCount;
} VkGeneratedCommandsMemoryRequirementsInfoNV;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • pipelineBindPoint is the VkPipelineBindPoint of the pipeline that this buffer memory is intended to be used with during the execution.

  • pipeline is the slink::VkPipeline that this buffer memory is intended to be used with during the execution.

  • indirectCommandsLayout is the VkIndirectCommandsLayoutNV that this buffer memory is intended to be used with.

  • maxSequencesCount is the maximum number of sequences that this buffer memory in combination with the other state provided can be used with.

Valid Usage
Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_GENERATED_COMMANDS_MEMORY_REQUIREMENTS_INFO_NV

  • pNext must be NULL

The actual generation of commands as well as their execution on the device is handled as single action with:

void vkCmdExecuteGeneratedCommandsNV(
    VkCommandBuffer                             commandBuffer,
    VkBool32                                    isPreprocessed,
    const VkGeneratedCommandsInfoNV*            pGeneratedCommandsInfo);
  • commandBuffer is the command buffer into which the command is recorded.

  • isPreprocessed represents whether the input data has already been preprocessed on the device. If it is VK_FALSE this command will implicitly trigger the preprocessing step, otherwise not.

  • pGeneratedCommandsInfo is a pointer to an instance of the VkGeneratedCommandsInfoNV structure containing parameters affecting the generation of commands.

Valid Usage
  • If isPreprocessed is VK_TRUE then vkCmdPreprocessGeneratedCommandsNV must have already been executed on the device, using the same pGeneratedCommandsInfo content as well as the content of the input buffers it references (all except VkGeneratedCommandsInfoNV::preprocessBuffer). Furthermore pGeneratedCommandsInfo`s indirectCommandsLayout must have been created with the VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EXPLICIT_PREPROCESS_BIT_NV bit set.

  • VkGeneratedCommandsInfoNV::pipeline must match the current bound pipeline at VkGeneratedCommandsInfoNV::pipelineBindPoint.

  • Transform feedback must not be active.

  • The VkPhysicalDeviceDeviceGeneratedCommandsFeaturesNVdeviceGeneratedCommands feature must be enabled

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • pGeneratedCommandsInfo must be a valid pointer to a valid VkGeneratedCommandsInfoNV structure

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • This command must only be called inside of a render pass instance

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Inside

Graphics
Compute

typedef struct VkGeneratedCommandsInfoNV {
    VkStructureType                      sType;
    const void*                          pNext;
    VkPipelineBindPoint                  pipelineBindPoint;
    VkPipeline                           pipeline;
    VkIndirectCommandsLayoutNV           indirectCommandsLayout;
    uint32_t                             streamCount;
    const VkIndirectCommandsStreamNV*    pStreams;
    uint32_t                             sequencesCount;
    VkBuffer                             preprocessBuffer;
    VkDeviceSize                         preprocessOffset;
    VkDeviceSize                         preprocessSize;
    VkBuffer                             sequencesCountBuffer;
    VkDeviceSize                         sequencesCountOffset;
    VkBuffer                             sequencesIndexBuffer;
    VkDeviceSize                         sequencesIndexOffset;
} VkGeneratedCommandsInfoNV;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • pipelineBindPoint is the VkPipelineBindPoint used for the pipeline.

  • pipeline is the VkPipeline used in the generation and execution process.

  • indirectCommandsLayout is the VkIndirectCommandsLayoutNV that provides the command sequence to generate.

  • streamCount defines the number of input streams

  • pStreams provides an array of VkIndirectCommandsStreamNV that provide the input data for the tokens used in indirectCommandsLayout.

  • sequencesCount is the maximum number of sequences to reserve. If sequencesCountBuffer is VK_NULL_HANDLE, this is also the actual number of sequences generated.

  • preprocessBuffer is the VkBuffer that is used for preprocessing the input data for execution. If this structure is used with vkCmdExecuteGeneratedCommandsNV with its isPreprocessed set to VK_TRUE, then the preprocessing step is skipped and data is only read from this buffer.

  • preprocessOffset is the byte offset into preprocessBuffer where the preprocessed data is stored.

  • preprocessSize is the maximum byte size within the preprocessBuffer after the preprocessOffset that is available for preprocessing.

  • sequencesCountBuffer is a VkBuffer in which the actual number of sequences is provided as single uint32_t value.

  • sequencesCountOffset is the byte offset into sequencesCountBuffer where the count value is stored.

  • sequencesIndexBuffer is a VkBuffer that encodes the used sequence indices as uint32_t array.

  • sequencesIndexOffset is the byte offset into sequencesIndexBuffer where the index values start.

Valid Usage
  • The provided pipeline must match the pipeline bound at execution time.

  • If the indirectCommandsLayout uses a token of VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV, then the pipeline must have been created with multiple shader groups.

  • If the indirectCommandsLayout uses a token of VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV, then the pipeline must have been created with VK_PIPELINE_CREATE_INDIRECT_BINDABLE_BIT_NV set in VkGraphicsPipelineCreateInfo::flags.

  • If the indirectCommandsLayout uses a token of VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV, then the pipeline`s VkPipelineLayout must match the VkIndirectCommandsLayoutTokenNV::pushconstantPipelineLayout.

  • streamCount must match the indirectCommandsLayout’s streamCount

  • sequencesCount must be less or equal to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::maxIndirectSequenceCount and VkGeneratedCommandsMemoryRequirementsInfoNV::maxSequencesCount that was used to determine the preprocessSize.

  • preprocessBuffer must have the VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT bit set in its usage flag.

  • preprocessOffset must be aligned to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::minIndirectCommandsBufferOffsetAlignment

  • preprocessSize must be at least equal to the memory requirement`s size returned by vkGetGeneratedCommandsMemoryRequirementsNV using the matching inputs (indirectCommandsLayout, …​) as within this structure.

  • sequencesCountBuffer can be set if the actual used count of sequences is sourced from the provided buffer. In that case the sequencesCount serves as upper bound.

  • If sequencesCountBuffer is used, its usage flag must have the VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT bit set

  • If sequencesCountBuffer is used, sequencesCountOffset must be aligned to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::minSequencesCountBufferOffsetAlignment

  • sequencesIndexBuffer must be set if indirectCommandsLayout’s VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV is set, otherwise it must be VK_NULL_HANDLE.

  • If sequencesIndexBuffer is used, its usage flag must have the VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT bit set

  • If sequencesIndexBuffer is used, sequencesIndexOffset must be aligned to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::minSequencesIndexBufferOffsetAlignment

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_GENERATED_COMMANDS_INFO_NV

  • pNext must be NULL

  • pipelineBindPoint must be a valid VkPipelineBindPoint value

  • pipeline must be a valid VkPipeline handle

  • indirectCommandsLayout must be a valid VkIndirectCommandsLayoutNV handle

  • pStreams must be a valid pointer to an array of streamCount valid VkIndirectCommandsStreamNV structures

  • preprocessBuffer must be a valid VkBuffer handle

  • If sequencesCountBuffer is not VK_NULL_HANDLE, sequencesCountBuffer must be a valid VkBuffer handle

  • If sequencesIndexBuffer is not VK_NULL_HANDLE, sequencesIndexBuffer must be a valid VkBuffer handle

  • streamCount must be greater than 0

  • Each of indirectCommandsLayout, pipeline, preprocessBuffer, sequencesCountBuffer, and sequencesIndexBuffer that are valid handles of non-ignored parameters must have been created, allocated, or retrieved from the same VkDevice

Referencing the functions defined in Indirect Commands Layout, vkCmdExecuteGeneratedCommandsNV behaves as:

uint32_t sequencesCount = sequencesCountBuffer ?
      min(maxSequencesCount, sequencesCountBuffer.load_uint32(sequencesCountOffset) :
      maxSequencesCount;


cmdProcessAllSequences(commandBuffer, pipeline,
                       indirectCommandsLayout, pIndirectCommandsStreams,
                       sequencesCount,
                       sequencesIndexBuffer, sequencesIndexOffset);

// The stateful commands within indirectCommandsLayout will not
// affect the state of subsequent commands in the target
// command buffer (cmd)
Note

It is important to note that all state related to the pipelineBindPoint used is considered undefined after this command.

Commands can be preprocessed prior execution using the following command:

void vkCmdPreprocessGeneratedCommandsNV(
    VkCommandBuffer                             commandBuffer,
    const VkGeneratedCommandsInfoNV*            pGeneratedCommandsInfo);
  • commandBuffer is the command buffer which does the preprocessing.

  • pGeneratedCommandsInfo is a pointer to an instance of the VkGeneratedCommandsInfoNV structure containing parameters affecting the preprocessing step.

Valid Usage
Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • pGeneratedCommandsInfo must be a valid pointer to a valid VkGeneratedCommandsInfoNV structure

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • This command must only be called outside of a render pass instance

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Outside

Graphics
Compute