Appendix A: Vulkan Environment for SPIR-V
Shaders for Vulkan are defined by the Khronos SPIR-V Specification as well as the Khronos SPIR-V Extended Instructions for GLSL Specification. This appendix defines additional SPIR-V requirements applying to Vulkan shaders.
Versions and Formats
A Vulkan 1.2 implementation must support the 1.0, 1.1, 1.2, 1.3, 1.4, and 1.5 versions of SPIR-V and the 1.0 version of the SPIR-V Extended Instructions for GLSL.
A SPIR-V module passed into vkCreateShaderModule is interpreted as a series of 32-bit words in host endianness, with literal strings packed as described in section 2.2 of the SPIR-V Specification. The first few words of the SPIR-V module must be a magic number and a SPIR-V version number, as described in section 2.3 of the SPIR-V Specification.
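As an illustration of the layout described above, the following is a minimal sketch of how an application might inspect the first two words of a module before calling vkCreateShaderModule. The helper names read_spirv_header and supported_by_vulkan_1_2 are hypothetical and not part of any API. The sketch assumes the magic number 0x07230203 and the version word layout (major version in bits 16-23, minor version in bits 8-15) given in section 2.3 of the SPIR-V Specification.

```python
import struct

SPIRV_MAGIC = 0x07230203  # magic number, per section 2.3 of the SPIR-V Specification


def read_spirv_header(blob: bytes):
    """Return (major, minor) for a SPIR-V module stored in host byte order.

    Hypothetical helper for illustration only.
    """
    if len(blob) < 8 or len(blob) % 4 != 0:
        raise ValueError("a SPIR-V module is a whole number of 32-bit words")
    # "=" selects native (host) byte order with standard 32-bit word size.
    magic, version = struct.unpack_from("=II", blob, 0)
    if magic != SPIRV_MAGIC:
        raise ValueError(f"unexpected magic number 0x{magic:08x}")
    # Version word bytes, high to low: 0 | major | minor | 0.
    major = (version >> 16) & 0xFF
    minor = (version >> 8) & 0xFF
    return major, minor


def supported_by_vulkan_1_2(major: int, minor: int) -> bool:
    """True for the SPIR-V versions 1.0 through 1.5 listed above."""
    return major == 1 and 0 <= minor <= 5
```

A module whose version word is 0x00010500 would be reported as version 1.5 by this sketch.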
Capabilities
The table below lists the set of SPIR-V capabilities that may be supported in Vulkan implementations. The application must not use any of these capabilities in SPIR-V passed to vkCreateShaderModule unless one of the following conditions is met for the VkDevice specified in the device parameter of vkCreateShaderModule:
- The corresponding field in the table is blank.
- Any corresponding Vulkan feature is enabled.
- Any corresponding Vulkan extension is enabled.
- Any corresponding Vulkan property is supported.
- The corresponding core version is supported (as returned by VkPhysicalDeviceProperties::apiVersion).
The application must not pass a SPIR-V module containing any of the following to vkCreateShaderModule:
- any OpCapability not listed above,
- an unsupported capability, or
- a capability which corresponds to a Vulkan feature or extension which has not been enabled.
SPIR-V Extensions
The application can pass a SPIR-V module to vkCreateShaderModule that uses the following SPIR-V extensions if one of the following conditions is met for the VkDevice specified in the device parameter of vkCreateShaderModule:
- Any corresponding Vulkan extension is enabled.
- The corresponding core version is supported (as returned by VkPhysicalDeviceProperties::apiVersion).
SPIR-V OpExtension Vulkan extension or core version |
---|
Validation Rules within a Module
A SPIR-V module passed to vkCreateShaderModule must conform to the following rules:
Standalone SPIR-V Validation
The following rules can be validated with only the SPIR-V module itself. They do not depend on knowledge of the implementation and its capabilities or knowledge of runtime information, such as enabled features.
Runtime SPIR-V Validation
The following rules must be validated at runtime. These rules depend on knowledge of the implementation and its capabilities and knowledge of runtime information, such as enabled features.
- If vulkanMemoryModel is enabled and vulkanMemoryModelDeviceScope is not enabled, Device memory scope must not be used.
- If vulkanMemoryModel is not enabled, Device memory scope only extends to the queue family, not the whole device.
- If vulkanMemoryModel is not enabled, QueueFamily memory scope must not be used.
- If shaderSubgroupClock is not enabled, the Subgroup scope must not be used for OpReadClockKHR.
- If shaderDeviceClock is not enabled, the Device scope must not be used for OpReadClockKHR.
- The converted bit width, signedness, and numeric type of the Image Format operand of an OpTypeImage must match the Sampled Type, as defined in Image Format and Type Matching.
- The Result Type operand of OpImageRead must be a vector of four components.
- If shaderStorageImageWriteWithoutFormat is not enabled and an OpTypeImage has “Image Format” operand of Unknown, any variables created with the given type must be decorated with NonWritable.
- If shaderStorageImageReadWithoutFormat is not enabled and an OpTypeImage has “Image Format” operand of Unknown, any variables created with the given type must be decorated with NonReadable.
- Any BuiltIn decoration that corresponds only to Vulkan features or extensions that have not been enabled must not be used.
- OpTypeRuntimeArray must only be used for an array of variables with storage class Uniform, StorageBuffer, or UniformConstant, or for the outermost dimension of an array of arrays of such variables if the runtimeDescriptorArray feature is enabled.
- If an instruction loads from or stores to a resource (including atomics and image instructions) and the resource descriptor being accessed is not dynamically uniform, then the operand corresponding to that resource (e.g. the pointer or sampled image operand) must be decorated with NonUniform.
- “Result Type” for Non Uniform Group Operations must be limited to 32-bit floating-point, 32-bit integer, boolean, or vectors of these types.
  - If the Float64 capability is enabled, 64-bit floating-point and vector of 64-bit floating-point types are also permitted.
  - If the Int8 capability is enabled and the shaderSubgroupExtendedTypes feature is VK_TRUE, 8-bit integer and vector of 8-bit integer types are also permitted.
  - If the Int16 capability is enabled and the shaderSubgroupExtendedTypes feature is VK_TRUE, 16-bit integer and vector of 16-bit integer types are also permitted.
  - If the Int64 capability is enabled and the shaderSubgroupExtendedTypes feature is VK_TRUE, 64-bit integer and vector of 64-bit integer types are also permitted.
  - If the Float16 capability is enabled and the shaderSubgroupExtendedTypes feature is VK_TRUE, 16-bit floating-point and vector of 16-bit floating-point types are also permitted.
- If subgroupBroadcastDynamicId is VK_TRUE, and the shader module version is 1.5 or higher, the “Index” for OpGroupNonUniformQuadBroadcast must be dynamically uniform within the derivative group. Otherwise, “Index” must be a constant.
- If subgroupBroadcastDynamicId is VK_TRUE, and the shader module version is 1.5 or higher, the “Id” for OpGroupNonUniformBroadcast must be dynamically uniform within the subgroup. Otherwise, “Id” must be a constant.
- shaderBufferInt64Atomics must be enabled for 64-bit integer atomic operations to be supported on a Pointer with a Storage Class of StorageBuffer or Uniform.
- shaderSharedInt64Atomics must be enabled for 64-bit integer atomic operations to be supported on a Pointer with a Storage Class of Workgroup.
- shaderBufferFloat32Atomics or shaderBufferFloat32AtomicAdd or shaderBufferFloat64Atomics or shaderBufferFloat64AtomicAdd must be enabled for floating-point atomic operations to be supported on a Pointer with a Storage Class of StorageBuffer.
- shaderSharedFloat32Atomics or shaderSharedFloat32AtomicAdd or shaderSharedFloat64Atomics or shaderSharedFloat64AtomicAdd must be enabled for floating-point atomic operations to be supported on a Pointer with a Storage Class of Workgroup.
- shaderImageFloat32Atomics or shaderImageFloat32AtomicAdd must be enabled for 32-bit floating-point atomic operations to be supported on a Pointer with a Storage Class of Image.
- sparseImageFloat32Atomics or sparseImageFloat32AtomicAdd must be enabled for 32-bit floating-point atomics to be supported on sparse images.
- shaderImageInt64Atomics must be enabled for 64-bit integer atomic operations to be supported on a Pointer with a Storage Class of Image.
- If denormBehaviorIndependence is VK_SHADER_FLOAT_CONTROLS_INDEPENDENCE_32_BIT_ONLY, then the entry point must use the same denormals execution mode for both 16-bit and 64-bit floating-point types.
- If denormBehaviorIndependence is VK_SHADER_FLOAT_CONTROLS_INDEPENDENCE_NONE, then the entry point must use the same denormals execution mode for all floating-point types.
- If roundingModeIndependence is VK_SHADER_FLOAT_CONTROLS_INDEPENDENCE_32_BIT_ONLY, then the entry point must use the same rounding execution mode for both 16-bit and 64-bit floating-point types.
- If roundingModeIndependence is VK_SHADER_FLOAT_CONTROLS_INDEPENDENCE_NONE, then the entry point must use the same rounding execution mode for all floating-point types.
- If shaderSignedZeroInfNanPreserveFloat16 is VK_FALSE, then SignedZeroInfNanPreserve for 16-bit floating-point type must not be used.
- If shaderSignedZeroInfNanPreserveFloat32 is VK_FALSE, then SignedZeroInfNanPreserve for 32-bit floating-point type must not be used.
- If shaderSignedZeroInfNanPreserveFloat64 is VK_FALSE, then SignedZeroInfNanPreserve for 64-bit floating-point type must not be used.
- If shaderDenormPreserveFloat16 is VK_FALSE, then DenormPreserve for 16-bit floating-point type must not be used.
- If shaderDenormPreserveFloat32 is VK_FALSE, then DenormPreserve for 32-bit floating-point type must not be used.
- If shaderDenormPreserveFloat64 is VK_FALSE, then DenormPreserve for 64-bit floating-point type must not be used.
- If shaderDenormFlushToZeroFloat16 is VK_FALSE, then DenormFlushToZero for 16-bit floating-point type must not be used.
- If shaderDenormFlushToZeroFloat32 is VK_FALSE, then DenormFlushToZero for 32-bit floating-point type must not be used.
- If shaderDenormFlushToZeroFloat64 is VK_FALSE, then DenormFlushToZero for 64-bit floating-point type must not be used.
- If shaderRoundingModeRTEFloat16 is VK_FALSE, then RoundingModeRTE for 16-bit floating-point type must not be used.
- If shaderRoundingModeRTEFloat32 is VK_FALSE, then RoundingModeRTE for 32-bit floating-point type must not be used.
- If shaderRoundingModeRTEFloat64 is VK_FALSE, then RoundingModeRTE for 64-bit floating-point type must not be used.
- If shaderRoundingModeRTZFloat16 is VK_FALSE, then RoundingModeRTZ for 16-bit floating-point type must not be used.
- If shaderRoundingModeRTZFloat32 is VK_FALSE, then RoundingModeRTZ for 32-bit floating-point type must not be used.
- If shaderRoundingModeRTZFloat64 is VK_FALSE, then RoundingModeRTZ for 64-bit floating-point type must not be used.
- The Offset plus size of the type of each variable, in the output interface of the entry point being compiled, decorated with XfbBuffer must not be greater than VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackBufferDataSize.
- For any given XfbBuffer value, define the buffer data size to be the smallest number of bytes such that, for all outputs decorated with the same XfbBuffer value, the size of the output interface variable plus the Offset is less than or equal to the buffer data size. For a given Stream, the sum of all the buffer data sizes for all buffers writing to that stream must not exceed VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackStreamDataSize.
- The Stream value to OpEmitStreamVertex and OpEndStreamPrimitive must be less than VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackStreams.
- If the geometry shader emits to more than one vertex stream and VkPhysicalDeviceTransformFeedbackPropertiesEXT::transformFeedbackStreamsLinesTriangles is VK_FALSE, then execution mode must be OutputPoints.
- The stream number value to Stream must be less than VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackStreams.
- The XFB Stride value to XfbStride must be less than or equal to VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackBufferDataStride.
- If the PhysicalStorageBuffer64 addressing model is enabled, any load or store through a physical pointer type must be aligned to a multiple of the size of the largest scalar type in the pointed-to type.
- If the PhysicalStorageBuffer64 addressing model is enabled, the pointer value of a memory access instruction must be at least as aligned as specified by the Aligned memory access operand.
- For OpTypeCooperativeMatrixNV, the component type, scope, number of rows, and number of columns must match one of the matrices in any of the supported VkCooperativeMatrixPropertiesNV.
- For OpCooperativeMatrixMulAddNV, the Result, A, B, and C matrices must all have types that satisfy the same supported VkCooperativeMatrixPropertiesNV. That is, for one supported VkCooperativeMatrixPropertiesNV, all of the following must hold:
  - The type of A must have MSize rows and KSize columns and have a component type that matches AType.
  - The type of B must have KSize rows and NSize columns and have a component type that matches BType.
  - The type of C must have MSize rows and NSize columns and have a component type that matches CType.
  - The type of Result must have MSize rows and NSize columns and have a component type that matches DType.
  - The type of A, B, C, and Result must all have a scope of scope.
- OpTypeCooperativeMatrixNV and OpCooperativeMatrix* instructions must not be used in shader stages not included in VkPhysicalDeviceCooperativeMatrixPropertiesNV::cooperativeMatrixSupportedStages.
- DescriptorSet and Binding decorations must obey the constraints on storage class, type, and descriptor type described in DescriptorSet and Binding Assignment.
- For OpCooperativeMatrixLoadNV and OpCooperativeMatrixStoreNV instructions, the Pointer and Stride operands must be aligned to at least the lesser of 16 bytes or the natural alignment of a row or column (depending on ColumnMajor) of the matrix (where the natural alignment is the number of columns/rows multiplied by the component size).
- For compute shaders using the DerivativeGroupLinearNV execution mode, the product of the dimensions of the local workgroup size must be a multiple of four.
- If the VK_KHR_portability_subset extension is enabled, and VkPhysicalDevicePortabilitySubsetFeaturesKHR::shaderSampleRateInterpolationFunctions is VK_FALSE, then GLSL.std.450 fragment interpolation functions are not supported by the implementation and OpCapability must not be set to InterpolationFunction.
- If tessellationShader is enabled, and the VK_KHR_portability_subset extension is enabled, and VkPhysicalDevicePortabilitySubsetFeaturesKHR::tessellationIsolines is VK_FALSE, then OpExecutionMode must not be set to IsoLines.
- If tessellationShader is enabled, and the VK_KHR_portability_subset extension is enabled, and VkPhysicalDevicePortabilitySubsetFeaturesKHR::tessellationPointMode is VK_FALSE, then OpExecutionMode must not be set to PointMode.
- If storageBuffer8BitAccess is VK_FALSE, then objects containing an 8-bit integer element must not have storage class of StorageBuffer, ShaderRecordBufferKHR, or PhysicalStorageBuffer.
- If uniformAndStorageBuffer8BitAccess is VK_FALSE, then objects in the Uniform storage class with the Block decoration and in the StorageBuffer, ShaderRecordBufferKHR, or PhysicalStorageBuffer storage class with the same decoration must not have an 8-bit integer member.
- If storagePushConstant8 is VK_FALSE, then objects containing an 8-bit integer element must not have storage class of PushConstant.
- If storageBuffer16BitAccess is VK_FALSE, then objects containing a 16-bit integer element must not have storage class of StorageBuffer, ShaderRecordBufferKHR, or PhysicalStorageBuffer.
- If uniformAndStorageBuffer16BitAccess is VK_FALSE, then objects in the Uniform storage class with the Block decoration and in the StorageBuffer, ShaderRecordBufferKHR, or PhysicalStorageBuffer storage class with the same decoration must not have a 16-bit integer member.
- If storagePushConstant16 is VK_FALSE, then objects containing a 16-bit integer element must not have storage class of PushConstant.
- If storageInputOutput16 is VK_FALSE, then objects containing a 16-bit integer element must not have storage class of Input or Output.
- Atomic instructions must declare a scalar 32-bit integer type, or a scalar 32-bit floating-point type if the shaderBufferFloat32Atomics or shaderBufferFloat32AtomicAdd or shaderSharedFloat32Atomics or shaderSharedFloat32AtomicAdd or shaderImageFloat32Atomics or shaderImageFloat32AtomicAdd or sparseImageFloat32Atomics or sparseImageFloat32AtomicAdd is enabled, or a scalar 64-bit floating-point type if the shaderBufferFloat64Atomics or shaderBufferFloat64AtomicAdd or shaderSharedFloat64Atomics or shaderSharedFloat64AtomicAdd is enabled, or a scalar 64-bit integer type if the Int64Atomics capability is enabled, for the value pointed to by Pointer.
- If fragmentStoresAndAtomics is not enabled, then all storage image, storage texel buffer, and storage buffer variables in the fragment stage must be decorated with the NonWritable decoration.
- If vertexPipelineStoresAndAtomics is not enabled, then all storage image, storage texel buffer, and storage buffer variables in the vertex, tessellation, and geometry stages must be decorated with the NonWritable decoration.
- If subgroupQuadOperationsInAllStages is VK_FALSE, then quad subgroup operations must not be used except for in fragment and compute stages.
- Group operations with subgroup scope must not be used if the shader stage is not in subgroupSupportedStages.
- The first element of the Offset operand of InterpolateAtOffset must be greater than or equal to:
  - fragwidth × minInterpolationOffset
    where fragwidth is the width of the current fragment in pixels.
- The first element of the Offset operand of InterpolateAtOffset must be less than or equal to:
  - fragwidth × (maxInterpolationOffset + ULP) - ULP
    where fragwidth is the width of the current fragment in pixels and ULP = 1 / 2^subPixelInterpolationOffsetBits.
- The second element of the Offset operand of InterpolateAtOffset must be greater than or equal to:
  - fragheight × minInterpolationOffset
    where fragheight is the height of the current fragment in pixels.
- The second element of the Offset operand of InterpolateAtOffset must be less than or equal to:
  - fragheight × (maxInterpolationOffset + ULP) - ULP
    where fragheight is the height of the current fragment in pixels and ULP = 1 / 2^subPixelInterpolationOffsetBits.
- For OpRayQueryInitializeKHR instructions:
  - All components of the Ray Origin and Ray Direction operands must be finite floating-point values.
  - The Ray Tmin and Ray Tmax operands must be non-negative floating-point values.
  - The Ray Tmin operand must be less than or equal to the Ray Tmax operand.
  - The above operands must not contain NaNs.
  - Acceleration Structure must be an acceleration structure built as a top-level acceleration structure.
- For OpRayQueryGenerateIntersectionKHR instructions:
  - Hit T must satisfy the condition Ray Tmin ≤ Hit T ≤ Ray Tmax, where Ray Tmin is equal to the value returned by OpRayQueryGetRayTMinKHR with the same ray query object, and Ray Tmax is equal to the value of OpRayQueryGetIntersectionTKHR for the current committed intersection with the same ray query object.
- For OpTraceRayKHR instructions:
  - All components of the Ray Origin and Ray Direction operands must be finite floating-point values.
  - The Ray Tmin and Ray Tmax operands must be non-negative floating-point values.
  - The Ray Tmin operand must be less than or equal to the Ray Tmax operand.
  - The above operands must not contain NaNs.
  - Acceleration Structure must be an acceleration structure built as a top-level acceleration structure.
- The x size in LocalSize must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupSize[0].
- The y size in LocalSize must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupSize[1].
- The z size in LocalSize must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupSize[2].
- The product of x size, y size, and z size in LocalSize must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupInvocations.
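Two of the purely numeric rules in the list above, the LocalSize limits and the InterpolateAtOffset bounds, can be sketched as follows. This is an illustrative sketch only: the function names and the tuple-based parameters are hypothetical stand-ins for values an application would read from VkPhysicalDeviceLimits, and the limit values used in the usage note are example numbers, not values required of any implementation.

```python
from fractions import Fraction


def local_size_ok(local_size, max_size, max_invocations):
    """Check a LocalSize (x, y, z) against per-dimension and total limits.

    max_size stands in for maxComputeWorkGroupSize[0..2] and max_invocations
    for maxComputeWorkGroupInvocations (hypothetical parameter names).
    """
    x, y, z = local_size
    return (x <= max_size[0] and y <= max_size[1] and z <= max_size[2]
            and x * y * z <= max_invocations)


def interpolation_offset_range(frag_extent, min_offset, max_offset, sub_pixel_bits):
    """Inclusive (low, high) bounds for one element of the Offset operand.

    frag_extent is fragwidth for the first element and fragheight for the
    second. Exact rational arithmetic avoids rounding in the sketch itself.
    """
    ulp = Fraction(1, 2 ** sub_pixel_bits)  # ULP = 1 / 2^subPixelInterpolationOffsetBits
    low = frag_extent * Fraction(min_offset)
    high = frag_extent * (Fraction(max_offset) + ulp) - ulp
    return low, high
```

For example, with example limits of (1024, 1024, 64) and 1024 invocations, a LocalSize of (8, 8, 4) passes while (32, 32, 2) fails on the product rule. With minInterpolationOffset = -0.5, maxInterpolationOffset = 0.4375, and 4 sub-pixel bits, a one-pixel-wide fragment gives the range [-1/2, 7/16].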
Precision and Operation of SPIR-V Instructions
The following rules apply to half, single, and double-precision floating point instructions:
- Positive and negative infinities and positive and negative zeros are generated as dictated by IEEE 754, but subject to the precisions allowed in the following table.
- Dividing a non-zero by a zero results in the appropriately signed IEEE 754 infinity.
- Signaling NaNs are not required to be generated and exceptions are never raised. Signaling NaNs may be converted to quiet NaN values by any floating point instruction.
- By default, the implementation may perform optimizations on half, single, or double-precision floating-point instructions that ignore sign of a zero, or assume that arguments and results are not NaNs or infinities. If the entry point is declared with the SignedZeroInfNanPreserve execution mode, then NaNs, infinities, and the sign of zero must not be ignored.
  - The following core SPIR-V instructions must respect the SignedZeroInfNanPreserve execution mode: OpPhi, OpSelect, OpReturnValue, OpVectorExtractDynamic, OpVectorInsertDynamic, OpVectorShuffle, OpCompositeConstruct, OpCompositeExtract, OpCompositeInsert, OpCopyObject, OpTranspose, OpFConvert, OpFNegate, OpFAdd, OpFSub, OpFMul, OpStore. This execution mode must also be respected by OpLoad except for loads from the Input storage class in the fragment shader stage with the floating-point result type. Other SPIR-V instructions may also respect the SignedZeroInfNanPreserve execution mode.
- The following instructions must not flush denormalized values: OpConstant, OpConstantComposite, OpSpecConstant, OpSpecConstantComposite, OpLoad, OpStore, OpBitcast, OpPhi, OpSelect, OpFunctionCall, OpReturnValue, OpVectorExtractDynamic, OpVectorInsertDynamic, OpVectorShuffle, OpCompositeConstruct, OpCompositeExtract, OpCompositeInsert, OpCopyMemory, OpCopyObject.
- Denormalized values are supported.
- By default, any half, single, or double-precision denormalized value input into a shader or potentially generated by any instruction (except those listed above) or any extended instructions for GLSL in a shader may be flushed to zero.
- If the entry point is declared with the DenormFlushToZero execution mode then for the affected instructions the denormalized result must be flushed to zero and the denormalized operands may be flushed to zero. Denormalized values obtained via unpacking an integer into a vector of values with smaller bit width and interpreting those values as floating-point numbers must be flushed to zero.
- The following core SPIR-V instructions must respect the DenormFlushToZero execution mode: OpSpecConstantOp (with opcode OpFConvert), OpFConvert, OpFNegate, OpFAdd, OpFSub, OpFMul, OpFDiv, OpFRem, OpFMod, OpVectorTimesScalar, OpMatrixTimesScalar, OpVectorTimesMatrix, OpMatrixTimesVector, OpMatrixTimesMatrix, OpOuterProduct, OpDot; and the following extended instructions for GLSL: Round, RoundEven, Trunc, FAbs, Floor, Ceil, Fract, Radians, Degrees, Sin, Cos, Tan, Asin, Acos, Atan, Sinh, Cosh, Tanh, Asinh, Acosh, Atanh, Atan2, Pow, Exp, Log, Exp2, Log2, Sqrt, InverseSqrt, Determinant, MatrixInverse, Modf, ModfStruct, FMin, FMax, FClamp, FMix, Step, SmoothStep, Fma, UnpackHalf2x16, UnpackDouble2x32, Length, Distance, Cross, Normalize, FaceForward, Reflect, Refract, NMin, NMax, NClamp. Other SPIR-V instructions (except those excluded above) may also flush denormalized values.
- The following core SPIR-V instructions must respect the DenormPreserve execution mode: OpTranspose, OpSpecConstantOp, OpFConvert, OpFNegate, OpFAdd, OpFSub, OpFMul, OpVectorTimesScalar, OpMatrixTimesScalar, OpVectorTimesMatrix, OpMatrixTimesVector, OpMatrixTimesMatrix, OpOuterProduct, OpDot, OpFOrdEqual, OpFUnordEqual, OpFOrdNotEqual, OpFUnordNotEqual, OpFOrdLessThan, OpFUnordLessThan, OpFOrdGreaterThan, OpFUnordGreaterThan, OpFOrdLessThanEqual, OpFUnordLessThanEqual, OpFOrdGreaterThanEqual, OpFUnordGreaterThanEqual; and the following extended instructions for GLSL: FAbs, FSign, Radians, Degrees, FMin, FMax, FClamp, FMix, Fma, PackHalf2x16, PackDouble2x32, UnpackHalf2x16, UnpackDouble2x32, NMin, NMax, NClamp. Other SPIR-V instructions may also preserve denorm values.
- The precision of double-precision instructions is at least that of single precision.
The precision of operations is defined either in terms of rounding, as an error bound in ULP, or as inherited from a formula as follows.
Operations described as “correctly rounded” will return the infinitely precise result, x, rounded so as to be representable in floating-point. The rounding mode is not specified, unless the entry point is declared with the RoundingModeRTE or the RoundingModeRTZ execution mode. These execution modes affect only correctly rounded SPIR-V instructions. These execution modes do not affect OpQuantizeToF16. If the rounding mode is not specified then this rounding is implementation specific, subject to the following rules. If x is exactly representable then x will be returned. Otherwise, either the floating-point value closest to and no less than x or the value closest to and no greater than x will be returned.
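To make the unspecified-rounding-mode rule concrete, the following sketch lists the single-precision values a correctly rounded operation may return for a given infinitely precise result. It is an illustration under stated assumptions: the function name permitted_f32_roundings is hypothetical, x is approximated by a Python double, and only positive finite x no larger than the largest finite 32-bit float is handled.

```python
import struct


def _f32_bits(value):
    """Bit pattern of the float32 nearest to value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def _bits_f32(bits):
    """float32 value (as a Python float) for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def permitted_f32_roundings(x):
    """Set of float32 results allowed when no rounding mode is specified.

    Assumes x is positive, finite, and within the finite float32 range.
    """
    nearest = _bits_f32(_f32_bits(x))
    if nearest == x:
        # Exactly representable: x itself must be returned.
        return {x}
    bits = _f32_bits(nearest)
    # For positive floats, adjacent bit patterns are adjacent values.
    other = _bits_f32(bits + 1) if nearest < x else _bits_f32(bits - 1)
    return {nearest, other}
```

For an exactly representable input such as 1.0 the set contains only that value; for 0.1 it contains the two neighbouring 32-bit floats that bracket 0.1.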
Where an error bound of n ULP (units in the last place) is given, for an operation with infinitely precise result x the value returned must be in the range [x - n × ulp(x), x + n × ulp(x)]. The function ulp(x) is defined as follows:
- If there exist non-equal floating-point numbers a and b such that a ≤ x ≤ b then ulp(x) is the minimum possible distance between such numbers, ulp(x) = min_{a,b} |b - a|. If such numbers do not exist then ulp(x) is defined to be the difference between the two finite floating-point numbers nearest to x.
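The definition of ulp(x) and the n ULP error bound can be sketched for 32-bit floats as below. This is a minimal sketch, not a conformance test: the names ulp32 and within_n_ulp are hypothetical, the infinitely precise result x is approximated by a Python double, and the fallback case where no bracketing floats exist is omitted, so |x| is assumed not to exceed the largest finite 32-bit float.

```python
import struct


def _to_f32(value):
    """Round a Python float to the nearest float32 (returned as a float)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def _step(f32_value, delta):
    """Adjacent float32 of a non-negative finite float32; delta is +1 or -1."""
    bits = struct.unpack("<I", struct.pack("<f", f32_value))[0]
    return struct.unpack("<f", struct.pack("<I", bits + delta))[0]


def ulp32(x):
    """Smallest |b - a| over float32 values a != b with a <= x <= b."""
    m = abs(x)  # the float32 grid is symmetric about zero
    r = _to_f32(m)
    if r < m:  # x lies strictly between r and the next float32 up
        return _step(r, +1) - r
    if r > m:  # x lies strictly between the next float32 down and r
        return r - _step(r, -1)
    # x is itself a float32: take the smaller gap to its two neighbours.
    gaps = [_step(r, +1) - r]
    if r > 0.0:
        gaps.append(r - _step(r, -1))
    return min(gaps)


def within_n_ulp(result, x, n):
    """True if result lies in [x - n * ulp(x), x + n * ulp(x)]."""
    return abs(result - x) <= n * ulp32(x)
```

Note that at a power of two such as 1.0 the gap below is half the gap above, so this definition yields the smaller spacing there.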
Where the range of allowed return values includes any value of magnitude larger than that of the largest representable finite floating-point number, operations may, additionally, return either an infinity of the appropriate sign or the finite number with the largest magnitude of the appropriate sign. If the infinitely precise result of the operation is not mathematically defined then the value returned is undefined.
Where an operation’s precision is described as being inherited from a formula, the result returned must be at least as accurate as the result of computing an approximation to x using a formula equivalent to the given formula applied to the supplied inputs. Specifically, the formula given may be transformed using the mathematical associativity, commutativity and distributivity of the operators involved to yield an equivalent formula. The SPIR-V precision rules, when applied to each such formula and the given input values, define a range of permitted values. If NaN is one of the permitted values then the operation may return any result, otherwise let the largest permitted value in any of the ranges be Fmax and the smallest be Fmin. The operation must return a value in the range [x - E, x + E] where E = max(|x - Fmin|, |x - Fmax|). If the entry point is declared with the DenormFlushToZero execution mode, then any intermediate denormal value(s) while evaluating the formula may be flushed to zero. Denormal final results must be flushed to zero. If the entry point is declared with the DenormPreserve execution mode, then denormals must be preserved throughout the formula.
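The inherited-precision rule can be illustrated with a short sketch. It assumes the error bound E = max(|x - Fmin|, |x - Fmax|), and it takes the permitted ranges of the equivalent formulas as already-computed inputs, since deriving them depends on the per-instruction precisions. The function names are hypothetical.

```python
import math


def inherited_error_bound(x, permitted_ranges):
    """E for infinitely precise result x, given (low, high) permitted ranges,
    one per equivalent formula."""
    f_min = min(low for low, _ in permitted_ranges)
    f_max = max(high for _, high in permitted_ranges)
    return max(abs(x - f_min), abs(x - f_max))


def result_allowed(result, x, permitted_ranges):
    """True if result satisfies the inherited-precision rule."""
    # If NaN is a permitted value, the operation may return any result.
    if any(math.isnan(v) for rng in permitted_ranges for v in rng):
        return True
    e = inherited_error_bound(x, permitted_ranges)
    return x - e <= result <= x + e
```

With x = 1.0 and permitted ranges (0.75, 1.125) and (0.875, 1.5), Fmin is 0.75 and Fmax is 1.5, so E is 0.5 and any result in [0.5, 1.5] is accepted. The accepted interval is symmetric about x, so it can extend below Fmin.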
For half- (16 bit) and single- (32 bit) precision instructions, precisions are required to be at least as follows:
Instruction | Single precision, unless decorated with RelaxedPrecision | Half precision |
---|---|---|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Inherited from . |
|
|
Correct result. |
|
|
Correct result. |
|
|
Correct result. |
|
|
Correct result. |
|
|
Correct result. |
|
|
2.5 ULP for |y| in the range [2^-126, 2^126]. |
2.5 ULP for |y| in the range [2^-14, 2^14]. |
|
Inherited from x - y × trunc(x/y). |
|
|
Inherited from x - y × floor(x/y). |
|
conversions between types |
Correctly rounded. |
Note
The |
Instruction | Single precision, unless decorated with RelaxedPrecision | Half precision |
---|---|---|
|
Inherited from |
|
|
ULP. |
ULP. |
|
3 ULP outside the range . Absolute error < inside the range . |
3 ULP outside the range . Absolute error < inside the range . |
|
Inherited from |
|
|
Inherited from 1.0 / |
|
|
2 ULP. |
|
|
Inherited from . |
|
|
Inherited from . |
|
|
Absolute error inside the range . |
Absolute error inside the range . |
|
Absolute error inside the range . |
Absolute error inside the range . |
|
Inherited from . |
|
|
Inherited from . |
|
|
Inherited from . |
|
|
4096 ULP |
5 ULP. |
|
Inherited from . |
|
|
Inherited from . |
|
|
Inherited from . |
|
|
Inherited from . |
|
|
Inherited from . |
|
|
Inherited from . |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Inherited from . |
|
|
Inherited from . |
|
|
Inherited from |
|
|
Inherited from . |
|
|
Inherited from |
|
|
Inherited from x - 2.0 × |
|
|
Inherited from k < 0.0 ? 0.0 : eta × I - (eta × |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Inherited from . |
|
|
Correctly rounded. |
|
|
Inherited from , where . |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
|
|
Correctly rounded. |
GLSL.std.450 extended instructions specifically defined in terms of the above instructions inherit the above errors. GLSL.std.450 extended instructions not listed above and not defined in terms of the above have undefined precision.
For the OpSRem and OpSMod instructions, if either operand is negative the result is undefined.
Note
While the |
OpCooperativeMatrixMulAddNV performs its operations in an implementation-dependent order and internal precision.
Image Format and Type Matching
When specifying the Image Format as anything other than Unknown, the converted bit width, type, and signedness as shown in the table below, must match the Sampled Type.
Note
Formatted accesses are always converted from a shader readable type to the resource’s format or vice versa via Format Conversion for reads and Texel Output Format Conversion for writes. As such, the bit width and format below do not necessarily match 1:1 with what might be expected for some formats.
For a given Image Format, the Sampled Type must be the type described in the Type column of the below table, with its Literal Width set to that in the Bit Width column, and its Literal Signedness to that in the Signedness column (where applicable).
Image Format | Type | Bit Width | Signedness |
---|---|---|---|
|
Any |
Any |
Any |
|
|
32 |
N/A |
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|
32 |
1 |
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
0 |
||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
Compatibility Between SPIR-V Image Formats And Vulkan Formats
SPIR-V Image Format values are compatible with VkFormat values as defined below:
SPIR-V Image Format | Compatible Vulkan Format |
---|---|
|
Any |