Khronos Data Format Specification

Khronos Data Format Specification License Information

Copyright (C) 2014-2019 The Khronos Group Inc. All Rights Reserved.

This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part.

This version of the Data Format Specification is published and copyrighted by Khronos, but is not a Khronos ratified specification. Accordingly, it does not fall within the scope of the Khronos IP policy, except to the extent that sections of it are normatively referenced in ratified Khronos specifications. Such references incorporate the referenced sections into the ratified specifications, and bring those sections into the scope of the policy for those specifications.

Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be reformatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current version of this specification on the Khronos Group website should be included whenever possible with specification distributions.

Khronos Group makes no, and expressly disclaims any, representations or warranties, express or implied, regarding this specification, including, without limitation, any implied warranties of merchantability or fitness for a particular purpose or non-infringement of any intellectual property. Khronos Group makes no, and expressly disclaims any, warranties, express or implied, regarding the correctness, accuracy, completeness, timeliness, and reliability of the specification. Under no circumstances will the Khronos Group, or any of its Promoters, Contributors or Members or their respective partners, officers, directors, employees, agents, or representatives be liable for any damages, whether direct, indirect, special or consequential damages for lost revenues, lost profits, or otherwise, arising from or in connection with these materials.

Khronos, SYCL, SPIR, WebGL, EGL, COLLADA, StreamInput, OpenVX, OpenKCam, glTF, OpenKODE, OpenVG, OpenWF, OpenSL ES, OpenMAX, OpenMAX AL, OpenMAX IL and OpenMAX DL are trademarks and WebCL is a certification mark of the Khronos Group Inc. OpenCL is a trademark of Apple Inc. and OpenGL and OpenML are registered trademarks and the OpenGL ES and OpenGL SC logos are trademarks of Silicon Graphics International used under license by Khronos. All other product names, trademarks, and/or company names are used solely for identification and belong to their respective owners.

Revision History
Revision 0.1 Jan 2015 AG
 Initial sharing 
Revision 0.2 Feb 2015 AG
 Added clarification, tables, examples 
Revision 0.3 Feb 2015 AG
 Further cleanup 
Revision 0.4 Apr 2015 AG
 Channel ordering standardized 
Revision 0.5 Apr 2015 AG
 Typos and clarification 
Revision 1.0 rev 1 May 2015 AG
 Submission for 1.0 release 
Revision 1.0 rev 2 Jun 2015 AG
 Clarifications for 1.0 release 
Revision 1.0 rev 3 Jul 2015 AG
 Added KHR_DF_SAMPLE_DATATYPE_LINEAR 
Revision 1.0 rev 4 Jul 2015 AG
 Clarified KHR_DF_SAMPLE_DATATYPE_LINEAR 
Revision 1.0 rev 5 Mar 2019 AG
 Clarification and typography 
Revision 1.1 rev 1 Nov 2015 AG
 Added definitions of compressed texture formats 
Revision 1.1 rev 2 Jan 2016 AG
 Added definitions of floating point formats 
Revision 1.1 rev 3 Feb 2016 AG
 Fixed typo in sRGB conversion (thank you, Tom Grim!) 
Revision 1.1 rev 4 Mar 2016 AG
 Fixed typo/clarified sRGB in ASTC, typographical improvements 
Revision 1.1 rev 5 Mar 2016 AG
 Switch to official Khronos logo, removed scripts, restored title 
Revision 1.1 rev 6 Jun 2016 AG
 ASTC block footprint note, fixed credits/changelog/contents 
Revision 1.1 rev 7 Sep 2016 AG
 ASTC multi-point part and quint decode typo fixes 
Revision 1.1 rev 8 Jun 2017 AG
 ETC2 legibility and table typo fix 
Revision 1.1 rev 9 Mar 2019 AG
 Typo fixes and much reformatting 
Revision 1.2 rev 0 Sep 2017 AG
 Added color conversion formulae and extra options 
Revision 1.2 rev 1 Mar 2019 AG
 Typo fixes and much reformatting 
Revision 1.3 Oct 2019 AG/MC
 Updates for KTX2/glTF. BC6h and ASTC table fixes and typo fixes. More examples. 

Table of Contents

1. Introduction
2. Formats and texel access
2.1. 1-D texel addressing
2.2. Simple 2-D texel addressing
2.3. More complex 2-D texel addressing
2.4. 3-dimensional texel addressing
2.5. Downsampled channels
3. The Khronos Data Format Descriptor overview
3.1. Texel blocks in the Khronos Data Format Descriptor
3.2. Planes in the Khronos Data Format Specification
3.3. Bit pattern interpretation and samples
3.4. Canonical representation
3.5. Related concepts outside the “format”
3.6. Translation to API-specific representations
4. Khronos Data Format Descriptor
4.1. Descriptor block
5. Khronos Basic Data Format Descriptor Block
5.1. vendorId
5.2. descriptorType
5.3. versionNumber
5.4. descriptorBlockSize
5.5. colorModel
5.6. colorModel for compressed formats
5.7. colorPrimaries
5.8. transferFunction
5.9. flags
5.10. texelBlockDimension[0..3]
5.11. bytesPlane[0..7]
5.12. Sample information
5.13. Sample bitOffset
5.14. Sample bitLength
5.15. Sample channelType and qualifiers
5.16. samplePosition[0..3]
5.17. sampleLower and sampleUpper
5.18. Paletted formats
5.19. Unsized formats
5.20. C99 struct mapping (informative)
6. Extension for more complex formats
7. Additional planes descriptor block
8. Additional dimensions descriptor block
9. Frequently Asked Questions
9.1. Why have a binary format rather than a human-readable one?
9.2. Why not use an existing representation such as those on FourCC.org?
9.3. Why have a descriptive format?
9.4. Why describe this standard within Khronos?
9.5. Why should I use this descriptor if I don’t need most of the fields?
9.6. Why not expand each field out to be integer for ease of decoding?
9.7. Can this descriptor be used for text content?
10. Floating-point formats
10.1. 16-bit floating-point numbers
10.2. Unsigned 11-bit floating-point numbers
10.3. Unsigned 10-bit floating-point numbers
10.4. Non-standard floating point formats
11. Example format descriptors
12. Introduction to color conversions
12.1. Color space composition
12.2. Operations in a color conversion
13. Transfer functions
13.1. About transfer functions (informative)
13.2. ITU transfer functions
13.3. sRGB transfer functions
13.4. BT.1886 transfer functions
13.5. BT.2100 HLG transfer functions
13.6. BT.2100 PQ transfer functions
13.7. DCI P3 transfer functions
13.8. Legacy NTSC transfer functions
13.9. Legacy PAL OETF
13.10. Legacy PAL 625-line EOTF
13.11. ST240/SMPTE240M transfer functions
13.12. Adobe RGB (1998) transfer functions
13.13. Sony S-Log transfer functions
13.14. Sony S-Log2 transfer functions
13.15. ACEScc transfer function
13.16. ACEScct transfer function
14. Color primaries
14.1. BT.709 color primaries
14.2. BT.601 625-line color primaries
14.3. BT.601 525-line color primaries
14.4. BT.2020 color primaries
14.5. NTSC 1953 color primaries
14.6. PAL 525-line analog color primaries
14.7. ACES color primaries
14.8. ACEScc color primaries
14.9. Display P3 color primaries
14.10. Adobe RGB (1998) color primaries
14.11. BT.709/BT.601 625-line primary conversion example
14.12. BT.709/BT.2020 primary conversion example
15. Color models
15.1. Y′CBCR color model
15.2. Y′CC′BCC′CR constant luminance color model
15.3. ICTCP constant intensity color model
16. Quantization schemes
16.1. “Narrow range” encoding
16.2. “Full range” encoding
16.3. Legacy “full range” encoding
17. Compressed Texture Image Formats
17.1. Terminology
18. S3TC Compressed Texture Image Formats
18.1. BC1 with no alpha
18.2. BC1 with alpha
18.3. BC2
18.4. BC3
19. RGTC Compressed Texture Image Formats
19.1. BC4 unsigned
19.2. BC4 signed
19.3. BC5 unsigned
19.4. BC5 signed
20. BPTC Compressed Texture Image Formats
20.1. BC7
20.2. BC6H
21. ETC1 Compressed Texture Image Formats
21.1. ETC1S
22. ETC2 Compressed Texture Image Formats
22.1. Format RGB ETC2
22.2. Format RGB ETC2 with sRGB encoding
22.3. Format RGBA ETC2
22.4. Format RGBA ETC2 with sRGB encoding
22.5. Format Unsigned R11 EAC
22.6. Format Unsigned RG11 EAC
22.7. Format Signed R11 EAC
22.8. Format Signed RG11 EAC
22.9. Format RGB ETC2 with punchthrough alpha
22.10. Format RGB ETC2 with punchthrough alpha and sRGB encoding
23. ASTC Compressed Texture Image Formats
23.1. What is ASTC?
23.2. Design Goals
23.3. Basic Concepts
23.4. Block Encoding
23.5. LDR and HDR Modes
23.6. Configuration Summary
23.7. Decode Procedure
23.8. Block Determination and Bit Rates
23.9. Block Layout
23.10. Block mode
23.11. Color Endpoint Mode
23.12. Integer Sequence Encoding
23.13. Endpoint Unquantization
23.14. LDR Endpoint Decoding
23.15. HDR Endpoint Decoding
23.16. Weight Decoding
23.17. Weight Unquantization
23.18. Weight Infill
23.19. Weight Application
23.20. Dual-Plane Decoding
23.21. Partition Pattern Generation
23.22. Data Size Determination
23.23. Void-Extent Blocks
23.24. Illegal Encodings
23.25. LDR PROFILE SUPPORT
23.26. HDR PROFILE SUPPORT
24. PVRTC Compressed Texture Image Formats
24.1. PVRTC Overview
24.2. Format PVRTC1 4bpp
24.3. Format PVRTC1 2bpp
24.4. Format PVRTC2 4bpp
24.5. Format PVRTC2 2bpp
25. External references
26. Contributors

Abstract

This document describes a data format specification for non-opaque (user-visible) representations of user data to be used by, and shared between, Khronos standards. The intent of this specification is to avoid replication of incompatible format descriptions between standards and to provide a definitive mechanism for describing data that avoids excluding useful information that may be ignored by other standards. Other APIs are expected to map internal formats to this standard scheme, allowing formats to be shared and compared. This document also acts as a reference for the memory layout of a number of common compressed texture formats, and describes conversion between a number of common color spaces.

1. Introduction

Many APIs operate on bulk data — buffers, images, volumes, etc. — each composed of many elements with a fixed and often simple representation. Frequently, multiple alternative representations of data are supported: vertices can be represented with different numbers of dimensions, textures may have different bit depths and channel orders, and so on. Sometimes the representation of the data is highly specific to the application, but there are many types of data that are common to multiple APIs — and these can reasonably be described in a portable manner. In this standard, the term data format describes the representation of data.

It is typical for each API to define its own enumeration of the data formats on which it can operate. This causes a problem when multiple APIs are in use: the representations are likely to be incompatible, even where the capabilities intersect. When additional format-specific capabilities are added to an API which was designed without them, the description of the data representation often becomes inconsistent and disjoint. Concepts that are unimportant to the core design of an API may be represented simplistically or inaccurately, which can be a problem as the API is enhanced or when data is shared.

Some APIs do not have a strict definition of how to interpret their data. For example, a rendering API may treat all color channels of a texture identically, leaving the interpretation of each channel to the user’s choice of convention. This may be true even if color channels are given names that are associated with actual colors — in some APIs, nothing stops the user from storing the blue quantity in the red channel and the red quantity in the blue channel. Without enforcing a single data interpretation on such APIs, it is nonetheless often useful to offer a clear definition of the color interpretation convention that is in force, both for code maintenance and for communication with external APIs which do have a defined interpretation. Should the user wish to use an unconventional interpretation of the data, an appropriate descriptor can be defined that is specific to this choice, in order to simplify automated interpretation of the chosen representation and to provide concise documentation.

Where multiple APIs are in use, relying on an API-specific representation as an intermediary can cause loss of important information. For example, a camera API may associate color space information with a captured image, and a printer API may be able to operate with that color space, but if the data is passed through an intermediate compute API for processing and that API has no concept of a color space, the useful information may be discarded.

The intent of this standard is to provide a common, consistent, machine-readable way to describe those data formats which are amenable to non-proprietary representation. This standard provides a portable means of storing the most common descriptive information associated with data formats, and an extension mechanism that can be used when this common functionality must be supplemented.

While this standard is intended to support the description of many kinds of data, the most common class of bulk data used in Khronos standards represents color information. For this reason, the range of standard color representations used in Khronos standards is diverse, and a significant portion of this specification is devoted to color formats.

Later sections describe some of the common color space conversion operations and provide a description of the memory layout of a number of common texture compression formats.

2. Formats and texel access

This document describes a standard layout for a data structure that can be used to define the representation of simple, portable, bulk data. Using such a data structure has the following benefits:

  • Ensuring a precise description of the portable data
  • Simplifying the writing of generic functionality that acts on many types of data
  • Offering portability of data between APIs

The “bulk data” may be, for example:

  • Pixel/texel data
  • Vertex data
  • A buffer of simple type

The layout of proprietary data structures is beyond the remit of this specification, but the large number of ways to describe colors, vertices and other repeated data makes standardization useful. The widest variety of standard representations and the most common expected use of this API is to describe pixels or texels; as such the terms “texel” and “pixel” are used interchangeably in this specification when referring to elements of data, without intending to imply a restriction in use.

The data structure in this specification describes the elements in the bulk data in memory, not the layout of the whole. For example, it may describe the size, location and interpretation of color channels within a pixel, but is not responsible for determining the mapping between spatial coordinates and the location of pixels in memory. That is, two textures which share the same pixel layout can share the same descriptor as defined in this specification, but may have different sizes, line or plane strides, tiling or dimensionality; in common parlance, two images that describe (for example) color data in the same way but which are of different shapes or sizes are still described as having the same “format”.

An example pixel representation is described in Figure 1: a single 5:6:5-bit pixel composed of a blue channel in the low 5 bits, a green channel in the next 6 bits, and a red channel in the top 5 bits of a 16-bit word as laid out in memory on a little-endian machine (see Table 89).
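The 5:6:5 layout described above can be sketched in C as follows. This is a minimal illustration rather than part of the descriptor itself: the type Rgb565Channels and the functions load_le16, unpack_rgb565 and pack_rgb565 are hypothetical names introduced here, and the word is read byte by byte so that the little-endian memory layout is honored regardless of the host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the three channels of one 5:6:5 texel. */
typedef struct {
  uint8_t red;   /* 5 bits, 0..31 */
  uint8_t green; /* 6 bits, 0..63 */
  uint8_t blue;  /* 5 bits, 0..31 */
} Rgb565Channels;

/* Read a 16-bit word stored in little-endian order, one byte at a time,
   so the result does not depend on the endianness of the host. */
uint16_t load_le16(const uint8_t *bytes) {
  return (uint16_t)(bytes[0] | (bytes[1] << 8));
}

/* Split the word into channels: blue in bits 0..4, green in bits 5..10,
   red in bits 11..15. */
Rgb565Channels unpack_rgb565(uint16_t word) {
  Rgb565Channels c;
  c.blue = (uint8_t)(word & 0x1Fu);
  c.green = (uint8_t)((word >> 5) & 0x3Fu);
  c.red = (uint8_t)((word >> 11) & 0x1Fu);
  return c;
}

/* Inverse operation: pack channel values back into one 16-bit word. */
uint16_t pack_rgb565(Rgb565Channels c) {
  assert(c.red < 32 && c.green < 64 && c.blue < 32);
  return (uint16_t)((c.red << 11) | (c.green << 5) | c.blue);
}
```

For instance, the two bytes 0x1F, 0xF8 in memory form the word 0xF81F, which unpacks to red = 31, green = 0 and blue = 31.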

Figure 1. A simple one-texel texel block


2.1. 1-D texel addressing

In bulk data, each element is interpreted first by addressing it in some form, then by interpreting the addressed values. Texels often represent a color (or other data) as a multi-dimensional set of values, each representing a channel. The bulk-data image or buffer then describes a number of these texels. Taking the simplest case of an array in the C programming language as an example, a developer might define the following structure to represent a color texel:

typedef struct _MyRGB {
  unsigned char red;
  unsigned char green;
  unsigned char blue;
} MyRGB;

MyRGB *myRGBarray = (MyRGB *) malloc(100 * sizeof(MyRGB));

To determine the location of, for example, the tenth element of myRGBarray, the compiler needs to know the base address of the array and sizeof(MyRGB). Extracting the red, green and blue components of myRGBarray[9] given its base address is, in a sense, orthogonal to finding the base address of myRGBarray[9].

Note also that sizeof(MyRGB) may exceed the total size of red, green and blue due to padding, depending on the compiler and target ABI; the difference in address between one MyRGB and the next can be described as the pixel stride in bytes.
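The two steps discussed above, locating an element and then locating a channel within it, can be sketched as separate calculations. The function names element_offset_bytes, green_offset_bytes and read_green are hypothetical and exist only for this illustration; the structure repeats the MyRGB layout without its tag.

```c
#include <assert.h>
#include <stddef.h>

/* Same members as the MyRGB structure in the text. */
typedef struct {
  unsigned char red;
  unsigned char green;
  unsigned char blue;
} MyRGB;

/* Step 1: byte offset of element `index` from the start of the array.
   The pixel stride is sizeof(MyRGB), which includes any padding. */
size_t element_offset_bytes(size_t index) {
  return index * sizeof(MyRGB);
}

/* Step 2: byte offset of one channel, found by adding the offset of the
   member within the structure to the element offset. */
size_t green_offset_bytes(size_t index) {
  return element_offset_bytes(index) + offsetof(MyRGB, green);
}

/* Fetch the green channel of element `index` using raw byte addressing. */
unsigned char read_green(const void *base, size_t index) {
  const unsigned char *bytes = (const unsigned char *)base;
  assert(base != NULL);
  return bytes[green_offset_bytes(index)];
}
```

Here read_green(myRGBarray, 9) would return the same value as myRGBarray[9].green; the point of the sketch is only that the element address and the channel offset are computed independently.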

Figure 2. (Trivial) 1D address offsets for 1-byte elements, start of buffer


Figure 3. 1D address offsets for 2-byte elements, start of buffer


Figure 4. 1D address offsets for R,G,B elements (padding in gray), start of buffer


An alternative representation is a “structure of arrays”, distinct from the “array of structures” myRGBarray:

typedef struct _MyRGBSoA {
  unsigned char *red;
  unsigned char *green;
  unsigned char *blue;
} MyRGBSoA;

MyRGBSoA myRGBSoA;
myRGBSoA.red = (unsigned char *) malloc(100);
myRGBSoA.green = (unsigned char *) malloc(100);
myRGBSoA.blue = (unsigned char *) malloc(100);

In this case, accessing a value requires the size of each channel element. The best approach depends on the operations performed: calculations on one whole MyRGB at a time likely favor MyRGB, whereas those processing multiple values from a single channel may prefer MyRGBSoA.

A “pixel” need not fill an entire byte, nor need the pixel stride be a whole number of bytes. For example, a C++ std::vector<bool> can be considered to be a 1-D bulk data structure of individual bits.
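As a rough illustration of a pixel stride smaller than a byte, the sketch below addresses 1-bit elements packed into a byte buffer. The functions get_bit and set_bit are hypothetical, and two assumptions are made that the text does not specify: bytes are 8 bits wide, and element 0 occupies the least significant bit of byte 0.

```c
#include <assert.h>
#include <stddef.h>

/* Read 1-bit element `index` from a packed buffer.
   Assumption: 8-bit bytes, least significant bit first. */
unsigned get_bit(const unsigned char *buffer, size_t index) {
  assert(buffer != NULL);
  return (buffer[index / 8] >> (index % 8)) & 1u;
}

/* Write 1-bit element `index`; any nonzero `value` stores a 1. */
void set_bit(unsigned char *buffer, size_t index, unsigned value) {
  unsigned char mask = (unsigned char)(1u << (index % 8));
  assert(buffer != NULL);
  if (value) {
    buffer[index / 8] |= mask;
  } else {
    buffer[index / 8] &= (unsigned char)~mask;
  }
}
```

With this convention, element 10 lives in byte 1 (10 / 8) at bit 2 (10 % 8), so the address calculation splits into a byte offset and a bit offset rather than a single byte stride.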

2.2. Simple 2-D texel addressing

The simplest way to represent two-dimensional data is in consecutive rows, each representing a one-dimensional array, as with a 2-D array in C. There may be padding after each row to achieve the required alignment: in some cases each row should begin at the start of a cache line, or rows may be deliberately offset to different cache lines to ensure that vertically-adjacent values can be cached concurrently. The offset from the start of one horizontal row to the next is a line stride or row stride (or just stride for short), and is necessarily at least the width of the row. If each row holds a whole number of pixels, row stride can be described either in bytes or pixels; it is rare not to start each row on a byte boundary. In a simple 2-D representation, the row stride and the offset from the start of the storage can be described as follows:

\begin{align*}
\textit{row stride}_{\textit{pixels}} &= \textit{width}_{\textit{pixels}} + \textit{padding}_{\textit{pixels}} \\
\textit{row stride}_{\textit{bytes}} &= \textit{width}_{\textit{pixels}} \times \textit{pixel stride}_{\textit{bytes}} + \textit{padding}_{\textit{bytes}} \\
\textit{offset}_{\textit{pixels}} &= x + (y \times \textit{row stride}_{\textit{pixels}}) \\
\textit{address}_{\textit{bytes}} &= \textit{base} + (x \times \textit{pixel stride}_{\textit{bytes}}) + (y \times \textit{row stride}_{\textit{bytes}})
\end{align*}

Figure 5 shows example coordinate byte offsets for a 13×4 buffer, padding row stride to a multiple of four elements.
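The row stride and address formulas above translate directly into code. The following is a minimal sketch under stated assumptions: the structure LinearLayout2D and the functions round_up, make_layout and texel_offset_bytes are hypothetical names for this example, and padding is expressed by rounding the row width up to a multiple of a given number of pixels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a simple linear 2-D layout. */
typedef struct {
  size_t width_pixels;
  size_t height_pixels;
  size_t pixel_stride_bytes;
  size_t row_stride_bytes; /* width plus padding, in bytes */
} LinearLayout2D;

/* Round `value` up to the next multiple of `multiple`. */
size_t round_up(size_t value, size_t multiple) {
  assert(multiple > 0);
  return (value + multiple - 1) / multiple * multiple;
}

/* Build a layout whose rows are padded to a multiple of
   `row_align_pixels` pixels. */
LinearLayout2D make_layout(size_t width, size_t height,
                           size_t pixel_stride_bytes,
                           size_t row_align_pixels) {
  LinearLayout2D layout;
  layout.width_pixels = width;
  layout.height_pixels = height;
  layout.pixel_stride_bytes = pixel_stride_bytes;
  layout.row_stride_bytes =
      round_up(width, row_align_pixels) * pixel_stride_bytes;
  return layout;
}

/* Byte offset of texel (x, y) from the base address:
   x * pixel stride + y * row stride. */
size_t texel_offset_bytes(const LinearLayout2D *layout, size_t x, size_t y) {
  assert(x < layout->width_pixels && y < layout->height_pixels);
  return x * layout->pixel_stride_bytes + y * layout->row_stride_bytes;
}
```

For a 13×4 buffer of one-byte elements with rows padded to a multiple of four elements, make_layout(13, 4, 1, 4) gives a row stride of 16 bytes, so texel (12, 3) lies at byte offset 12 + 3 × 16 = 60.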

Figure 5. 2D linear texel offsets from coordinates


Table 1. Order of byte storage in memory for coordinates in a linear 5×3 buffer, padded (shown in parentheses) to 8×3

Byte     0      1      2      3      4      5      6      7
Coords   0,0    1,0    2,0    3,0    4,0    (5,0)  (6,0)  (7,0)

Byte     8      9      10     11     12     13     14     15
Coords   0,1    1,1    2,1    3,1    4,1    (5,1)  (6,1)  (7,1)

Byte     16     17     18     19     20     21     22     23
Coords   0,2    1,2    2,2    3,2    4,2    (5,2)  (6,2)  (7,2)

Figure 6. 2D R,G,B byte offsets (padding in gray) from coordinates for a 4×4 image