## Khronos Data Format Specification

Khronos Data Format Specification License Information

This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part.

This version of the Data Format Specification is published and copyrighted by Khronos, but is not a Khronos ratified specification. Accordingly, it does not fall within the scope of the Khronos IP policy, except to the extent that sections of it are normatively referenced in ratified Khronos specifications. Such references incorporate the referenced sections into the ratified specifications, and bring those sections into the scope of the policy for those specifications.

Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be reformatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current version of this specification on the Khronos Group website should be included whenever possible with specification distributions.

Khronos Group makes no, and expressly disclaims any, representations or warranties, express or implied, regarding this specification, including, without limitation, any implied warranties of merchantability or fitness for a particular purpose or non-infringement of any intellectual property. Khronos Group makes no, and expressly disclaims any, warranties, express or implied, regarding the correctness, accuracy, completeness, timeliness, and reliability of the specification. Under no circumstances will the Khronos Group, or any of its Promoters, Contributors or Members or their respective partners, officers, directors, employees, agents, or representatives be liable for any damages, whether direct, indirect, special or consequential damages for lost revenues, lost profits, or otherwise, arising from or in connection with these materials.

Khronos, SYCL, SPIR, WebGL, EGL, COLLADA, StreamInput, OpenVX, OpenKCam, glTF, OpenKODE, OpenVG, OpenWF, OpenSL ES, OpenMAX, OpenMAX AL, OpenMAX IL and OpenMAX DL are trademarks and WebCL is a certification mark of the Khronos Group Inc. OpenCL is a trademark of Apple Inc. and OpenGL and OpenML are registered trademarks and the OpenGL ES and OpenGL SC logos are trademarks of Silicon Graphics International used under license by Khronos. All other product names, trademarks, and/or company names are used solely for identification and belong to their respective owners.

Revision History
| Revision | Date | Author | Changes |
|---|---|---|---|
| 0.1 | Jan 2015 | AG | Initial sharing |
| 0.2 | Feb 2015 | AG | |
| 0.3 | Feb 2015 | AG | Further cleanup |
| 0.4 | Apr 2015 | AG | Channel ordering standardized |
| 0.5 | Apr 2015 | AG | Typos and clarification |
| 1.0 rev 1 | May 2015 | AG | Submission for 1.0 release |
| 1.0 rev 2 | Jun 2015 | AG | Clarifications for 1.0 release |
| 1.0 rev 3 | Jul 2015 | AG | |
| 1.0 rev 4 | Jul 2015 | AG | Clarified KHR_DF_SAMPLE_DATATYPE_LINEAR |
| 1.0 rev 5 | Mar 2019 | AG | Clarification and typography |
| 1.1 rev 1 | Nov 2015 | AG | Added definitions of compressed texture formats |
| 1.1 rev 2 | Jan 2016 | AG | Added definitions of floating point formats |
| 1.1 rev 3 | Feb 2016 | AG | Fixed typo in sRGB conversion (thank you, Tom Grim!) |
| 1.1 rev 4 | Mar 2016 | AG | Fixed typo/clarified sRGB in ASTC, typographical improvements |
| 1.1 rev 5 | Mar 2016 | AG | Switch to official Khronos logo, removed scripts, restored title |
| 1.1 rev 6 | Jun 2016 | AG | ASTC block footprint note, fixed credits/changelog/contents |
| 1.1 rev 7 | Sep 2016 | AG | ASTC multi-point part and quint decode typo fixes |
| 1.1 rev 8 | Jun 2017 | AG | ETC2 legibility and table typo fix |
| 1.1 rev 9 | Mar 2019 | AG | Typo fixes and much reformatting |
| 1.2 rev 0 | Sep 2017 | AG | Added color conversion formulae and extra options |
| 1.2 rev 1 | Mar 2019 | AG | Typo fixes and much reformatting |
| 1.3 | Oct 2019 | AG/MC | Updates for KTX2/glTF. BC6h and ASTC table fixes and typo fixes. More examples. |

Abstract

This document describes a data format specification for non-opaque (user-visible) representations of user data to be used by, and shared between, Khronos standards. The intent of this specification is to avoid replication of incompatible format descriptions between standards and to provide a definitive mechanism for describing data that avoids excluding useful information that may be ignored by other standards. Other APIs are expected to map internal formats to this standard scheme, allowing formats to be shared and compared. This document also acts as a reference for the memory layout of a number of common compressed texture formats, and describes conversion between a number of common color spaces.

## 1. Introduction

Many APIs operate on bulk data — buffers, images, volumes, etc. — each composed of many elements with a fixed and often simple representation. Frequently, multiple alternative representations of data are supported: vertices can be represented with different numbers of dimensions, textures may have different bit depths and channel orders, and so on. Sometimes the representation of the data is highly specific to the application, but there are many types of data that are common to multiple APIs — and these can reasonably be described in a portable manner. In this standard, the term data format describes the representation of data.

It is typical for each API to define its own enumeration of the data formats on which it can operate. This causes a problem when multiple APIs are in use: the representations are likely to be incompatible, even where the capabilities intersect. When additional format-specific capabilities are added to an API which was designed without them, the description of the data representation often becomes inconsistent and disjoint. Concepts that are unimportant to the core design of an API may be represented simplistically or inaccurately, which can be a problem as the API is enhanced or when data is shared.

Some APIs do not have a strict definition of how to interpret their data. For example, a rendering API may treat all color channels of a texture identically, leaving the interpretation of each channel to the user’s choice of convention. This may be true even if color channels are given names that are associated with actual colors — in some APIs, nothing stops the user from storing the blue quantity in the red channel and the red quantity in the blue channel. Without enforcing a single data interpretation on such APIs, it is nonetheless often useful to offer a clear definition of the color interpretation convention that is in force, both for code maintenance and for communication with external APIs which do have a defined interpretation. Should the user wish to use an unconventional interpretation of the data, an appropriate descriptor can be defined that is specific to this choice, in order to simplify automated interpretation of the chosen representation and to provide concise documentation.

Where multiple APIs are in use, relying on an API-specific representation as an intermediary can cause loss of important information. For example, a camera API may associate color space information with a captured image, and a printer API may be able to operate with that color space, but if the data is passed through an intermediate compute API for processing and that API has no concept of a color space, the useful information may be discarded.

The intent of this standard is to provide a common, consistent, machine-readable way to describe those data formats which are amenable to non-proprietary representation. This standard provides a portable means of storing the most common descriptive information associated with data formats, and an extension mechanism that can be used when this common functionality must be supplemented.

While this standard is intended to support the description of many kinds of data, the most common class of bulk data used in Khronos standards represents color information. For this reason, the range of standard color representations used in Khronos standards is diverse, and a significant portion of this specification is devoted to color formats.

Later sections describe some of the common color space conversion operations and provide a description of the memory layout of a number of common texture compression formats.

## 2. Formats and texel access

This document describes a standard layout for a data structure that can be used to define the representation of simple, portable, bulk data. Using such a data structure has the following benefits:

• Ensuring a precise description of the portable data
• Simplifying the writing of generic functionality that acts on many types of data
• Offering portability of data between APIs

The “bulk data” may be, for example:

• Pixel/texel data
• Vertex data
• A buffer of simple type

The layout of proprietary data structures is beyond the remit of this specification, but the large number of ways to describe colors, vertices and other repeated data makes standardization useful. The widest variety of standard representations and the most common expected use of this API is to describe pixels or texels; as such the terms “texel” and “pixel” are used interchangeably in this specification when referring to elements of data, without intending to imply a restriction in use.

The data structure in this specification describes the elements in the bulk data in memory, not the layout of the whole. For example, it may describe the size, location and interpretation of color channels within a pixel, but is not responsible for determining the mapping between spatial coordinates and the location of pixels in memory. That is, two textures which share the same pixel layout can share the same descriptor as defined in this specification, but may have different sizes, line or plane strides, tiling or dimensionality; in common parlance, two images that describe (for example) color data in the same way but which are of different shapes or sizes are still described as having the same “format”.

An example pixel representation is described in Figure 1: a single 5:6:5-bit pixel composed of a blue channel in the low 5 bits, a green channel in the next 6 bits, and red channel in the top 5 bits of a 16-bit word as laid out in memory on a little-endian machine (see Table 89).
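The 5:6:5 layout described above can be illustrated with a minimal C sketch. The function names (`pack_rgb565`, `rgb565_red`, and so on) are hypothetical and not part of this specification; the sketch assumes the channel values have already been quantized to 5, 6 and 5 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Pack quantized channel values into one 16-bit word:
   blue in bits 0..4, green in bits 5..10, red in bits 11..15. */
static uint16_t pack_rgb565(unsigned red, unsigned green, unsigned blue)
{
    assert(red < 32u && green < 64u && blue < 32u);
    return (uint16_t)((red << 11) | (green << 5) | blue);
}

/* Extract each channel again by shifting and masking. */
static unsigned rgb565_red(uint16_t texel)   { return (texel >> 11) & 0x1Fu; }
static unsigned rgb565_green(uint16_t texel) { return (texel >> 5) & 0x3Fu; }
static unsigned rgb565_blue(uint16_t texel)  { return texel & 0x1Fu; }
```

When such a word is stored on a little-endian machine, the byte at the lower address holds the blue channel and the low three bits of green, and the next byte holds the high three bits of green and the red channel.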

In bulk data, each element is interpreted first by addressing it in some form, then by interpreting the addressed values. Texels often represent a color (or other data) as a multi-dimensional set of values, each representing a channel. The bulk-data image or buffer then describes a number of these texels. Taking the simplest case of an array in the C programming language as an example, a developer might define the following structure to represent a color texel:

    typedef struct _MyRGB {
        unsigned char red;
        unsigned char green;
        unsigned char blue;
    } MyRGB;

    MyRGB *myRGBarray = (MyRGB *) malloc(100 * sizeof(MyRGB));

To determine the location of, for example, the tenth element of myRGBarray, the compiler needs to know the base address of the array and sizeof(MyRGB). Extracting the red, green and blue components of myRGBarray[9] given its base address is, in a sense, orthogonal to finding the base address of myRGBarray[9].

Note also that sizeof(MyRGB) may exceed the total size of red, green and blue if the compiler inserts padding; the difference in address between one MyRGB and the next can be described as the pixel stride in bytes.
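The separation between locating an element and interpreting it can be sketched in C as follows. This is an illustrative sketch only: the helper names `my_rgb_element_offset` and `my_rgb_green_at` are hypothetical, and the structure is repeated so that the block is self-contained.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned char red;
    unsigned char green;
    unsigned char blue;
} MyRGB;

/* Step 1: addressing. The byte offset of an element from the array base
   is the index multiplied by the pixel stride, here sizeof(MyRGB). */
static size_t my_rgb_element_offset(size_t index)
{
    return index * sizeof(MyRGB);
}

/* Step 2: interpretation. Within the addressed element, each channel
   sits at a fixed offset that does not depend on the index. */
static unsigned char my_rgb_green_at(const MyRGB *base, size_t index)
{
    const unsigned char *bytes = (const unsigned char *)base;
    assert(base != NULL);
    return bytes[my_rgb_element_offset(index) + offsetof(MyRGB, green)];
}
```

Using sizeof(MyRGB) rather than a hard-coded 3 means that any padding the compiler adds is included in the stride automatically.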

An alternative representation is a “structure of arrays”, distinct from the “array of structures” myRGBarray:

    typedef struct _MyRGBSoA {
        unsigned char *red;
        unsigned char *green;
        unsigned char *blue;
    } MyRGBSoA;

    MyRGBSoA myRGBSoA;
    myRGBSoA.red = (unsigned char *) malloc(100);
    myRGBSoA.green = (unsigned char *) malloc(100);
    myRGBSoA.blue = (unsigned char *) malloc(100);

In this case, accessing a value requires the size of each channel element. The best approach depends on the operations performed: calculations on one whole MyRGB at a time likely favor MyRGB, whereas those processing multiple values from a single channel may prefer MyRGBSoA. A “pixel” need not fill an entire byte, nor need the pixel stride be a whole number of bytes. For example, a C++ std::vector<bool> can be considered to be a 1-D bulk data structure of individual bits.
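As a rough illustration of a pixel stride smaller than a byte, the following C sketch addresses 1-bit elements in a packed buffer. The function names are hypothetical, and the bit order within each byte (lowest-numbered element in the least-significant bit) is an assumption made for this example; other conventions are equally possible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read 1-bit element `index`: the byte is found by dividing the index
   by 8, and the bit within that byte by the remainder.
   Assumed convention: element 0 occupies bit 0 of byte 0. */
static unsigned get_bit_pixel(const uint8_t *buffer, size_t index)
{
    assert(buffer != NULL);
    return (buffer[index / 8] >> (index % 8)) & 1u;
}

/* Write 1-bit element `index` without disturbing its neighbors. */
static void set_bit_pixel(uint8_t *buffer, size_t index, unsigned value)
{
    uint8_t mask = (uint8_t)(1u << (index % 8));
    assert(buffer != NULL);
    if (value) {
        buffer[index / 8] |= mask;
    } else {
        buffer[index / 8] &= (uint8_t)~mask;
    }
}
```

Here the address of an element is no longer a byte address alone: it is a byte address plus a bit offset, which is why a pixel stride expressed in bytes cannot describe this layout.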

### 2.2. Simple 2-D texel addressing

The simplest way to represent two-dimensional data is in consecutive rows, each representing a one-dimensional array, as with a 2-D array in C. There may be padding after each row to achieve the required alignment: in some cases each row should begin at the start of a cache line, or rows may be deliberately offset to different cache lines to ensure that vertically-adjacent values can be cached concurrently. The offset from the start of one horizontal row to the next is a line stride or row stride (or just stride for short), and is necessarily at least the width of the row. If each row holds a whole number of pixels, row stride can be described either in bytes or pixels; it is rare not to start each row on a byte boundary. In a simple 2-D representation, the row stride and the offset from the start of the storage can be described as follows:

\begin{align*}
\textit{row stride}_\textit{pixels} &= \textit{width}_\textit{pixels} + \textit{padding}_\textit{pixels} \\
\textit{row stride}_\textit{bytes} &= \textit{width}_\textit{pixels} \times \textit{pixel stride}_\textit{bytes} + \textit{padding}_\textit{bytes} \\
\textit{offset}_\textit{pixels} &= x + (y \times \textit{row stride}_\textit{pixels}) \\
\textit{address}_\textit{bytes} &= \textit{base} + (x \times \textit{pixel stride}_\textit{bytes}) + (y \times \textit{row stride}_\textit{bytes})
\end{align*}
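These relationships can be sketched in C as below. The function and parameter names are hypothetical, and rounding each row up to a multiple of an alignment is only one possible way of choosing the padding; an implementation may pad rows differently.

```c
#include <assert.h>
#include <stddef.h>

/* Row stride in bytes: the row's pixel data plus whatever padding is
   needed to reach the next multiple of `alignment_bytes` (an assumed
   padding policy). */
static size_t row_stride_bytes(size_t width_pixels,
                               size_t pixel_stride_bytes,
                               size_t alignment_bytes)
{
    size_t row_bytes = width_pixels * pixel_stride_bytes;
    assert(alignment_bytes > 0);
    return (row_bytes + alignment_bytes - 1) / alignment_bytes * alignment_bytes;
}

/* Byte offset of texel (x, y) from the base address of the storage. */
static size_t texel_offset_bytes(size_t x, size_t y,
                                 size_t pixel_stride_bytes,
                                 size_t row_stride)
{
    return x * pixel_stride_bytes + y * row_stride;
}
```

For example, a buffer 5 pixels wide with one byte per pixel and rows padded to 8 bytes has a row stride of 8, so the texel at coordinates (4, 2) lies at byte offset 4 + 2 × 8 = 20.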

Figure 5 shows example coordinate byte offsets for a 13×4 buffer, padding row stride to a multiple of four elements.

Table 1. Order of byte storage in memory for coordinates in a linear 5×3 buffer, padded (italics) to 8×3

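| Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Coords | 0,0 | 1,0 | 2,0 | 3,0 | 4,0 | *5,0* | *6,0* | *7,0* |
| Byte | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
| Coords | 0,1 | 1,1 | 2,1 | 3,1 | 4,1 | *5,1* | *6,1* | *7,1* |
| Byte | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
| Coords | 0,2 | 1,2 | 2,2 | 3,2 | 4,2 | *5,2* | *6,2* | *7,2* |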

Figure 6. 2D R,G,B byte offsets (padding in gray) from coordinates for a 4×4 image