Andrew Garrard

Khronos Data Format Specification License Information

Copyright (C) 2014-2017 The Khronos Group Inc. All Rights Reserved.

This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part.

This version of the Data Format Specification is published and copyrighted by Khronos, but is not a Khronos ratified specification. Accordingly, it does not fall within the scope of the Khronos IP policy, except to the extent that sections of it are normatively referenced in ratified Khronos specifications. Such references incorporate the referenced sections into the ratified specifications, and bring those sections into the scope of the policy for those specifications.

Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be reformatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current version of this specification on the Khronos Group website should be included whenever possible with specification distributions.

Khronos Group makes no, and expressly disclaims any, representations or warranties, express or implied, regarding this specification, including, without limitation, any implied warranties of merchantability or fitness for a particular purpose or non-infringement of any intellectual property. Khronos Group makes no, and expressly disclaims any, warranties, express or implied, regarding the correctness, accuracy, completeness, timeliness, and reliability of the specification. Under no circumstances will the Khronos Group, or any of its Promoters, Contributors or Members or their respective partners, officers, directors, employees, agents, or representatives be liable for any damages, whether direct, indirect, special or consequential damages for lost revenues, lost profits, or otherwise, arising from or in connection with these materials.

Khronos, SYCL, SPIR, WebGL, EGL, COLLADA, StreamInput, OpenVX, OpenKCam, glTF, OpenKODE, OpenVG, OpenWF, OpenSL ES, OpenMAX, OpenMAX AL, OpenMAX IL and OpenMAX DL are trademarks and WebCL is a certification mark of the Khronos Group Inc. OpenCL is a trademark of Apple Inc. and OpenGL and OpenML are registered trademarks and the OpenGL ES and OpenGL SC logos are trademarks of Silicon Graphics International used under license by Khronos. All other product names, trademarks, and/or company names are used solely for identification and belong to their respective owners.

Revision History
Revision 0.1, Jan 2015, AG
Initial sharing
Revision 0.2, Feb 2015, AG
Added clarification, tables, examples
Revision 0.3, Feb 2015, AG
Further cleanup
Revision 0.4, Apr 2015, AG
Channel ordering standardised
Revision 0.5, Apr 2015, AG
Typos and clarification
Revision 1.0, May 2015, AG
Submission for 1.0 release
Revision 1.0 rev 2, June 2015, AG
Clarifications for 1.0 release
Revision 1.0 rev 3, July 2015, AG
Added KHR_DF_SAMPLE_DATATYPE_LINEAR
Revision 1.0 rev 4, July 2015, AG
Clarified KHR_DF_SAMPLE_DATATYPE_LINEAR
Revision 1.1, November 2015, AG
Added definitions of compressed texture formats
Revision 1.1 rev 2, January 2016, AG
Added definitions of floating point formats
Revision 1.1 rev 3, February 2016, AG
Fixed typo in sRGB conversion (thank you, Tom Grim!)
Revision 1.1 rev 4, March 2016, AG
Fixed typo/clarified sRGB in ASTC, typographical improvements
Revision 1.1 rev 5, March 2016, AG
Switch to official Khronos logo, removed scripts, restored title
Revision 1.1 rev 6, June 2016, AG
ASTC "block footprint" note, fixed credits/changelog/contents
Revision 1.1 rev 7, September 2016, AG
ASTC multi-point "part" and quint decode typo fixes
Revision 1.1 rev 8, June 2017, AG
ETC2 legibility and table typo fix
Revision 1.2 rev 0, September 2017, AG
Added color conversion formulae and extra options

Table of Contents

1. Introduction
2. Overview
3. Required concepts not in the “format”
4. Translation to API-specific representations
5. Data format descriptor
6. Descriptor block
7. Khronos Basic Data Format Descriptor Block
7.1. vendor_id
7.2. descriptor_type
7.3. version_number
7.4. descriptor_block_size
7.5. color_model
7.6. color_primaries
7.7. transfer_function
7.8. flags
7.9. texel_block_dimensions_[0..3]
7.10. bytes_plane_[0..7]
7.11. Sample information
8. Extension for more complex formats
9. Frequently Asked Questions
9.1. Why have a binary format rather than a human-readable one?
9.2. Why not use an existing representation such as those on FourCC.org?
9.3. Why have a descriptive format?
9.4. Why describe this standard within Khronos?
9.5. Why should I use this format if I don’t need most of the fields?
9.6. Why not expand each field out to be integer for ease of decoding?
9.7. Can this descriptor be used for text content?
10. S3TC Compressed Texture Image Formats
10.1. BC1 with no alpha
10.2. BC1 with alpha
10.3. BC2
10.4. BC3
11. RGTC Compressed Texture Image Formats
11.1. BC4 unsigned
11.2. BC4 signed
11.3. BC5 unsigned
11.4. BC5 signed
12. BPTC Compressed Texture Image Formats
12.1. BC7
12.2. BC6H
13. ETC1 Compressed Texture Image Formats
14. ETC2 Compressed Texture Image Formats
14.1. Format RGB ETC2
14.2. Format RGB ETC2 with sRGB encoding
14.3. Format RGBA ETC2
14.4. Format RGBA ETC2 with sRGB encoding
14.5. Format Unsigned R11 EAC
14.6. Format Unsigned RG11 EAC
14.7. Format Signed R11 EAC
14.8. Format Signed RG11 EAC
14.9. Format RGB ETC2 with punchthrough alpha
14.10. Format RGB ETC2 with punchthrough alpha and sRGB encoding
15. ASTC Compressed Texture Image Formats
15.1. What is ASTC?
15.2. Design Goals
15.3. Basic Concepts
15.4. Block Encoding
15.5. LDR and HDR Modes
15.6. Configuration Summary
15.7. Decode Procedure
15.8. Block Determination and Bit Rates
15.9. Block Layout
15.10. Block Mode
15.11. Color Endpoint Mode
15.12. Integer Sequence Encoding
15.13. Endpoint Unquantization
15.14. LDR Endpoint Decoding
15.15. HDR Endpoint Decoding
15.16. Weight Decoding
15.17. Weight Unquantization
15.18. Weight Infill
15.19. Weight Application
15.20. Dual-Plane Decoding
15.21. Partition Pattern Generation
15.22. Data Size Determination
15.23. Void-Extent Blocks
15.24. Illegal Encodings
15.25. LDR PROFILE SUPPORT
15.26. HDR PROFILE SUPPORT
16. Floating-point formats
16.1. 16-bit floating-point numbers
16.2. Unsigned 11-bit floating-point numbers
16.3. Unsigned 10-bit floating-point numbers
16.4. Non-standard floating point formats
16.5. The exponent
16.6. Special values
16.7. Conversion formulae
17. Introduction to color conversions
17.1. Color space composition
17.2. Operations in a color conversion
18. Transfer functions
18.1. About transfer functions (informative)
18.2. ITU transfer functions
18.3. sRGB transfer functions
18.4. BT.1886 transfer functions
18.5. BT.2100 HLG transfer functions
18.6. BT.2100 PQ transfer functions
18.7. DCI P3 transfer functions
18.8. Legacy NTSC transfer functions
18.9. Legacy PAL OETF
18.10. Legacy PAL 625-line EOTF
18.11. ST240/SMPTE240M transfer functions
18.12. Adobe RGB (1998) transfer functions
18.13. Sony S-Log transfer functions
18.14. Sony S-Log2 transfer functions
18.15. ACEScc transfer function
18.16. ACEScct transfer function
19. Color primaries
19.1. BT.709 color primaries
19.2. BT.601 625-line color primaries
19.3. BT.601 525-line color primaries
19.4. BT.2020 color primaries
19.5. NTSC 1953 color primaries
19.6. PAL 525-line analog color primaries
19.7. ACES color primaries
19.8. ACEScc color primaries
19.9. Display P3 color primaries
19.10. Adobe RGB (1998) color primaries
19.11. BT.709/BT.601 625-line primary conversion example
19.12. BT.709/BT.2020 primary conversion example
20. Color models
20.1. Y’CBCR color model
20.2. Y'CC'BCC'CR constant luminance color model
20.3. ICTCP constant intensity color model
21. Quantization schemes
21.1. “Narrow range” encoding
21.2. “Full range” encoding
21.3. Legacy “full range” encoding
22. External references
22.1. IEEE754-2008 - IEEE standard for floating-point arithmetic
22.2. CIE Colorimetry - Part 3: CIE tristimulus values
22.3. ITU-R BT.601 Studio encoding parameters of digital television for standard 4:3 and wide-screen 16:9 aspect ratios
22.4. ITU-R BT.709 Parameter values for the HDTV standards for production and international programme exchange
22.5. ITU-R BT.2020 Parameter values for ultra-high definition television systems for production and international programme exchange
22.6. ITU-R BT.2100 Image parameter values for high dynamic range television for use in production and international programme exchange
22.7. JPEG File Interchange Format (JFIF)
22.8. ITU-R BT.1886: Reference electro-optical transfer function for flat panel displays used in HDTV studio production
22.9. ITU-R BT.2087 Colour conversion from Recommendation ITU-R BT.709 to Recommendation ITU-R BT.2020
22.10. ITU-R BT.2390-1 High dynamic range television for production and international programme exchange
22.11. ITU-R BT.470 Conventional analogue television systems
22.12. ITU-R BT.472-3: Video-frequency characteristics of a television system to be used for the international exchange of programmes between countries that have adopted 625-line colour or monochrome systems
22.13. ITU-R BT.1700: Characteristics of composite video signals for conventional analogue television systems
22.14. ITU-R BT.2043: Analogue television systems currently in use throughout the world
22.15. SMPTE 170m Composite analog video signal — NTSC for studio applications
22.16. FCC 73.682 - TV transmission standards
22.17. ST 240:1999 - SMPTE Standard - For Television — 1125-Line High-Definition Production Systems — Signal Parameters
22.18. IEC/4WD 61966-2-1: Colour measurement and management in multimedia systems and equipment - part 2-1: default RGB colour space - sRGB
22.19. IEC 61966-2-2:2003: Multimedia systems and equipment — Colour measurement and management — Part 2-2: Colour management — Extended RGB colour space — scRGB
22.20. DCI P3 color space
22.21. Academy Color Encoding System
22.22. Sony S-Log
22.23. Adobe RGB (1998)
23. Example format descriptors
24. Contributors

Abstract

This document describes a data format specification for non-opaque (user-visible) representations of user data to be used by, and shared between, Khronos standards. The intent of this specification is to avoid replication of incompatible format descriptions between standards and to provide a definitive mechanism for describing data that avoids excluding useful information that may be ignored by other standards. Other APIs are expected to map internal formats to this standard scheme, allowing formats to be shared and compared. This document also acts as a reference for the memory layout of a number of common compressed texture formats, and describes conversion between a number of common color spaces.

1. Introduction

Many APIs operate on bulk data — buffers, images, volumes, etc. — each composed of many elements with a fixed and often simple representation. Frequently, multiple alternative representations of data are supported: vertices can be represented with different numbers of dimensions, textures may have different bit depths and channel orders, and so on. Sometimes the representation of the data is highly specific to the application, but there are many types of data that are common to multiple APIs — and these can reasonably be described in a portable manner. In this standard, the term data format describes the representation of data.

It is typical for each API to define its own enumeration of the data formats on which it can operate. This causes a problem when multiple APIs are in use: the representations are likely to be incompatible, even where the capabilities intersect. When additional format-specific capabilities are added to an API which was designed without them, the description of the data representation often becomes inconsistent and disjoint. Concepts that are unimportant to the core design of an API may be represented simplistically or inaccurately, which can be a problem as the API is enhanced or when data is shared.

Some APIs do not have a strict definition of how to interpret their data. For example, a rendering API may treat all color channels of a texture identically, leaving the interpretation of each channel to the user’s choice of convention. This may be true even if color channels are given names that are associated with actual colors — in some APIs, nothing stops the user from storing the blue quantity in the red channel and the red quantity in the blue channel. Without enforcing a single data interpretation on such APIs, it is nonetheless often useful to offer a clear definition of the color interpretation convention that is in force, both for code maintenance and for communication with external APIs which do have a defined interpretation. Should the user wish to use an unconventional interpretation of the data, an appropriate descriptor can be defined that is specific to this choice, in order to simplify automated interpretation of the chosen representation and to provide concise documentation.

Where multiple APIs are in use, relying on an API-specific representation as an intermediary can cause loss of important information. For example, a camera API may associate color space information with a captured image, and a printer API may be able to operate with that color space, but if the data is passed through an intermediate compute API for processing and that API has no concept of a color space, the useful information may be discarded.

The intent of this standard is to provide a common, consistent, machine-readable way to describe those data formats which are amenable to non-proprietary representation. This standard provides a portable means of storing the most common descriptive information associated with data formats, and an extension mechanism that can be used when this common functionality must be supplemented.

While this standard is intended to support the description of many kinds of data, the most common class of bulk data used in Khronos standards represents color information. For this reason, the range of standard color representations used in Khronos standards is diverse, and a significant portion of this specification is devoted to color formats.

Later sections provide a description of the memory layout of a number of common texture compression formats, and describe some of the common color space conversions.

2. Overview

This document describes a standard layout for a data structure that can be used to define the representation of simple, portable, bulk data. Using such a data structure has the following benefits:

  • Ensuring a precise description of the portable data
  • Simplifying the writing of generic functionality that acts on many types of data
  • Offering portability of data between APIs

The “bulk data” may be, for example:

  • Pixel/texel data
  • Vertex data
  • A buffer of simple type

The layout of proprietary data structures is beyond the remit of this specification, but the large number of ways to describe colors, vertices and other repeated data makes standardization useful.

The data structure in this specification describes the elements in the bulk data, not the layout of the whole. For example, it may describe the size, location and interpretation of color channels within a pixel, but is not responsible for determining the mapping between spatial coordinates and the location of pixels in memory. That is, two textures which share the same pixel layout can share the same descriptor as defined in this specification, but may have different sizes, line strides, tiling or dimensionality. An example pixel format is described in Table 1, “A simple texel block — three channels representing one pixel”.

Table 1. A simple texel block — three channels representing one pixel

| Red | Green | Blue |


In some cases, the elements of bulk texture data may not correspond to a conventional texel. For example, in a compressed texture it is common for the atomic element of the buffer to represent a rectangular block of texels. Alternatively the representation of the output of a camera may have a repeating pattern according to a Bayer or other layout, as shown in Table 2, “A simple Bayer-pattern texel block — three channels spread across 2×2 pixels”. It is this repeating and self-contained atomic unit, termed a texel block, that is described by this standard.

Table 2. A simple Bayer-pattern texel block — three channels spread across 2×2 pixels

| Red   | Green |
| Green | Blue  |


The sampling or reconstruction of texel data is not a function of the data format. That is, a texture has the same format whether it is point sampled or a bicubic filter is used, and the manner of reconstructing full color data from a camera sensor is not defined. Where information making up the data format has a spatial aspect, this is part of the descriptor: the descriptor defines the spatial configuration of color samples in a Bayer sensor, or whether the chroma difference channels in a Y’CBCR format are considered to be centered or co-sited, but not how this information must be used to generate coordinate-aligned full color values.

The data structure defined in this specification is termed a data format descriptor. This is an extensible block of contiguous memory, with a defined layout. The size of the data format descriptor depends on its content, but is also stored in a field at the start of the descriptor, making it possible to copy the data structure without needing to interpret all possible contents.

The data format descriptor is divided into one or more descriptor blocks, each also consisting of contiguous data, as shown in Table 3, “Data format descriptor and descriptor blocks”. These descriptor blocks may, themselves, be of different sizes, depending on the data contained within. The size of a descriptor block is stored as part of its data structure, allowing applications to process a data format descriptor while skipping contained descriptor blocks that they do not need to understand. The data format descriptor mechanism is extensible by the addition of new descriptor blocks.

Table 3. Data format descriptor and descriptor blocks

Data format descriptor

Descriptor block 1

Descriptor block 2

:
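
As an informal illustration of skipping descriptor blocks by size, the following C sketch counts the blocks held in a descriptor. It is not part of the specification. It assumes that the size field at the start of the descriptor counts its own four bytes, and that each block stores its 16-bit size in the upper half of the second 32-bit word of its header, as in the descriptor block header defined later in this specification. The names dfd_read_u32 and dfd_count_blocks are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t dfd_read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Count the descriptor blocks in a data format descriptor without
 * interpreting their contents. "available" is the number of readable
 * bytes at "dfd". Returns -1 if the sizes are inconsistent.
 *
 * Assumptions (not stated in the surrounding text): the leading size
 * field includes its own four bytes, and the block size is held in
 * bits 16..31 of the second word of each 8-byte block header.
 */
static int dfd_count_blocks(const uint8_t *dfd, size_t available)
{
    assert(dfd != NULL);
    if (available < 4)
        return -1;

    uint32_t total_size = dfd_read_u32(dfd);
    if (total_size < 4 || total_size > available)
        return -1;

    size_t offset = 4;   /* first block follows the size field */
    int count = 0;
    while (offset < total_size) {
        if (total_size - offset < 8)
            return -1;   /* not enough room for a block header */
        uint32_t block_size = dfd_read_u32(dfd + offset + 4) >> 16;
        if (block_size < 8 || block_size > total_size - offset)
            return -1;   /* block would overrun the descriptor */
        offset += block_size;   /* skip the block unread */
        count++;
    }
    return count;
}
```

A reader that only understands one block type could use the same loop, inspecting the first header word of each block and continuing past any block it does not recognize.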


The diversity of possible data makes a concise description that can support every possible format impractical. This document describes one type of descriptor block, a basic descriptor block, that is expected to be the first descriptor block inside the data format descriptor where present, and which is sufficient for a large number of common formats, particularly for pixels. Formats which cannot be described within this scheme can use additional descriptor blocks of other types as necessary.

Later sections of this specification provide a description of the in-memory representation of a number of common compressed texture formats, and describe several common color spaces.

Glossary

Data format The interpretation of individual elements in bulk data. Examples include the channel ordering and bit positions in pixel data or the configuration of samples in a Bayer image. The format describes the elements, not the bulk data itself: an image’s size, stride, tiling, dimensionality, border control modes, and image reconstruction filter are not part of the format and are the responsibility of the application.

Data format descriptor A contiguous block of memory containing information about how data is represented, in accordance with this specification. A data format descriptor is a container, within which can be found one or more descriptor blocks. This specification does not define where or how the data format descriptor should be stored, only its content. For example, the descriptor may be directly prepended to the bulk data, perhaps as part of a file format header, or the descriptor may be stored in CPU memory while the bulk data that it describes resides within GPU memory; this choice is application-specific.

(Data format) descriptor block A contiguous block of memory with a defined layout, held within a data format descriptor. Each descriptor block has a common header that allows applications to identify and skip descriptor blocks that they do not understand, while continuing to process any other descriptor blocks that may be held in the data format descriptor.

Basic (data format) descriptor block The initial form of descriptor block as described in this standard. Where present, it must be the first descriptor block held in the data format descriptor. This descriptor block can describe a large number of common formats and may be the only type of descriptor block that many portable applications will need to support.

Texel block The units described by the Basic Data Format Descriptor: a repeating element within bulk data. In simple texture formats, a texel block may describe a single pixel. In formats with subsampled channels, the texel block may describe several pixels. In a block-based compressed texture, the texel block typically describes the compression block unit. The basic descriptor block supports texel blocks of up to four dimensions.

Sample In this standard, texel blocks are considered to be composed of contiguous bit patterns with a single channel or component type and a single spatial location. A typical ARGB pixel has four samples, one for each channel, held at the same coordinate. A texel block from a Bayer sensor might have a different location for different channels, and may have multiple samples representing the same channel at multiple locations. A Y’CBCR buffer with downsampled chroma may have more luma samples than chroma samples, each at different locations.

Plane In some formats, a texel block is not contiguous in memory. In a two-dimensional texture, the texel block may be spread across multiple scan lines, or channels may be stored independently. The basic format descriptor block defines a texel block as being made of a number of concatenated bits which may come from different regions of memory, where each region is considered a separate plane. For common formats, it is sufficient to require that the contribution from each plane is an integer number of bytes. This specification places no requirements on the ordering of planes in memory — the plane locations are described outside the format. This allows support for multiplanar formats which have proprietary padding requirements that are hard to accommodate in a more terse representation.

In many existing APIs, planes may be “downsampled” differently. For example, in these APIs, a Y’CBCR (colloquially “YUV”) 4:2:0 buffer as in Table 4, “Possible memory representation of a 4×4 Y’CBCR 4:2:0 buffer (numbers are byte offsets)”, would typically be represented with three planes (Table 5, “Plane descriptors for the above Y’CBCR format buffer in a conventional API”), one for each channel, with the luma (Y') plane containing four times as many pixels as the chroma (CB and CR) planes, and with two horizontal lines of the luma held within the same plane for each horizontal line of the chroma planes.

Table 4. Possible memory representation of a 4×4 Y’CBCR 4:2:0 buffer (numbers are byte offsets)

Y' channel
|  0 |  1 |  2 |  3 |
|  4 |  5 |  6 |  7 |
|  8 |  9 | 10 | 11 |
| 12 | 13 | 14 | 15 |

CB channel
| 16 | 17 |
| 18 | 19 |

CR channel
| 20 | 21 |
| 22 | 23 |


Table 5. Plane descriptors for the above Y’CBCR format buffer in a conventional API

| Y' plane | offset 0  | byte stride 4 | downsample 1×1 |
| CB plane | offset 16 | byte stride 2 | downsample 2×2 |
| CR plane | offset 20 | byte stride 2 | downsample 2×2 |


This approach does not extend logically to more complex formats such as a Bayer grid. Therefore in this specification, we would instead define the luma channel as in Table 6, “Plane descriptors for the above Y’CBCR format buffer using this standard”, using two planes, vertically interleaved (in a linear mapping between addresses and samples) by the selection of a suitable offset and line stride, with each line of luma samples contiguous in memory. Only one plane is used for each of the chroma channels (or one plane collectively if the chroma samples are stored adjacently).

Table 6. Plane descriptors for the above Y’CBCR format buffer using this standard

| Y' plane 1 | offset 0  | byte stride 8 | plane bytes 2 |
| Y' plane 2 | offset 4  | byte stride 8 | plane bytes 2 |
| CB plane   | offset 16 | byte stride 2 | plane bytes 1 |
| CR plane   | offset 20 | byte stride 2 | plane bytes 1 |
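
To make the plane descriptors of Table 6 concrete, the following C sketch computes where each plane's contribution to a given 2×2 texel block starts in the buffer of Table 4. It is an illustration only, assuming a linear layout in which horizontally adjacent texel blocks are packed without gaps; the structure and function names are hypothetical and are not defined by this specification.

```c
#include <assert.h>
#include <stddef.h>

/* One plane as listed in Table 6; all values are in bytes. */
struct plane_desc {
    size_t offset;       /* start of the plane within the buffer */
    size_t byte_stride;  /* step between vertically adjacent texel blocks */
    size_t plane_bytes;  /* bytes this plane contributes per texel block */
};

/* The four planes of Table 6, in table order. */
static const struct plane_desc table6_planes[4] = {
    {  0, 8, 2 },  /* Y' plane 1: upper luma line of each block */
    {  4, 8, 2 },  /* Y' plane 2: lower luma line of each block */
    { 16, 2, 1 },  /* CB plane */
    { 20, 2, 1 },  /* CR plane */
};

/*
 * Byte offset of this plane's data for the texel block at block
 * coordinate (bx, by). Assumes blocks on one line are contiguous,
 * so the horizontal step equals plane_bytes.
 */
static size_t plane_block_offset(const struct plane_desc *p,
                                 size_t bx, size_t by)
{
    assert(p != NULL);
    return p->offset + by * p->byte_stride + bx * p->plane_bytes;
}
```

For the texel block at block coordinate (1,1), covering pixels (2,2) to (3,3), this yields byte offsets 10 and 14 for the two luma planes, 19 for CB and 23 for CR, which agrees with the byte offsets shown in Table 4.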


The same approach can be used to represent a static interlaced image, with a texel block consisting of two planes, one per field. This mechanism is all that is required to represent a static image without downsampled channels; however, correct reconstruction of interlaced, downsampled color difference formats (such as Y’CBCR), which typically involves interpolation of the nearest chroma samples in a given field rather than the whole frame, is beyond the remit of this specification. There are many proprietary and often heuristic approaches to sample reconstruction, particularly for Bayer-like formats and for multi-frame images, and it is not practical to document them here.

An API that wishes to make use of the Khronos Data Format Specification is not expected to use this specification’s representation internally: reconstructing downsampling information from this standard’s representation in order to revert to the more conventional representation should be trivial if required.

There is no requirement that the number of bytes occupied by the texel block be the same in each plane. The descriptor defines the number of bytes that the texel block occupies in each plane, which for most formats is sufficient to allow access to consecutive elements. For a two-dimensional data structure, it is up to the controlling interface to resolve byte stride between consecutive lines. For a three-dimensional structure, the controlling API may need to add a level stride. Since these strides are determined by the data size and architecture alignment requirements, they are not considered to be part of the format.

3. Required concepts not in the “format”

This specification encodes how atomic data should be interpreted in a manner which is independent of the layout and dimensionality of the collective data. Collections of data may have a “compatible format” in that their format descriptor may be identical, yet be different sizes. Some additional information is therefore expected to be recorded alongside the “format description”.

The API which controls the bulk data is responsible for controlling which memory location corresponds to the indexing mechanism chosen. A texel block has the concept of a coordinate offset within the block, which implies that if the data is accessed in terms of spatial coordinates, a texel block has spatial locality as well as referring to contiguous memory (per plane). For texel blocks which represent only a single spatial location, this is irrelevant; for block-based compression, for formats with downsampled channels, or for Bayer-like formats, the texel block represents a finite extent in up to four dimensions. However, the mapping from coordinate system to the memory location containing a texel block is beyond the scope of this specification.

The minimum requirement for accessing a linearly-addressed buffer is to store the start address and a stride (typically in bytes) between texels in each dimension of the buffer, for each plane contributing to the texel block. For the first dimension, the memory stride between texels may simply be the byte size of the texel block in that plane, which implies that there are no gaps between texel blocks. For other dimensions, the stride is a function of the size of the data structure being represented: for example, in a compact representation of a two-dimensional buffer, the texel block at coordinate (x,y+1) might be found at the address of coordinate (x,y) plus the buffer width multiplied by the texel size in bytes. Similarly, in a three-dimensional buffer, the address of the pixel at (x,y,z+1) may be at the address of (x,y,z) plus the byte size of a two-dimensional slice of the texture. In practice, even linear layouts may have padding, and often more complex relationships between coordinates and memory location are used to encourage locality of reference. The details of all of these data structures are beyond the remit of this specification.
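
The start-plus-stride addressing described above can be sketched in C as follows. This is a minimal illustration for one plane of a compact, unpadded linear buffer; the type and function names are hypothetical, extents are measured in texel blocks, and real layouts may add padding or use a non-linear mapping.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DIMS 4  /* the basic descriptor block supports up to four dimensions */

/* Addressing information for one plane, held outside the format descriptor. */
struct linear_plane {
    size_t start;             /* byte offset of the first texel block */
    size_t stride[MAX_DIMS];  /* byte step between texel blocks per dimension */
};

/* Strides for a compact layout with no padding between blocks or lines. */
static struct linear_plane compact_plane(size_t start, size_t block_bytes,
                                         const size_t extent[MAX_DIMS])
{
    struct linear_plane plane;
    size_t stride = block_bytes;

    assert(extent != NULL);
    plane.start = start;
    for (int d = 0; d < MAX_DIMS; d++) {
        plane.stride[d] = stride;  /* dimension 0: block size; 1: one line; ... */
        stride *= extent[d];
    }
    return plane;
}

/* Byte offset of the texel block at the given coordinate. */
static size_t texel_block_offset(const struct linear_plane *plane,
                                 const size_t coord[MAX_DIMS])
{
    size_t offset;

    assert(plane != NULL && coord != NULL);
    offset = plane->start;
    for (int d = 0; d < MAX_DIMS; d++)
        offset += coord[d] * plane->stride[d];
    return offset;
}
```

With a buffer of 8×4×2 texel blocks of four bytes each, stepping from (x,y) to (x,y+1) advances by 8 × 4 = 32 bytes, and stepping from z to z+1 advances by the 128-byte size of one two-dimensional slice.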

Most simple formats contain a single plane of data. Those formats which require additional planes compared with a conventional representation are typically downsampled Y’CBCR formats, which already have the concept of separate storage for different color channels. While this specification uses multiple planes to describe texel blocks that span multiple scan lines if the data is disjoint, there is no expectation that the API using the data formats needs to maintain this representation — interleaved planes should be easy to identify and coalesce if the API requires a more conventional representation of downsampled formats.

Some image representations are composed of tiles of texels which are held contiguously in memory, with the texels within the tile stored in some order that improves locality of reference for multi-dimensional access. This is a common approach to improve memory efficiency when texturing. While it is possible to represent such a tile as a large texel block (up to the maximum representable texel block size in this specification), this is unlikely to be an efficient approach, since a large number of samples will be needed and the layout of a tile usually has a very limited number of possibilities. In most cases, the layout of texels within the tile should be described by whatever interface is aware of image-specific information such as size and stride, and only the format of the texels should be described by a format descriptor.

The complication to this is where texel blocks larger than a single pixel are themselves encoded using proprietary tiling. The spatial layout of samples within a texel block is required to be fixed in the basic format descriptor — for example, if the texel block size is 2×2 pixels, the top left pixel might always be expected to be in the first byte in that texel block. In some proprietary memory tiling formats, such as ones that store small rectangular blocks in raster order in consecutive bytes or in Morton order, this relationship may be preserved, and the only proprietary operation is finding the start of the texel block. In other proprietary layouts such as Hilbert curve order, or when the texel block size does not divide the tiling size, a direct representation of memory may be impossible. In these cases, it is likely that this data format standard would be used to describe the data as it would be seen in a linear format, and the mapping from coordinates to memory would have to be hidden in proprietary translation. As a logical format description, this is unlikely to be critical, since any software which accesses such a layout will necessarily need proprietary knowledge anyway.

4. Translation to API-specific representations

The data format container described here is too unwieldy to be expected to be used directly in most APIs. The expectation is that APIs and users will define data descriptors in memory, but have API-specific names for the formats that the API supports. If these names are enumeration values, a mapping can be provided by having an array of pointers to the data descriptors, indexed by the enumeration. It may commonly be necessary to provide API-specific supplementary information in the same array structure, particularly where the API natively associates concepts with the data which are not uniquely associated with the content.

In this approach, it is likely that an API would predefine a number of common data formats which are natively supported. If there is a desire to support dynamic creation of data formats, this array could be made extensible with a manager returning handles.

Even where an API supports only a fixed set of formats, it is useful to provide a comparison with user-provided format descriptors in order to establish whether a format is compatible.
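The array-of-pointers mapping and the comparison with a user-provided descriptor might look like the sketch below. Everything named myapi_* is hypothetical, as is the "renderable" field standing in for API-specific supplementary information. The sketch assumes descriptors are held in native endianness as arrays of uint32_t whose first word is total_size, and that total_size counts its own four bytes. Byte-for-byte equality is used as the test, which is stricter than a real compatibility check would need to be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical API-specific format names. */
typedef enum {
    MYAPI_FORMAT_RGB8 = 0,
    MYAPI_FORMAT_RGBA8,
    MYAPI_FORMAT_COUNT
} myapi_format;

/* One table entry per enumerant: the descriptor plus API-specific extras. */
typedef struct {
    const uint32_t *descriptor;  /* first word holds total_size in bytes */
    int             renderable;  /* hypothetical supplementary information */
} myapi_format_info;

/* Return the index of the entry whose descriptor is byte-identical to
 * user_desc, or -1 if there is none. */
static int myapi_find_format(const myapi_format_info *table, size_t count,
                             const uint32_t *user_desc)
{
    assert(table != NULL && user_desc != NULL);
    for (size_t i = 0; i < count; ++i) {
        const uint32_t *d = table[i].descriptor;
        /* Compare sizes first so memcmp never reads past either buffer. */
        if (d != NULL && d[0] == user_desc[0] &&
            memcmp(d, user_desc, d[0]) == 0)
            return (int)i;
    }
    return -1;
}
```

An API that supports dynamically created formats could grow the same table at run time and hand out the index as a handle.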

5. Data format descriptor

The layout of the data structures described here is assumed to be little-endian for the purposes of data transfer, but may be implemented in the natural endianness of the platform for internal use.

The data format descriptor consists of a contiguous area of memory, as shown in Table 7, “Data Format Descriptor layout”, divided into one or more descriptor blocks, as in Table 8, “Data format descriptor header and descriptor blocks” which are tagged by the type of descriptor that they contain. The size of the data format descriptor varies according to its content.

Table 7. Data Format Descriptor layout

uint32_t          total_size
Descriptor block  first descriptor
Descriptor block  second descriptor (optional) etc.


The total_size field, measured in bytes, allows the full format descriptor to be copied without need for details of the descriptor to be interpreted.
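A minimal sketch of such an uninterpreted copy follows. It assumes that total_size counts its own four bytes, which this section does not state explicitly, and that the descriptor is in its little-endian transfer form. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Read a little-endian uint32_t, independent of host endianness. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Duplicate a whole data format descriptor without looking inside any
 * descriptor block. Returns NULL on failure; the caller frees the result. */
static uint8_t *dfd_clone(const uint8_t *dfd)
{
    assert(dfd != NULL);
    uint32_t total_size = read_le32(dfd);
    if (total_size < 4u)          /* too small to hold total_size itself */
        return NULL;
    uint8_t *copy = malloc(total_size);
    if (copy != NULL)
        memcpy(copy, dfd, total_size);
    return copy;
}
```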

Table 8. Data format descriptor header and descriptor blocks

total_size

Descriptor block 1

Descriptor block 2

:


6. Descriptor block

Each Descriptor Block has the same prefix, shown in Table 9, “Descriptor Block layout”.

Table 9. Descriptor Block layout

uint32_t  vendor_id | (descriptor_type << 16)
uint32_t  version_number | (descriptor_block_size << 16)
Format-specific data


The vendor_id is a 16-bit value uniquely assigned to organizations, allocated by Khronos; ID 0 is used to identify Khronos itself. The ID 0xFFFF is reserved for internal use which is guaranteed not to clash with third-party implementations; this ID should not be shipped in libraries to avoid conflicts with development code.

The descriptor_type is a unique identifier defined by the vendor to distinguish between potential data representations.

The version_number is vendor-defined, and intended to allow for backwards-compatible updates to existing descriptor blocks.

The descriptor_block_size indicates the size in bytes of this Descriptor Block, remembering that there may be multiple Descriptor Blocks within one container, as shown in Table 10, “Data format descriptor header and descriptor block headers”. The descriptor_block_size therefore gives the offset between the start of the current Descriptor Block and the start of the next — so the size includes the vendor_id, descriptor_type, version_number and descriptor_block_size fields, which collectively contribute 8 bytes.

Having an explicit descriptor_block_size allows implementations to skip a descriptor block whose format is unknown, allowing known data to be interpreted and unknown information to be ignored. Some descriptor block types may not be of a uniform size, and may vary according to the content within.
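The skipping behavior can be sketched as a loop that decodes each block's two header words and advances by descriptor_block_size until it finds the block it understands. The sketch assumes that the first descriptor block starts immediately after total_size and that total_size counts its own four bytes; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t dfd_read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

typedef struct {
    uint16_t vendor_id;
    uint16_t descriptor_type;
    uint16_t version_number;
    uint16_t descriptor_block_size;
} dfd_block_header;

/* Decode the two 32-bit words that prefix every descriptor block. */
static dfd_block_header dfd_decode_header(const uint8_t *block)
{
    uint32_t w0 = dfd_read_le32(block);
    uint32_t w1 = dfd_read_le32(block + 4);
    dfd_block_header h;
    h.vendor_id             = (uint16_t)(w0 & 0xFFFFu);
    h.descriptor_type       = (uint16_t)(w0 >> 16);
    h.version_number        = (uint16_t)(w1 & 0xFFFFu);
    h.descriptor_block_size = (uint16_t)(w1 >> 16);
    return h;
}

/* Return a pointer to the first block with the given vendor and type,
 * skipping unknown blocks by their size; NULL if absent or malformed. */
static const uint8_t *dfd_find_block(const uint8_t *dfd, uint16_t vendor_id,
                                     uint16_t descriptor_type)
{
    assert(dfd != NULL);
    uint32_t total_size = dfd_read_le32(dfd);
    uint32_t offset = 4u;                      /* skip total_size */
    while (total_size >= 8u && offset <= total_size - 8u) {
        dfd_block_header h = dfd_decode_header(dfd + offset);
        if (h.descriptor_block_size < 8u ||
            h.descriptor_block_size > total_size - offset)
            return NULL;                       /* malformed block size */
        if (h.vendor_id == vendor_id && h.descriptor_type == descriptor_type)
            return dfd + offset;
        offset += h.descriptor_block_size;     /* skip a block we do not know */
    }
    return NULL;
}
```

Calling dfd_find_block(dfd, 0, 0) would look for a block with the Khronos vendor_id and descriptor_type 0 while ignoring any vendor-specific blocks that precede it.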

This specification initially describes only one type of descriptor block. Future revisions may define additional descriptor block types for additional applications — for example, to describe data with a large number of channels or pixels described in an arbitrary color space. Vendors can also implement proprietary descriptor blocks to hold vendor-specific information within the standard Descriptor.

Table 10. Data format descriptor header and descriptor block headers

total_size

vendor_id | (descriptor_type << 16)

version_number | (descriptor_block_size << 16)

:

vendor_id | (descriptor_type << 16)

version_number | (descriptor_block_size << 16)

:


7. Khronos Basic Data Format Descriptor Block

One basic descriptor block, shown in Table 11, “Basic Data Format Descriptor layout” is intended to cover a large amount of metadata that is typically associated with common bulk data — most notably image or texture data. While this descriptor contains more information about the data interpretation than is needed by many applications, having a relatively comprehensive descriptor reduces the risk that metadata needed by different APIs will be lost in translation.

The format is described in terms of a repeating axis-aligned texel block composed of samples. Each sample contains a single channel of information with a single spatial offset within the texel block, and consists of an amount of contiguous data. This descriptor block consists of information about the interpretation of the texel block as a whole, supplemented by a description of a number of samples taken from one or more planes of contiguous memory. For example, a 24-bit red/green/blue format may be described as a 1×1 pixel region, containing three samples, one of each color, in one plane. A Y’CBCR 4:2:0 format may consist of a repeating 2×2 region consisting of four Y' samples and one sample each of CB and CR.

Table 11. Basic Data Format Descriptor layout

Byte 0 (LSB)             Byte 1                   Byte 2                   Byte 3 (MSB)
0 (vendor_id)                                     0 (descriptor_type)
0 (version number)                                24 + 16 times number of samples (descriptor_block_size)
color_model              color_primaries          transfer_function        flags
texel_block_dimension_0  texel_block_dimension_1  texel_block_dimension_2  texel_block_dimension_3
bytes_plane_0            bytes_plane_1            bytes_plane_2            bytes_plane_3
bytes_plane_4            bytes_plane_5            bytes_plane_6            bytes_plane_7
Sample information for first sample
Sample information for second sample (optional), etc.


The fields of the Basic Data Format Descriptor Block are described in the following sections.

7.1. vendor_id

The vendor_id for the Basic Data Format Descriptor Block is 0, defined as KHR_DF_VENDORID_KHRONOS in khr_df_vendorid_e.

7.2. descriptor_type

The descriptor_type for the Basic Data Format Descriptor Block is 0, a value reserved in khr_df_khr_descriptortype_e (the enumeration of Khronos-specific descriptor types) as KHR_DF_KHR_DESCRIPTORTYPE_BASICFORMAT.

7.3. version_number

The version_number relating to the Basic Data Format Descriptor Block as described in this specification is 1.

7.4. descriptor_block_size

The size of the Basic Data Format Descriptor Block depends on the number of samples contained within it. The memory requirements for this format are 24 bytes of shared data plus 16 bytes per sample. The descriptor_block_size is measured in bytes.
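The size arithmetic can be sketched as follows. The function names are hypothetical; the only facts used are the 24 bytes of shared data, the 16 bytes per sample, and the placement of descriptor_block_size in the upper 16 bits of the second header word.

```c
#include <assert.h>
#include <stdint.h>

enum {
    BASIC_BLOCK_SHARED_BYTES = 24,  /* header words plus shared fields */
    BASIC_BLOCK_SAMPLE_BYTES = 16   /* per-sample information */
};

/* descriptor_block_size for a basic descriptor block with num_samples samples. */
static uint32_t basic_block_size(uint32_t num_samples)
{
    /* The result has to fit in the 16-bit descriptor_block_size field. */
    assert(num_samples <= (0xFFFFu - BASIC_BLOCK_SHARED_BYTES) / BASIC_BLOCK_SAMPLE_BYTES);
    return BASIC_BLOCK_SHARED_BYTES + BASIC_BLOCK_SAMPLE_BYTES * num_samples;
}

/* Inverse: number of samples implied by a block size, or -1 if the size is
 * not of the form 24 + 16 * n. */
static int basic_block_sample_count(uint32_t descriptor_block_size)
{
    if (descriptor_block_size < BASIC_BLOCK_SHARED_BYTES ||
        (descriptor_block_size - BASIC_BLOCK_SHARED_BYTES) % BASIC_BLOCK_SAMPLE_BYTES != 0)
        return -1;
    return (int)((descriptor_block_size - BASIC_BLOCK_SHARED_BYTES) / BASIC_BLOCK_SAMPLE_BYTES);
}

/* Second header word: version_number | (descriptor_block_size << 16). */
static uint32_t pack_version_and_size(uint16_t version_number, uint32_t num_samples)
{
    return (uint32_t)version_number | (basic_block_size(num_samples) << 16);
}
```

For the examples given earlier, a 24-bit red/green/blue format with three samples needs a 72-byte block, and a Y’CBCR 4:2:0 format with six samples needs 120 bytes.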

7.5. color_model

The color_model determines the set of color (or other data) channels which may be encoded within the data, though there is no requirement that all of the possible channels from the color_model be present. Most data fits into a small number of common color models, but compressed texture formats each have their own color model enumeration. Note that the data need not actually represent a color; this is just the most common type of content using this descriptor. Some standards use the term “color container” for this concept.

The available color models are described in the khr_df_model_e enumeration, and are represented as an unsigned 8-bit value.

Note that the numbering of the component channels is chosen such that those channel types which are common across multiple color models have the same enumeration value. That is, alpha is always encoded as channel ID 15, depth is always encoded as channel ID 14, and stencil is always encoded as channel ID 13. Luma/Luminance is always in channel ID 0. This numbering convention is intended to simplify code which can process a range of color models. Note that there is no guarantee that models which do not support these channels will not use this channel ID. Particularly, RGB formats do not have luma in channel 0, and a 16-channel undefined format is not obligated to represent alpha in any way in channel number 15.
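The kind of model-independent code this numbering is meant to enable might look like the sketch below, which locates the alpha sample for any uncompressed model that defines alpha as channel ID 15. The function names are hypothetical, the caller is assumed to supply one channel ID per sample with the ID in the low four bits, and the set of models is taken from the channel tables in this section (color_model values 1 to 13, except XYZW at 6).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CHANNEL_STENCIL = 13, CHANNEL_DEPTH = 14, CHANNEL_ALPHA = 15 };

/* Nonzero for the color models whose channel tables list alpha as ID 15. */
static int model_defines_alpha_at_15(uint8_t color_model)
{
    return color_model >= 1 && color_model <= 13 && color_model != 6;
}

/* Index of the first alpha sample, or -1 if the model does not define
 * alpha at channel ID 15 or no sample uses it. */
static int find_alpha_sample(uint8_t color_model,
                             const uint8_t *channel_ids, size_t num_samples)
{
    assert(channel_ids != NULL || num_samples == 0);
    if (!model_defines_alpha_at_15(color_model))
        return -1;   /* ID 15 may mean something else, or nothing at all */
    for (size_t i = 0; i < num_samples; ++i) {
        if ((channel_ids[i] & 0x0Fu) == CHANNEL_ALPHA)
            return (int)i;
    }
    return -1;
}
```

The explicit model check reflects the caveat above: a model that does not support alpha gives no guarantee about what channel ID 15 holds.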

The value of each enumerant is shown in parentheses following the enumerant name.

KHR_DF_MODEL_UNSPECIFIED (= 0)

When the data format is unknown or does not fall into a predefined category, utilities which perform automatic conversion based on an interpretation of the data cannot operate on it. This format should be used when there is no expectation of portable interpretation of the data using only the basic descriptor block.

For portability reasons, it is recommended that pixel-like formats with up to sixteen channels, but which cannot have those channels described in the basic block, be represented with a basic descriptor block with the appropriate number of samples from UNSPECIFIED channels, and then for the channel description to be stored in an extension block. This allows software which understands only the basic descriptor to be able to perform operations that depend only on channel location, not channel interpretation (such as image cropping). For example, a camera may store a raw format taken with a modified Bayer sensor, with “RGBW” (red, green, blue and white) sensor sites, or “RGBE” (red, green, blue and “emerald”). Rather than trying to encode the exact color coordinates of each sample in the basic descriptor, these formats could be represented by a four-channel “UNSPECIFIED” model, with an extension block describing the interpretation of each channel.

KHR_DF_MODEL_RGBSDA (= 1)

This color model represents additive colors of three channels, nominally red, green and blue, supplemented by channels for alpha, depth and stencil, as shown in Table 12, “Basic Data Format RGBSDA channels”. Note that in many formats, depth and stencil are stored in a completely independent buffer, but there are formats for which integrating depth and stencil with color data makes sense.

Table 12. Basic Data Format RGBSDA channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_RGBSDA_RED           Red
1               KHR_DF_CHANNEL_RGBSDA_GREEN         Green
2               KHR_DF_CHANNEL_RGBSDA_BLUE          Blue
13              KHR_DF_CHANNEL_RGBSDA_STENCIL       Stencil
14              KHR_DF_CHANNEL_RGBSDA_DEPTH         Depth
15              KHR_DF_CHANNEL_RGBSDA_ALPHA         Alpha (transparency)


Portable representation of additive colors with more than three primaries requires an extension to describe the full color space of the channels present. There is no practical way to do this portably without taking significantly more space.

KHR_DF_MODEL_YUVSDA (= 2)

This color model represents color differences with three channels, nominally luma (Y') and two color-difference chroma channels, U (CB) and V (CR), supplemented by channels for alpha, depth and stencil, as shown in Table 13, “Basic Data Format YUVSDA channels”. These formats are distinguished by CB and CR being a delta between the Y' channel and the blue and red channels respectively, rather than requiring a full color matrix. The conversion between Y’CBCR and RGB color spaces is defined in this case by the choice of value in the color_primaries field as described in Section 20.1, “Y’CBCR color model”.

[Note]

Most single-channel luma/luminance monochrome data formats should select KHR_DF_MODEL_YUVSDA and use only the Y channel, unless there is a reason to do otherwise.

Table 13. Basic Data Format YUVSDA channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_YUVSDA_Y             Y/Y' (luma/luminance)
1               KHR_DF_CHANNEL_YUVSDA_CB            CB (alias for U)
1               KHR_DF_CHANNEL_YUVSDA_U             U (alias for CB)
2               KHR_DF_CHANNEL_YUVSDA_CR            CR (alias for V)
2               KHR_DF_CHANNEL_YUVSDA_V             V (alias for CR)
13              KHR_DF_CHANNEL_YUVSDA_STENCIL       Stencil
14              KHR_DF_CHANNEL_YUVSDA_DEPTH         Depth
15              KHR_DF_CHANNEL_YUVSDA_ALPHA         Alpha (transparency)


[Note]

Terminology for this color model is often abused. This model is based on the idea of creating a representation of monochrome light intensity as a weighted average of color channels, then calculating color differences by subtracting two of the color channels from this monochrome value. Proper names vary for each variant of the ensuing numbers, but “YUV” is colloquially used for all of them. In the television standards from which this terminology is derived, Y’CBCR is more formally used to describe the representation of these color differences. See Section 20.1, “Y’CBCR color model” for more detail.

KHR_DF_MODEL_YIQSDA (= 3)

This color model represents color differences with three channels, nominally luma (Y) and two color-difference chroma channels, I and Q, supplemented by channels for alpha, depth and stencil, as shown in Table 14, “Basic Data Format YIQSDA channels”. This format is distinguished by I and Q each requiring all three additive channels to evaluate. I and Q are derived from CB and CR by a 33-degree rotation.

Table 14. Basic Data Format YIQSDA channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_YIQSDA_Y             Y' (luma)
1               KHR_DF_CHANNEL_YIQSDA_I             I (in-phase)
2               KHR_DF_CHANNEL_YIQSDA_Q             Q (quadrature)
13              KHR_DF_CHANNEL_YIQSDA_STENCIL       Stencil
14              KHR_DF_CHANNEL_YIQSDA_DEPTH         Depth
15              KHR_DF_CHANNEL_YIQSDA_ALPHA         Alpha (transparency)


KHR_DF_MODEL_LABSDA (= 4)

This color model represents the ICC perceptually-uniform L*a*b* color space, combined with the option of an alpha channel, as shown in Table 15, “Basic Data Format LABSDA channels”.

Table 15. Basic Data Format LABSDA channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_LABSDA_L             L* (luma)
1               KHR_DF_CHANNEL_LABSDA_A             a*
2               KHR_DF_CHANNEL_LABSDA_B             b*
13              KHR_DF_CHANNEL_LABSDA_STENCIL       Stencil
14              KHR_DF_CHANNEL_LABSDA_DEPTH         Depth
15              KHR_DF_CHANNEL_LABSDA_ALPHA         Alpha (transparency)


KHR_DF_MODEL_CMYKA (= 5)

This color model represents secondary (subtractive) colors and the combined key (black) channel, along with alpha, as shown in Table 16, “Basic Data Format CMYKA channels”.

Table 16. Basic Data Format CMYKA channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_CMYKA_CYAN           Cyan
1               KHR_DF_CHANNEL_CMYKA_MAGENTA        Magenta
2               KHR_DF_CHANNEL_CMYKA_YELLOW         Yellow
3               KHR_DF_CHANNEL_CMYKA_KEY            Key/Black
15              KHR_DF_CHANNEL_CMYKA_ALPHA          Alpha (transparency)


KHR_DF_MODEL_XYZW (= 6)

This “color model” represents channel data used for coordinate values, as shown in Table 17, “Basic Data Format XYZW channels” — for example, as a representation of the surface normal in a bump map. Additional channels for higher-dimensional coordinates can be used by extending the channel number within the 4-bit limit of the channel_type field.

Table 17. Basic Data Format XYZW channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_XYZW_X               X
1               KHR_DF_CHANNEL_XYZW_Y               Y
2               KHR_DF_CHANNEL_XYZW_Z               Z
3               KHR_DF_CHANNEL_XYZW_W               W


KHR_DF_MODEL_HSVA_ANG (= 7)

This color model represents color differences with three channels, value (luminance or luma), saturation (distance from monochrome) and hue (dominant wavelength), supplemented by an alpha channel, as shown in Table 18, “Basic Data Format HSVA_ANG channels”. In this model, the hue relates to the angular offset on a color wheel.

Table 18. Basic Data Format HSVA_ANG channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_HSVA_ANG_VALUE       V (value)
1               KHR_DF_CHANNEL_HSVA_ANG_SATURATION  S (saturation)
2               KHR_DF_CHANNEL_HSVA_ANG_HUE         H (hue)
15              KHR_DF_CHANNEL_HSVA_ANG_ALPHA       Alpha (transparency)


KHR_DF_MODEL_HSLA_ANG (= 8)

This color model represents color differences with three channels, lightness (maximum intensity), saturation (distance from monochrome) and hue (dominant wavelength), supplemented by an alpha channel, as shown in Table 19, “Basic Data Format HSLA_ANG channels”. In this model, the hue relates to the angular offset on a color wheel.

Table 19. Basic Data Format HSLA_ANG channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_HSLA_ANG_LIGHTNESS   L (lightness)
1               KHR_DF_CHANNEL_HSLA_ANG_SATURATION  S (saturation)
2               KHR_DF_CHANNEL_HSLA_ANG_HUE         H (hue)
15              KHR_DF_CHANNEL_HSLA_ANG_ALPHA       Alpha (transparency)


KHR_DF_MODEL_HSVA_HEX (= 9)

This color model represents color differences with three channels, value (luminance or luma), saturation (distance from monochrome) and hue (dominant wavelength), supplemented by an alpha channel, as shown in Table 20, “Basic Data Format HSVA_HEX channels”. In this model, the hue is generated by interpolation between extremes on a color hexagon.

Table 20. Basic Data Format HSVA_HEX channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_HSVA_HEX_VALUE       V (value)
1               KHR_DF_CHANNEL_HSVA_HEX_SATURATION  S (saturation)
2               KHR_DF_CHANNEL_HSVA_HEX_HUE         H (hue)
15              KHR_DF_CHANNEL_HSVA_HEX_ALPHA       Alpha (transparency)


KHR_DF_MODEL_HSLA_HEX (= 10)

This color model represents color differences with three channels, lightness (maximum intensity), saturation (distance from monochrome) and hue (dominant wavelength), supplemented by an alpha channel, as shown in Table 21, “Basic Data Format HSLA_HEX channels”. In this model, the hue is generated by interpolation between extremes on a color hexagon.

Table 21. Basic Data Format HSLA_HEX channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_HSLA_HEX_LIGHTNESS   L (lightness)
1               KHR_DF_CHANNEL_HSLA_HEX_SATURATION  S (saturation)
2               KHR_DF_CHANNEL_HSLA_HEX_HUE         H (hue)
15              KHR_DF_CHANNEL_HSLA_HEX_ALPHA       Alpha (transparency)


KHR_DF_MODEL_YCGCOA (= 11)

This color model represents low-cost approximate color differences with three channels, nominally luma (Y) and two color-difference chroma channels, Cg (green/purple color difference) and Co (orange/cyan color difference), supplemented by a channel for alpha, as shown in Table 22, “Basic Data Format YCoCgA channels”.

Table 22. Basic Data Format YCoCgA channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_YCGCOA_Y             Y
1               KHR_DF_CHANNEL_YCGCOA_CG            Cg
2               KHR_DF_CHANNEL_YCGCOA_CO            Co
15              KHR_DF_CHANNEL_YCGCOA_ALPHA         Alpha (transparency)


KHR_DF_MODEL_YCCBCCRC (= 12)

This color model represents the “Constant luminance” Y'CC'BCC'RC color model defined as an optional representation in ITU-R BT.2020 and described in Section 20.2, “Y'CC'BCC'CR constant luminance color model”.

Table 23. Basic Data Format YCCBCCRC channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_YCCBCCRC_YC          YC (luminance)
1               KHR_DF_CHANNEL_YCCBCCRC_CBC         CBC
2               KHR_DF_CHANNEL_YCCBCCRC_CRC         CRC
13              KHR_DF_CHANNEL_YCCBCCRC_STENCIL     Stencil
14              KHR_DF_CHANNEL_YCCBCCRC_DEPTH       Depth
15              KHR_DF_CHANNEL_YCCBCCRC_ALPHA       Alpha (transparency)


KHR_DF_MODEL_ICTCP (= 13)

This color model represents the “Constant intensity ICTCP color model” defined as an optional representation in ITU-R BT.2100 and described in Section 20.3, “ICTCP constant intensity color model”.

Table 24. Basic Data Format ICTCP channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_ICTCP_I              I (intensity)
1               KHR_DF_CHANNEL_ICTCP_CT             CT
2               KHR_DF_CHANNEL_ICTCP_CP             CP
13              KHR_DF_CHANNEL_ICTCP_STENCIL        Stencil
14              KHR_DF_CHANNEL_ICTCP_DEPTH          Depth
15              KHR_DF_CHANNEL_ICTCP_ALPHA          Alpha (transparency)


KHR_DF_MODEL_CIEXYZ (= 14)

This color model represents channel data used to describe color coordinates in the CIE 1931 XYZ coordinate space, as shown in Table 25, “Basic Data Format CIE XYZ channels”.

Table 25. Basic Data Format CIE XYZ channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_CIEXYZ_X             X
1               KHR_DF_CHANNEL_CIEXYZ_Y             Y
2               KHR_DF_CHANNEL_CIEXYZ_Z             Z


KHR_DF_MODEL_CIEXYY (= 15)

This color model represents channel data used to describe chromaticity coordinates in the CIE 1931 xyY coordinate space, as shown in Table 26, “Basic Data Format CIE xyY channels”.

Table 26. Basic Data Format CIE xyY channels

Channel number  Name                                Description
0               KHR_DF_CHANNEL_CIEXYY_X             x
1               KHR_DF_CHANNEL_CIEXYY_YCHROMA       y
2               KHR_DF_CHANNEL_CIEXYY_YLUMA         Y


Compressed formats

A number of compressed formats are supported as part of khr_df_model_e. In general, these formats will have a texel block dimension of the compression block size. Most contain a single sample of channel type 0 at offset 0,0 — where further samples are required, they should also be sited at 0,0. By convention, models which have multiple channels that are disjoint in memory have these channel locations described accurately.

The ASTC family of formats have a number of possible channels, and are distinguished by samples which reference some set of these channels. The texel_block_dimensions field determines the compression ratio for ASTC.

Floating-point compressed formats have lower and upper limits specified in floating point format. Integer compressed formats with lower and upper limits of 0 and UINT32_MAX (for unsigned formats) or INT32_MIN and INT32_MAX (for signed formats) are assumed to map the full representable range to 0..1 or -1..1 respectively.

KHR_DF_MODEL_DXT1A/KHR_DF_MODEL_BC1A (= 128)

This model represents the DXT1 or BC1 format. Channel 0 indicates color. If a second sample is present it should use channel 1 to indicate that the “special value” of the format should represent transparency — otherwise the “special value” represents opaque black.

KHR_DF_MODEL_DXT2/3/KHR_DF_MODEL_BC2 (= 129)

This model represents the DXT2/3 format, also described as BC2. The alpha premultiplication state (the distinction between DXT2 and DXT3) is recorded separately in the descriptor. This model has two channels: ID 0 contains the color information and ID 15 contains the alpha information. The alpha channel is 64 bits and at offset 0; the color channel is 64 bits and at offset 64. No attempt is made to describe the 16 alpha samples for this position independently, since understanding the other channels for any pixel requires the whole texel block.

KHR_DF_MODEL_DXT4/5/KHR_DF_MODEL_BC3 (= 130)

This model represents the DXT4/5 format, also described as BC3. The alpha premultiplication state (the distinction between DXT4 and DXT5) is recorded separately in the descriptor. This model has two channels: ID 0 contains the color information and ID 15 contains the alpha information. The alpha channel is 64 bits and at offset 0; the color channel is 64 bits and at offset 64.

KHR_DF_MODEL_BC4 (= 131)

This model represents the Direct3D BC4 format for single-channel interpolated 8-bit data. The model has a single channel of id 0 with offset 0 and length 64 bits.

KHR_DF_MODEL_BC5 (= 132)

This model represents the Direct3D BC5 format for dual-channel interpolated 8-bit data. The model has two channels, 0 (red) and 1 (green), which should have their bit depths and offsets independently described: the red channel has offset 0 and length 64 bits and the green channel has offset 64 and length 64 bits.

KHR_DF_MODEL_BC6H (= 133)

This model represents the Direct3D BC6H format for RGB floating-point data. The model has a single channel 0, representing all three channels, and occupying 128 bits.

KHR_DF_MODEL_BC7 (= 134)

This model represents the Direct3D BC7 format for RGBA data. This model has a single channel 0 of 128 bits.
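The sample layouts stated above for BC2 through BC7 can be collected into a small table, as in this sketch. The struct and function names are hypothetical and the table is not the sample encoding used by the descriptor itself; it only records the channel IDs, bit offsets and bit lengths given in the text. BC1 is omitted because no bit sizes are stated for it here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  color_model;  /* khr_df_model_e value */
    uint8_t  channel_id;
    uint16_t bit_offset;
    uint16_t bit_length;
} bc_sample_layout;

static const bc_sample_layout bc_layouts[] = {
    { 129, 15,  0,  64 },  /* BC2 (DXT2/3) alpha */
    { 129,  0, 64,  64 },  /* BC2 (DXT2/3) color */
    { 130, 15,  0,  64 },  /* BC3 (DXT4/5) alpha */
    { 130,  0, 64,  64 },  /* BC3 (DXT4/5) color */
    { 131,  0,  0,  64 },  /* BC4 single channel */
    { 132,  0,  0,  64 },  /* BC5 red */
    { 132,  1, 64,  64 },  /* BC5 green */
    { 133,  0,  0, 128 },  /* BC6H combined RGB */
    { 134,  0,  0, 128 },  /* BC7 combined RGBA */
};

/* Number of bits covered by the samples listed for a model, or 0 if the
 * model is not in the table. */
static unsigned bc_block_bits(uint8_t color_model)
{
    unsigned end = 0;
    for (size_t i = 0; i < sizeof bc_layouts / sizeof bc_layouts[0]; ++i) {
        if (bc_layouts[i].color_model == color_model) {
            unsigned e = (unsigned)bc_layouts[i].bit_offset + bc_layouts[i].bit_length;
            if (e > end)
                end = e;
        }
    }
    assert(end <= 128u);  /* no block described here exceeds 128 bits */
    return end;
}
```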

KHR_DF_MODEL_ETC1 (= 160)

This model represents the original Ericsson Texture Compression format, with a guarantee that the format does not rely on ETC2 extensions. It contains a single channel of RGB data.

KHR_DF_MODEL_ETC2 (= 161)

This model represents the updated Ericsson Texture Compression format, ETC2. Channel 0 represents red, and is used for the R11 EAC format. Channel 1 represents green, and both red and green should be present for the RG11 EAC format. Channel 2 represents RGB combined content. Channel 15 indicates the presence of alpha. If the texel block size is 8 bytes and the RGB and alpha channels are co-sited, “punch through” alpha is supported. If the texel block size is 16 bytes and the alpha channel appears in the first 8 bytes, followed by 8 bytes for the RGB channel, 8-bit separate alpha is supported.
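The rules above can be read as a small decision procedure, sketched here. The enum, struct and function names are hypothetical, and the per-sample record is reduced to a channel ID and a bit offset. Combinations that the text does not describe are reported as unknown rather than guessed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    ETC2_UNKNOWN,
    ETC2_R11,
    ETC2_RG11,
    ETC2_RGB,
    ETC2_RGB_PUNCHTHROUGH_ALPHA,
    ETC2_RGBA8
} etc2_variant;

typedef struct {
    uint8_t  channel_id;  /* 0 = red, 1 = green, 2 = RGB, 15 = alpha */
    uint16_t bit_offset;
} etc2_sample;

static etc2_variant etc2_classify(const etc2_sample *samples, size_t n,
                                  unsigned block_bytes)
{
    int has_r = 0, has_g = 0, has_rgb = 0, has_a = 0;
    unsigned rgb_off = 0, a_off = 0;

    assert(samples != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        switch (samples[i].channel_id) {
        case 0:  has_r = 1; break;
        case 1:  has_g = 1; break;
        case 2:  has_rgb = 1; rgb_off = samples[i].bit_offset; break;
        case 15: has_a = 1;   a_off = samples[i].bit_offset;   break;
        default: return ETC2_UNKNOWN;
        }
    }
    if (has_rgb && !has_r && !has_g) {
        if (!has_a)
            return ETC2_RGB;
        if (block_bytes == 8 && a_off == rgb_off)
            return ETC2_RGB_PUNCHTHROUGH_ALPHA;   /* co-sited RGB and alpha */
        if (block_bytes == 16 && a_off == 0 && rgb_off == 64)
            return ETC2_RGBA8;                    /* alpha first, then RGB */
        return ETC2_UNKNOWN;
    }
    if (has_r && !has_rgb && !has_a)
        return has_g ? ETC2_RG11 : ETC2_R11;
    return ETC2_UNKNOWN;
}
```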

KHR_DF_MODEL_ASTC (= 162)

This model represents Adaptive Scalable Texture Compression as a single channel in a texel block of 16 bytes. ASTC HDR (high dynamic range) and LDR (low dynamic range) modes are distinguished by the channel_id containing the flag KHR_DF_SAMPLE_DATATYPE_FLOAT: an ASTC texture that is guaranteed by the user to contain only LDR-encoded blocks should have the channel_id KHR_DF_SAMPLE_DATATYPE_FLOAT bit clear, and an ASTC texture that may include HDR-encoded blocks should have the channel_id KHR_DF_SAMPLE_DATATYPE_FLOAT bit set to 1. ASTC supports a number of compression ratios defined by different texel block sizes; these are selected by changing the texel block size fields in the data format. The single sample has a size of 128 bits.

ASTC encoding is described in Section 15, “ASTC Compressed Texture Image Formats”.

7.6. color_primaries

It is not sufficient to define a buffer as containing, for example, additive primaries. Additional information is required to define what “red” is provided by the “red” channel. A full definition of primaries requires an extension which provides the full color space of the data, but a subset of common primary spaces can be identified by the khr_df_primaries_e enumeration, represented as an unsigned 8-bit integer value.

More information about color primaries is provided in Section 19, “Color primaries”.

KHR_DF_PRIMARIES_UNSPECIFIED (= 0)

This “set of primaries” identifies a data representation whose color representation is unknown or which does not fit into this list of common primaries. Having an “unspecified” value here precludes users of this data format from being able to perform automatic color conversion unless the primaries are defined in another way. Formats which require a proprietary color space — for example, raw data from a Bayer sensor that records the direct response of each filtered sample — can still indicate that samples represent “red”, “green” and “blue”, but should mark the primaries here as “unspecified” and provide a detailed description in an extension block.

KHR_DF_PRIMARIES_BT709 (= 1)

This value represents the Color Primaries defined by the ITU-R BT.709 specification and described in Section 19.1, “BT.709 color primaries”, which are also shared by sRGB.

RGB data is distinguished between BT.709 and sRGB by the Transfer Function. Conversion to and from BT.709 Y’CBCR (“YUV”) representation uses the color conversion matrix defined in the BT.709 specification, and described in the section called “BT.709 Y’CBCR conversion”, except in the case of sYCC (which can be distinguished by the use of the sRGB transfer function), in which case conversion to and from BT.709 Y’CBCR representation uses the color conversion matrix defined in the BT.601 specification, and described in the section called “BT.601 Y’CBCR conversion”. This is the preferred set of color primaries used by HDTV and sRGB, and likely a sensible default set of color primaries for common rendering operations.

KHR_DF_PRIMARIES_SRGB is provided as a synonym for KHR_DF_PRIMARIES_BT709.

KHR_DF_PRIMARIES_BT601_EBU (= 2)

This value represents the Color Primaries defined in the ITU-R BT.601 specification for standard-definition television, particularly for 625-line signals, and described in Section 19.2, “BT.601 625-line color primaries”. Conversion to and from BT.601 Y’CBCR (“YUV”) typically uses the color conversion matrix defined in the BT.601 specification and described in the section called “BT.601 Y’CBCR conversion”.

KHR_DF_PRIMARIES_BT601_SMPTE (= 3)

This value represents the Color Primaries defined in the ITU-R BT.601 specification for standard-definition television, particularly for 525-line signals, and described in Section 19.3, “BT.601 525-line color primaries”. Conversion to and from BT.601 Y’CBCR (“YUV”) typically uses the color conversion matrix defined in the BT.601 specification and described in the section called “BT.601 Y’CBCR conversion”.

KHR_DF_PRIMARIES_BT2020 (= 4)

This value represents the Color Primaries defined in the ITU-R BT.2020 specification for ultra-high-definition television and described in Section 19.4, “BT.2020 color primaries”. Conversion to and from BT.2020 Y’CBCR (“YUV”) uses the color conversion matrix defined in the BT.2020 specification and described in the section called “BT.2020 Y’CBCR conversion”.
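The choice of Y’CBCR conversion matrix implied by the entries above, including the sYCC exception noted under KHR_DF_PRIMARIES_BT709, can be sketched as a lookup. The enum and function names are hypothetical; the numeric values are the color_primaries values listed in this section and the value 2 given for KHR_DF_TRANSFER_SRGB. For the two BT.601 sets of primaries the text says the BT.601 matrix is "typically" used, so the result there is a default rather than a requirement.

```c
#include <stdint.h>

typedef enum {
    YCBCR_MATRIX_NONE,    /* no matrix implied by the basic descriptor */
    YCBCR_MATRIX_BT601,
    YCBCR_MATRIX_BT709,
    YCBCR_MATRIX_BT2020
} ycbcr_matrix;

enum {
    PRIMARIES_BT709       = 1,
    PRIMARIES_BT601_EBU   = 2,
    PRIMARIES_BT601_SMPTE = 3,
    PRIMARIES_BT2020      = 4,
    TRANSFER_SRGB         = 2
};

static ycbcr_matrix select_ycbcr_matrix(uint8_t color_primaries,
                                        uint8_t transfer_function)
{
    switch (color_primaries) {
    case PRIMARIES_BT709:
        /* sYCC: BT.709 primaries with the sRGB transfer function use the
         * BT.601 matrix. */
        return transfer_function == TRANSFER_SRGB ? YCBCR_MATRIX_BT601
                                                  : YCBCR_MATRIX_BT709;
    case PRIMARIES_BT601_EBU:
    case PRIMARIES_BT601_SMPTE:
        return YCBCR_MATRIX_BT601;
    case PRIMARIES_BT2020:
        return YCBCR_MATRIX_BT2020;
    default:
        return YCBCR_MATRIX_NONE;
    }
}
```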

KHR_DF_PRIMARIES_CIEXYZ (= 5)

This value represents the theoretical Color Primaries defined by the International Color Consortium for the ICC XYZ linear color space.

KHR_DF_PRIMARIES_ACES (= 6)

This value represents the Color Primaries defined for the Academy Color Encoding System and described in Section 19.7, “ACES color primaries”.

KHR_DF_PRIMARIES_ACESCC (= 7)

This value represents the Color Primaries defined for the Academy Color Encoding System compositor and described in Section 19.8, “ACEScc color primaries”.

KHR_DF_PRIMARIES_NTSC1953 (= 8)

This value represents the Color Primaries defined for the NTSC 1953 color television transmission standard and described in Section 19.5, “NTSC 1953 color primaries”.

KHR_DF_PRIMARIES_PAL525 (= 9)

This value represents the Color Primaries defined for 525-line PAL signals, described in Section 19.6, “PAL 525-line analog color primaries”.

KHR_DF_PRIMARIES_DISPLAYP3 (= 10)

This value represents the Color Primaries defined for the Display P3 color space, described in Section 19.9, “Display P3 color primaries”.

KHR_DF_PRIMARIES_ADOBERGB (= 11)

This value represents the Color Primaries defined in Adobe RGB (1998), described in Section 19.10, “Adobe RGB (1998) color primaries”.

7.7. transfer_function

Many color representations contain a non-linear transfer function which maps between a linear (intensity-based) representation and a more perceptually-uniform encoding. Common transfer functions are encoded in the khr_df_transfer_e enumeration and represented as an unsigned 8-bit integer. A fully-flexible transfer function requires an extension with a full color space definition. Where the transfer function can be described as a simple power curve, applying the function is commonly known as “gamma correction”. The transfer function is applied to a sample only when the sample’s KHR_DF_SAMPLE_DATATYPE_LINEAR bit is 0; if this bit is 1, the sample is represented linearly irrespective of the transfer_function.

When a color model contains more than one channel in a sample and the transfer function should be applied only to a subset of those channels, the convention of that model should be used when applying the transfer function. For example, ASTC stores both alpha and RGB data but is represented by a single sample; in ASTC, any sRGB transfer function is not applied to the alpha channel of the ASTC texture. In this case, the KHR_DF_SAMPLE_DATATYPE_LINEAR bit being zero means that the transfer function is “applied” to the ASTC sample in a way that only affects the RGB channels. This is not a concern for most color models, which explicitly store different channels in each sample.

If all the samples are linear, KHR_DF_TRANSFER_LINEAR should be used. In this case, no sample should have the KHR_DF_SAMPLE_DATATYPE_LINEAR bit set.
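The interaction between transfer_function and the per-sample KHR_DF_SAMPLE_DATATYPE_LINEAR bit can be sketched as below. The bit value 0x10 is an assumption, since this section does not give the bit's position, and the function names are hypothetical. Each element of sample_flags is assumed to be the per-sample byte that carries the data type flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_DATATYPE_LINEAR 0x10u  /* assumed bit position */

enum { TRANSFER_UNSPECIFIED = 0, TRANSFER_LINEAR = 1 };

/* Transfer function that actually governs one sample: linear if the
 * sample's LINEAR bit is set, otherwise the descriptor's transfer_function. */
static uint8_t effective_transfer(uint8_t transfer_function, uint8_t sample_flags)
{
    if (sample_flags & SAMPLE_DATATYPE_LINEAR)
        return TRANSFER_LINEAR;
    return transfer_function;
}

/* Check the convention that a descriptor using KHR_DF_TRANSFER_LINEAR has
 * no sample with the LINEAR bit set. Returns nonzero when consistent. */
static int linear_flags_consistent(uint8_t transfer_function,
                                   const uint8_t *sample_flags, size_t n)
{
    assert(sample_flags != NULL || n == 0);
    if (transfer_function != TRANSFER_LINEAR)
        return 1;
    for (size_t i = 0; i < n; ++i) {
        if (sample_flags[i] & SAMPLE_DATATYPE_LINEAR)
            return 0;
    }
    return 1;
}
```

A typical use is an sRGB format with a linear alpha channel: the color samples leave the bit clear and are decoded with the sRGB transfer function, while the alpha sample sets the bit and is used as stored.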

The enumerant value for each of the following transfer functions is shown in parentheses alongside the title.

KHR_DF_TRANSFER_UNSPECIFIED (= 0)

This value should be used when the transfer function is unknown, or specified only in an extension block, precluding conversion of color spaces and correct filtering of the data values using only the information in the basic descriptor block.

KHR_DF_TRANSFER_LINEAR (= 1)

This value represents a linear transfer function: for color data, there is a linear relationship between numerical pixel values and the intensity of additive colors. This transfer function allows for blending and filtering operations to be applied directly to the data values.

KHR_DF_TRANSFER_SRGB (= 2)

This value represents the non-linear transfer function defined in the sRGB specification for mapping between numerical pixel values and intensity. This is described in Section 18.3, “sRGB transfer functions”.

KHR_DF_TRANSFER_ITU (= 3)

This value represents the non-linear transfer function defined by the ITU and used in the BT.601, BT.709 and BT.2020 specifications. This is described in Section 18.2, “ITU transfer functions”.

KHR_DF_TRANSFER_NTSC (= 4)

This value represents the non-linear transfer function defined by the original NTSC television broadcast specification. This is described in Section 18.8, “Legacy NTSC transfer functions”.

[Note]

More recent formulations of this transfer function, such as that defined in SMPTE 170M-2004, use the ITU formulation described above.

KHR_DF_TRANSFER_SLOG (= 5)

This value represents a nonlinear Transfer Function used by some Sony video cameras to represent an increased dynamic range, and is described in Section 18.13, “Sony S-Log transfer functions”.

KHR_DF_TRANSFER_SLOG2 (= 6)

This value represents a nonlinear Transfer Function used by some Sony video cameras to represent a further increased dynamic range, and is described in Section 18.14, “Sony S-Log2 transfer functions”.

KHR_DF_TRANSFER_BT1886 (= 7)

This value represents the nonlinear EOTF defined in BT.1886 and described in Section 18.4, “BT.1886 transfer functions”.

KHR_DF_TRANSFER_HLG_OETF (= 8)

This value represents the Hybrid Log Gamma OETF defined by the ITU in BT.2100 for high dynamic range television, and described in Section 18.5, “BT.2100 HLG transfer functions”.

KHR_DF_TRANSFER_HLG_EOTF (= 9)

This value represents the Hybrid Log Gamma EOTF defined by the ITU in BT.2100 for high dynamic range television, and described in Section 18.5, “BT.2100 HLG transfer functions”.

KHR_DF_TRANSFER_PQ_EOTF (= 10)

This value represents the Perceptual Quantization EOTF defined by the ITU in BT.2100 for high dynamic range television, and described in Section 18.6, “BT.2100 PQ transfer functions”.

KHR_DF_TRANSFER_PQ_OETF (= 11)

This value represents the Perceptual Quantization OETF defined by the ITU in BT.2100 for high dynamic range television, and described in Section 18.6, “BT.2100 PQ transfer functions”.

KHR_DF_TRANSFER_DCIP3 (= 12)

This value represents the transfer function defined in DCI P3 and described in Section 18.7, “DCI P3 transfer functions”.

KHR_DF_TRANSFER_PAL_OETF (= 13)

This value represents the OETF for legacy PAL systems described in Section 18.9, “Legacy PAL OETF”.

KHR_DF_TRANSFER_PAL625_EOTF (= 14)

This value represents the EOTF for legacy 625-line PAL systems described in Section 18.10, “Legacy PAL 625-line EOTF”.

KHR_DF_TRANSFER_ST240 (= 15)

This value represents the transfer function associated with the legacy ST-240 (SMPTE240M) standard, described in Section 18.11, “ST240/SMPTE240M transfer functions”.

KHR_DF_TRANSFER_ACESCC (= 16)

This value represents the nonlinear Transfer Function used in the ACEScc Academy Color Encoding System logarithmic encoding system for use within Color Grading Systems, S-2014-003, defined in [aces]. This is described in Section 18.15, “ACEScc transfer function”.

KHR_DF_TRANSFER_ACESCCT (= 17)

This value represents the nonlinear Transfer Function used in the ACEScct Academy Color Encoding System quasi-logarithmic encoding system for use within Color Grading Systems, S-2016-001, defined in [aces]. This is described in Section 18.16, “ACEScct transfer function”.

KHR_DF_TRANSFER_ADOBERGB (= 18)

This value represents the transfer function defined in the Adobe RGB (1998) specification and described in Section 18.12, “Adobe RGB (1998) transfer functions”.

7.8. flags

The format supports some configuration options in the form of boolean flags; these are described in the khr_df_flags_e enumeration and represented in an unsigned 8-bit integer value.

KHR_DF_FLAG_ALPHA_PREMULTIPLIED (= 1)

If the KHR_DF_FLAG_ALPHA_PREMULTIPLIED bit is set, any color information in the data should be interpreted as having been previously scaled by the alpha channel when performing blending operations.

The value KHR_DF_FLAG_ALPHA_STRAIGHT (= 0) is provided to represent this flag not being set, which indicates that the color values in the data should be interpreted as needing to be scaled by the alpha channel when performing blending operations. This flag has no effect if there is no alpha channel in the format.
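As an illustration of the difference between the two interpretations, the following minimal sketch composites one color channel over a destination value. The choice of the conventional “source over” blend, the normalized 0..1 value range and the names FLAG_ALPHA_STRAIGHT, FLAG_ALPHA_PREMULTIPLIED and blend_over are assumptions made for this example; the specification does not mandate any particular blending operation.

```c
#include <assert.h>

/* Hypothetical names mirroring the flag values given in the text. */
enum { FLAG_ALPHA_STRAIGHT = 0, FLAG_ALPHA_PREMULTIPLIED = 1 };

/* Composite one source color channel over a destination channel using
   the "source over" operator (an assumption of this sketch).
   src, alpha and dst are normalized to the range 0..1. */
static float blend_over(float src, float alpha, float dst, unsigned flags)
{
    assert(alpha >= 0.0f && alpha <= 1.0f);
    if (flags & FLAG_ALPHA_PREMULTIPLIED) {
        /* The color was already scaled by alpha when the data was produced. */
        return src + (1.0f - alpha) * dst;
    }
    /* Straight alpha: the color still needs to be scaled by alpha. */
    return src * alpha + (1.0f - alpha) * dst;
}
```

A straight-alpha color of 1.0 with an alpha of 0.5 and a premultiplied color of 0.5 with the same alpha describe the same pixel, and both produce the same blended result in this sketch.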

7.9. texel_block_dimensions_[0..3]

The texel_block_dimensions define the number of coordinates covered by the repeating block described by the samples. Four separate values, represented as unsigned 8-bit integers, are supported, corresponding to successive dimensions. The Basic Data Format Descriptor Block supports up to four dimensions of encoding within a texel block, supporting, for example, a texture with three spatial dimensions and one temporal dimension. Nothing stops the data structure as a whole from having higher dimensionality: for example, a two-dimensional texel block can be used as an element in a six-dimensional look-up table.

The value held in each of these fields is one fewer than the size of the block in that dimension — that is, a value of 0 represents a size of 1, a value of 1 represents a size of 2, etc. A texel block which covers fewer than four dimensions should have a size of 1 in each dimension that it lacks, and therefore the corresponding fields in the representation should be 0.

For example, a Y’CBCR 4:2:0 representation may use a Texel Block of 2×2 pixels in the nominal coordinate space, corresponding to the four Y' samples, as shown in Table 27, “Example Basic Data Format texel_block_dimensions for Y’CBCR 4:2:0”. The texel block dimensions in this case would be 2×2×1×1 (in the X, Y, Z and T dimensions, if the fourth dimension is interpreted as T). The texel_block_dimensions_[0..3] values would therefore be:

Table 27. Example Basic Data Format texel_block_dimensions for Y’CBCR 4:2:0

texel_block_dimensions_0 | 1
texel_block_dimensions_1 | 1
texel_block_dimensions_2 | 0
texel_block_dimensions_3 | 0
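The “one fewer than the size” encoding described above can be sketched as a pair of helper functions. The function names are hypothetical and are not part of any Khronos header.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a block size (1..256) as the stored field value (size minus one). */
static uint8_t encode_texel_block_dimension(unsigned size)
{
    assert(size >= 1 && size <= 256);
    return (uint8_t)(size - 1);
}

/* Decode a stored field value back to the block size in that dimension. */
static unsigned decode_texel_block_dimension(uint8_t field)
{
    return (unsigned)field + 1u;
}
```

For the 2×2×1×1 block of the Y’CBCR 4:2:0 example, encoding the sizes 2, 2, 1 and 1 yields the field values 1, 1, 0 and 0.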


7.10. bytes_plane_[0..7]

The Basic Data Format Descriptor divides the image into a number of planes, each consisting of an integer number of consecutive bytes. The requirement that planes consist of consecutive data means that formats with distinct subsampled channels — such as Y’CBCR 4:2:0 — may require multiple planes to describe a channel. A typical Y’CBCR 4:2:0 image has two planes for the Y' channel in this representation, offset by one line vertically.

The use of byte granularity to define planes is a choice to allow large texels (of up to 255 bytes). A consequence of this is that formats which are not byte-aligned on each addressable unit, such as 1-bit-per-pixel formats, need to represent a texel block of multiple samples, contained within a whole number of bytes.

A maximum of eight independent planes is supported in the Basic Data Format Descriptor. Formats which require more than eight planes — which are rare — require an extension.

The bytes_plane_[0..7] fields each contain an unsigned 8-bit integer which represents the number of bytes which that plane contributes to the format. The first field which contains the value 0 indicates that only a subset of the 8 possible planes are present; that is, planes which are not present should be given the bytes_plane value of 0, and any bytes_plane values after the first 0 are ignored. If no bytes_plane value is zero, 8 planes are considered to exist.

As an exception, if bytes_plane_0 has the value 0, the first plane is considered to hold indices into a color palette, which is described by one or more additional planes and samples in the normal way. The first sample in this case should describe a 1×1×1×1 texel holding an unsigned integer value. The number of bits used by the index should be encoded in this sample, with a maximum value of the largest palette entry held in sample_upper. Subsequent samples describe the entries in the palette, starting at an offset of bit 0. Note that the texel block in the index plane is not required to be byte-aligned in this case, and will not be for paletted formats which have small palettes. The channel type for the index is irrelevant.

For example, consider a 5-color paletted texture which describes each of these colors using 8 bits of red, green, blue and alpha. The color model would be RGBSDA, and the format would be described with two planes. bytes_plane_0 would be 0, indicating the special case of a palette, and bytes_plane_1 would be 4, representing the size of the palette entry. The first sample would then have a number of bits corresponding to the number of bits for the palette index: in this case, three bits, corresponding to the requirements of a 5-color palette. The sample_upper value for this sample is 4, indicating only 5 palette entries. Four subsequent samples represent the red, green, blue and alpha channels, starting from bit 0 as though the index value were not present, and describe the contents of the palette. The full data format descriptor for this example is provided in Table 104, “Four co-sited 8-bit channels in the sRGB color space described by an 5-entry, 3-bit palette” as one of the example format descriptors.
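The plane-counting rule and the palette special case can be sketched as follows. The function names are hypothetical, and counting the index plane as a plane in its own right is an interpretation taken from the two-plane example above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when bytes_plane_0 is 0, the special case marking a palette index plane. */
static bool is_paletted(const uint8_t bytes_plane[8])
{
    return bytes_plane[0] == 0;
}

/* Count planes: entries after the first 0 are ignored, except that a 0 in
   bytes_plane_0 marks a palette index plane, which is itself counted. */
static unsigned count_planes(const uint8_t bytes_plane[8])
{
    unsigned i = is_paletted(bytes_plane) ? 1u : 0u;
    while (i < 8 && bytes_plane[i] != 0) {
        ++i;
    }
    return i;
}

/* Smallest number of bits able to index `entries` palette entries. */
static unsigned palette_index_bits(unsigned entries)
{
    assert(entries >= 1 && entries <= 0x80000000u);
    unsigned bits = 1;
    while ((1u << bits) < entries) {
        ++bits;
    }
    return bits;
}
```

For the 5-color example, bytes_plane values of 0 and 4 followed by zeros give two planes, and five entries need a three-bit index.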

7.11. Sample information

The layout and position of the information within each plane is determined by a number of samples, each consisting of a single channel of data and with a single corresponding position within the texel block, as shown in Table 28, “Basic Data Format Descriptor Sample Information”.

The bytes from the plane data contributing to the format are treated as though they have been concatenated into a bit stream, with the first byte of the lowest-numbered plane providing the lowest bits of the result. Each sample consists of a number of consecutive bits from this bit stream.

If the content for a channel cannot be represented in a single sample, for example because the data for a channel is non-consecutive within this bit stream, additional samples with the same coordinate position and channel number should follow the first, in increasing order starting from the least significant bits of the channel data.

Note that some native big-endian formats may need to be supported with multiple samples in a channel, since the constituent bits may not be consecutive in a little-endian interpretation. There is an example, Table 106, “565 RGB format as represented on a big-endian architecture”, in the list of format descriptors provided. In this case, the sample_lower and sample_upper fields for the combined sample are taken from the first sample to belong uniquely to this channel/position pair.

By convention, to avoid aliases for formats, samples should be listed in order starting with channels at the lowest bits of this bit stream. Ties should be broken by increasing channel type id, as shown in Table 110, “Intensity-alpha format showing aliased samples”.

The number of samples present in the format is determined by the descriptor_block_size field. There is no limit on the number of samples which may be present, other than the maximum size of the Data Format Descriptor Block. There is no requirement that samples should access unique parts of the bit-stream: formats such as combined intensity and alpha, or shared exponent formats, require that bits be reused. Nor is there a requirement that all the bits in a plane be used (a format may contain padding).
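As a minimal sketch of how the sample count follows from descriptor_block_size, assume that the fields preceding the samples occupy 24 bytes and that each sample occupies 16 bytes (four 32-bit words). The 24-byte figure is an assumption of this example, chosen to agree with a one-sample block of 40 bytes; the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    BASIC_BLOCK_FIXED_BYTES = 24, /* assumed: all fields before the samples */
    SAMPLE_BYTES = 16             /* four 32-bit words per sample */
};

/* Derive the number of samples from the block size in bytes. */
static unsigned sample_count(uint16_t descriptor_block_size)
{
    assert(descriptor_block_size >= BASIC_BLOCK_FIXED_BYTES);
    assert((descriptor_block_size - BASIC_BLOCK_FIXED_BYTES) % SAMPLE_BYTES == 0);
    return (unsigned)(descriptor_block_size - BASIC_BLOCK_FIXED_BYTES) / SAMPLE_BYTES;
}
```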

Table 28. Basic Data Format Descriptor Sample Information

Byte 0 (LSB) | Byte 1 | Byte 2 | Byte 3 (MSB)
bit_offset (bytes 0 and 1) | bit_length | channel_type
sample_position_0 | sample_position_1 | sample_position_2 | sample_position_3
sample_lower (bytes 0 to 3)
sample_upper (bytes 0 to 3)


bit_offset

The bit_offset field describes the offset of the least significant bit of this sample from the least significant bit of the least significant byte of the concatenated bit stream for the format. Typically the bit_offset of the first sample is therefore 0; a sample which begins at an offset of one byte relative to the data format would have a bit_offset of 8. The bit_offset is an unsigned 16-bit integer quantity.

bit_length

The bit_length field describes the number of consecutive bits from the concatenated bit stream that contribute to the sample. This field is an unsigned 8-bit integer quantity, and stores the number of bits contributed minus 1; thus a single-byte channel should have a bit_length field value of 7. If a bit_length of more than 256 is required, further samples should be added; the value for the sample is composed in increasing order from least to most significant bit as subsequent samples are processed.
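The following sketch shows one way to read a sample from the concatenated bit stream using bit_offset and the stored bit_length value. It is limited to samples of at most 32 bits, and the function name and the byte-array representation of the bit stream are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one sample of up to 32 bits from the concatenated plane bytes.
   bit_length_field is the stored value: the number of bits minus one. */
static uint32_t extract_sample_bits(const uint8_t *bytes, size_t byte_count,
                                    uint16_t bit_offset, uint8_t bit_length_field)
{
    unsigned bits = (unsigned)bit_length_field + 1u;
    assert(bits <= 32);
    assert((size_t)bit_offset + bits <= byte_count * 8);

    uint32_t value = 0;
    for (unsigned i = 0; i < bits; ++i) {
        size_t pos = (size_t)bit_offset + i;
        /* Bit 0 of the stream is the least significant bit of the first byte. */
        uint32_t bit = ((uint32_t)bytes[pos / 8] >> (pos % 8)) & 1u;
        value |= bit << i;
    }
    return value;
}
```

For a 16-bit texel with 5, 6 and 5 bits per channel starting from bit 0, the three channels would be read with (bit_offset, bit_length) pairs of (0, 4), (5, 5) and (11, 4).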

channel_type

The channel_type field is an unsigned 8-bit quantity.

The bottom four bits of the channel_type indicate which channel is being described by this sample. The list of available channels is determined by the color_model field of the Basic Data Format Descriptor Block, and the channel_type field contains the number of the required channel within this list; see the color_model field for the list of channels for each model.

The top four bits of the channel_type are described by the khr_df_sample_datatype_qualifiers_e enumeration:

If the KHR_DF_SAMPLE_DATATYPE_LINEAR bit is not set, the sample value is modified by the transfer function defined in the format’s transfer_function field; if this bit is set, the sample is considered to contain a linearly-encoded value irrespective of the format’s transfer_function.

If the KHR_DF_SAMPLE_DATATYPE_EXPONENT bit is set, this sample holds an exponent (in integer form) for this channel. For example, this would be used to describe the shared exponent location in shared exponent formats (with the exponent bits listed separately under each channel). An exponent is applied to any integer sample of the same type. If this bit is not set, the sample is considered to contain mantissa information. If the KHR_DF_SAMPLE_DATATYPE_SIGNED bit is also set, the exponent is considered to be two’s complement — otherwise it is treated as unsigned. The bias of the exponent can be determined by the exponent’s sample_lower value. The presence or absence of an implicit leading digit in the mantissa of a format with an exponent can be determined by the sample_upper value of the mantissa.

If the KHR_DF_SAMPLE_DATATYPE_SIGNED bit is set, the sample holds a signed value in two’s complement form. If this bit is not set, the sample holds an unsigned value. It is possible to represent a sign/magnitude integer value by having a sample of unsigned integer type with the same channel and sample location as a 1-bit signed sample.

If the KHR_DF_SAMPLE_DATATYPE_FLOAT bit is set, the sample holds floating point data in a conventional format of 10, 11 or 16 bits, as described in Section 16, “Floating-point formats”, or of 32 or 64 bits as described in [IEEE 754]. Unless a genuine unsigned format is intended, KHR_DF_SAMPLE_DATATYPE_SIGNED should be set. Less common floating point representations can be generated with multiple samples and a combination of signed integer, unsigned integer and exponent fields, as described above and in Section 16.4, “Non-standard floating point formats”.
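Splitting channel_type into its channel number and qualifier bits might look like the sketch below. The individual bit positions given to the QUALIFIER_* constants (linear, exponent, signed and float in bits 4 to 7 respectively) are an assumption of this example and should be checked against the khr_df_sample_datatype_qualifiers_e enumeration; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit assignments for the qualifiers held in the top four bits. */
enum {
    QUALIFIER_LINEAR   = 1 << 4,
    QUALIFIER_EXPONENT = 1 << 5,
    QUALIFIER_SIGNED   = 1 << 6,
    QUALIFIER_FLOAT    = 1 << 7
};

/* Pack a channel number (0..15) and a set of qualifier bits. */
static uint8_t make_channel_type(unsigned channel, unsigned qualifiers)
{
    assert(channel <= 15);
    assert((qualifiers & ~0xF0u) == 0);
    return (uint8_t)(channel | qualifiers);
}

/* The bottom four bits select the channel within the color model's list. */
static unsigned channel_number(uint8_t channel_type)
{
    return channel_type & 0x0Fu;
}

/* The format's transfer function applies only when the linear bit is clear. */
static bool transfer_function_applies(uint8_t channel_type)
{
    return (channel_type & QUALIFIER_LINEAR) == 0;
}

static bool is_exponent(uint8_t channel_type) { return (channel_type & QUALIFIER_EXPONENT) != 0; }
static bool is_signed(uint8_t channel_type)   { return (channel_type & QUALIFIER_SIGNED) != 0; }
static bool is_float(uint8_t channel_type)    { return (channel_type & QUALIFIER_FLOAT) != 0; }
```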

sample_position_[0..3]

The sample has an associated location within the 4-dimensional space of the texel block. Each sample has an offset relative to the 0,0 position of the texel block, determined in units of half a coordinate. This allows the common situation of downsampled channels to have samples conceptually sited at the midpoint between full resolution samples. Support for offsets other than multiples of half a coordinate requires an extension. The direction of the sample offsets is determined by the coordinate addressing scheme used by the API. There is no limit on the dimensionality of the data, but if more than four dimensions need to be contained within a single texel block, an extension will be required.

Each sample_position is an 8-bit unsigned integer quantity. sample_position_0 is the X offset of the sample, sample_position_1 is the Y offset of the sample, etc. Formats which use an offset larger than 127.5 in any dimension require an extension.

It is legal, but unusual, to use the same bits to represent multiple samples at different coordinate locations.
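The half-coordinate units used by sample_position can be sketched as a pair of conversion helpers. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a stored sample_position value to an offset in coordinates. */
static float sample_position_to_coordinate(uint8_t field)
{
    return (float)field * 0.5f;
}

/* Convert an offset in coordinates to the stored value. Offsets must be
   multiples of half a coordinate in the range 0 to 127.5. */
static uint8_t coordinate_to_sample_position(float coordinate)
{
    float units = coordinate * 2.0f;
    assert(units >= 0.0f && units <= 255.0f);
    uint8_t field = (uint8_t)units;
    assert((float)field == units); /* reject offsets finer than half a coordinate */
    return field;
}
```

A chroma sample sited midway between two luma samples horizontally would, under this sketch, have a sample_position_0 of 1, representing an offset of 0.5.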

sample_lower

Sample_lower, combined with sample_upper, is used to represent the mapping between the numerical value stored in the format and the conceptual numerical interpretation. For unsigned formats, sample_lower typically represents the value which should be interpreted as zero (the black point). For signed formats, sample_lower typically represents “-1”. For color difference models such as Y’CBCR, sample_lower represents the lower extent of the color difference range (which corresponds to an encoding of -0.5 in numerical terms).

If the channel encoding is an integer format, the sample_lower value is represented as a 32-bit integer, signed or unsigned according to whether the channel encoding is signed. Signed negative values should be sign-extended if the channel has fewer than 32 bits, such that the value encoded in sample_lower is itself negative. If the channel encoding is a floating point value, the sample_lower value is also floating point. If the number of bits in the sample is greater than 32, the lowest representable value for sample_lower is interpreted as the smallest value representable in the channel format.

If the channel consists of multiple co-sited integer samples, for example because the channel bits are non-contiguous, there are two possible behaviors. If the total number of bits in the channel is less than or equal to 32, the sample_lower values in the samples corresponding to the least-significant bits of the sample are ignored, and only the sample_lower from the most-significant sample is considered. If the number of bits in the channel exceeds 32, the sample_lower values from the sample corresponding to the most-significant bits within any 32-bit subset of the total number are concatenated to generate the final sample_lower value. For example, a 48-bit signed integer may be encoded in three 16-bit samples. The first sample, corresponding to the least-significant 16 bits, will have its sample_lower value ignored. The next sample of 16 bits takes the total to 32, and so the sample_lower value of this sample should represent the lowest 32 bits of the desired 48-bit virtual sample_lower value. Finally, the third sample indicates the top 16 bits of the 48-bit channel, and its sample_lower contains the top 16 bits of the 48-bit virtual sample_lower value.
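The 48-bit example can be sketched as follows, taking the three per-sample values in order from the least significant sample. The function name, the array representation and the masking of the third value to 16 bits are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine the per-sample bounds of a 48-bit signed channel that is split
   into three 16-bit samples, listed least significant first.
   bounds[0] is ignored, bounds[1] supplies the low 32 bits and
   bounds[2] supplies the top 16 bits of the virtual 48-bit value. */
static int64_t combine_48bit_signed_bound(const uint32_t bounds[3])
{
    assert(bounds != NULL);
    uint64_t raw = ((uint64_t)(bounds[2] & 0xFFFFu) << 32) | bounds[1];
    /* Sign-extend from bit 47 without relying on out-of-range conversions. */
    const uint64_t sign_bit = UINT64_C(1) << 47;
    return (int64_t)(raw ^ sign_bit) - (int64_t)sign_bit;
}
```

With values of 0x00000000 and 0x8000 in the second and third samples, the combined result is the most negative 48-bit signed value.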

The sample_lower value for an exponent should represent the exponent bias — the value that should be subtracted from the encoded exponent to indicate that the mantissa’s sample_upper value will represent 1.0. See Section 16.4, “Non-standard floating point formats” for more detail on this.

For example, the BT.709 television broadcast standard dictates that the Y' value stored in an 8-bit encoding should fall in the range 16 to 235. In this case, sample_lower should contain the value 16.

In OpenGL terminology, a “normalized” channel contains an integer value which is mapped to the range 0..1.0. A channel which is not normalized contains an integer value which is mapped to a floating point equivalent of the integer value. Similarly an “snorm” channel is a signed normalized value mapping from -1.0 to 1.0. Setting sample_lower to the minimum signed integer value representable in the channel is equivalent to defining an “snorm” texture.

sample_upper

Sample_upper, combined with sample_lower, is used to represent the mapping between the numerical value stored in the format and the conceptual numerical interpretation. Sample_upper typically represents the value which should be interpreted as “1.0” (the “white point”). For color difference models such as Y’CBCR, sample_upper represents the upper extent of the color difference range (which corresponds to an encoding of 0.5 in numerical terms).

If the channel encoding is an integer format, the sample_upper value is represented as a 32-bit integer — signed or unsigned according to whether the channel encoding is signed. If the channel encoding is a floating point value, the sample_upper value is also floating point. If the number of bits in the sample is greater than 32, the highest representable value for sample_upper is interpreted as the largest value representable in the channel format. If the channel encoding is the mantissa of a custom floating point format (that is, the encoding is integer but the same sample location and channel is shared by a sample that encodes an exponent), the presence of an implicit "1" digit can be represented by setting the sample_upper value to a value one larger than can be encoded in the available bits for the mantissa, as described in Section 16.4, “Non-standard floating point formats”.

The sample_upper value for an exponent should represent the largest conventional legal exponent value. If the encoded exponent exceeds this value, the encoded floating point value encodes either an infinity or a NaN value, depending on the mantissa. See Section 16.4, “Non-standard floating point formats” for more detail on this.

If the channel consists of multiple co-sited integer samples, for example because the channel bits are non-contiguous, there are two possible behaviors. If the total number of bits in the channel is less than or equal to 32, the sample_upper values in the samples corresponding to the least-significant bits of the sample are ignored, and only the sample_upper from the most-significant sample is considered. If the number of bits in the channel exceeds 32, the sample_upper values from the sample corresponding to the most-significant bits within any 32-bit subset of the total number are concatenated to generate the final sample_upper value. For example, a 48-bit signed integer may be encoded in three 16-bit samples. The first sample, corresponding to the least-significant 16 bits, will have its sample_upper value ignored. The next sample of 16 bits takes the total to 32, and so the sample_upper value of this sample should represent the lowest 32 bits of the desired 48-bit virtual sample_upper value. Finally, the third sample indicates the top 16 bits of the 48-bit channel, and its sample_upper contains the top 16 bits of the 48-bit virtual sample_upper value.

For example, the BT.709 television broadcast standard dictates that the Y' value stored in an 8-bit encoding should fall in the range 16 to 235. In this case, sample_upper should contain the value 235.

In OpenGL terminology, a “normalized” channel contains an integer value which is mapped to the range 0..1.0. A channel which is not normalized contains an integer value which is mapped to a floating point equivalent of the integer value. Similarly an “snorm” channel is a signed normalized value mapping from -1.0 to 1.0. Setting sample_upper to the maximum signed integer value representable in the channel for a signed channel type is equivalent to defining an “snorm” texture. Setting sample_upper to the maximum unsigned value representable in the channel for an unsigned channel type is equivalent to defining a “normalized” texture. Setting sample_upper to “1” is equivalent to defining an “unnormalized” texture.

Sensor data from a camera typically does not cover the full range of the bit depth used to represent it. Sample_upper can be used to specify an upper limit on sensor brightness — or to specify the value which should map to white on the display, which may be less than the full dynamic range of the captured image.

There is no guarantee or expectation that image data will fall between sample_lower and sample_upper unless the users of a format agree to that convention.
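For an unsigned integer channel, the mapping implied by sample_lower and sample_upper can be sketched as a linear rescale in which sample_lower maps to 0.0 and sample_upper maps to 1.0. The function name is hypothetical, and the linear form is an assumption of this example; the result is deliberately not clamped, since stored values may fall outside the two bounds.

```c
#include <assert.h>
#include <stdint.h>

/* Map a stored unsigned value so that sample_lower becomes 0.0 and
   sample_upper becomes 1.0. Values outside the bounds are not clamped. */
static double normalize_unsigned(uint32_t value, uint32_t sample_lower,
                                 uint32_t sample_upper)
{
    assert(sample_upper > sample_lower);
    return ((double)value - (double)sample_lower) /
           ((double)sample_upper - (double)sample_lower);
}
```

With the BT.709 bounds of 16 and 235, a stored Y' value of 16 maps to 0.0 and 235 maps to 1.0. With bounds of 0 and 1, the stored integer is returned unchanged, matching the “unnormalized” case.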

8. Extension for more complex formats

Some formats will require more channels than can be described in the Basic Format Descriptor, or may have more specific color requirements. For example, it is expected that an extension will be available which places an ICC color profile block into the descriptor block, allowing more color channels to be specified in more precise ways. This will significantly enlarge the space required for the descriptor, and is not expected to be needed for most common uses. A vendor may also use an extension block to associate metadata with the descriptor (for example, information required as part of hardware rendering). So long as software which uses the data format descriptor always uses the total_size field to determine the size of the descriptor, this should be transparent to user code.

The extension mechanism is the preferred way to support even simple extensions, such as additional color spaces or transfer functions, that can be supported by an additional enumeration. This approach improves compatibility with code which is unaware of the additional values. Simple extensions of this form that have cross-vendor support have a good chance of being incorporated more directly into future revisions of the specification, allowing application code to distinguish them by the version_number field.

As an example, consider a single-channel 32-bit depth buffer, as shown in Table 29, “Example of a depth buffer with an extension to indicate a virtual allocation”. A tiled renderer may wish to indicate that this buffer is “virtual”: it will be allocated real memory only if needed, and otherwise only a subset of it will exist at any one time, in an on-chip representation. Someone developing such a renderer may choose to add a vendor-specific extension (with ID 0xFFFF to indicate development work and avoid the need for a vendor ID) which uses a boolean to establish whether this depth buffer exists only in virtual form. Note that the mere presence or absence of this extension within the data format descriptor itself forms a boolean, but for this example we will assume that an extension block is always present, and that a boolean is stored within. We will give the enumeration 32 bits, in order to simplify the possible addition of further extensions.

In this example (which should not be taken as an implementation suggestion), the data descriptor would first contain a descriptor block describing the depth buffer format as conventionally described, followed by a second descriptor block that contains only the enumeration. The descriptor itself has a total_size that includes both of these descriptor blocks.

Table 29. Example of a depth buffer with an extension to indicate a virtual allocation

56 (total_size: total size of the two blocks plus one 32-bit value)

Basic descriptor block

0 (vendor_id)

0 (descriptor_type)

0 (version_number)

40 (descriptor_block_size)

RGBSDA (color_model)

UNSPECIFIED (color_primaries)

UNSPECIFIED (transfer_function)

0 (flags)

0 (texel_block_dimensions_0)

0 (texel_block_dimensions_1)

0 (texel_block_dimensions_2)

0 (texel_block_dimensions_3)

4 (bytes_plane_0)

0 (bytes_plane_1)

0 (bytes_plane_2)

0 (bytes_plane_3)

0 (bytes_plane_4)

0 (bytes_plane_5)

0 (bytes_plane_6)

0 (bytes_plane_7)

Sample information for the depth value

0 (bit_offset)

31 (bit_length)

SIGNED | FLOAT | DEPTH (channel_type)

0 (sample_position_0)

0 (sample_position_1)

0 (sample_position_2)

0 (sample_position_3)

0xbf800000 (sample_lower: -1.0f)

0x3f800000U (sample_upper: 1.0f)

Extension descriptor block

0xFFFF (vendor_id)

0 (descriptor_type)

0 (version_number)

12 (descriptor_block_size)

Data specific to the extension follows

1 (buffer is “virtual”)
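Code that does not understand an extension block can step over it using only the size fields. The following sketch counts the descriptor blocks in a descriptor held as an array of 32-bit words. It assumes that the first word holds total_size in bytes, that each block begins with one word for vendor_id and descriptor_type, and that the upper 16 bits of the block's second word hold descriptor_block_size in bytes; this packing and the function name are assumptions of the example and should be checked against the descriptor block header definition.

```c
#include <assert.h>
#include <stdint.h>

/* Walk the descriptor blocks and count them, skipping each block by its
   own descriptor_block_size so that unknown extensions are stepped over. */
static unsigned count_descriptor_blocks(const uint32_t *descriptor)
{
    uint32_t total_size = descriptor[0];
    uint32_t offset = 4; /* skip the total_size word itself */
    unsigned blocks = 0;

    while (offset + 8 <= total_size) {
        /* Assumed layout: size in the top half of the block's second word. */
        uint32_t block_size = descriptor[offset / 4 + 1] >> 16;
        assert(block_size >= 8 && block_size % 4 == 0);
        assert(offset + block_size <= total_size);
        offset += block_size;
        ++blocks;
    }
    return blocks;
}
```

Applied to the values in Table 29, the walk starts at byte 4, skips the 40-byte basic block, skips the 12-byte extension block, and stops at byte 56, which equals total_size.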


It is possible for a vendor to use the extension block to store peripheral information required to access the image (plane base addresses, stride, etc.). Since different implementations have different kinds of non-linear ordering and proprietary alignment requirements, this is not described as part of the standard. By many conventional definitions, this information is not part of the “format”, and particularly it ensures that an identical copy of the image will have a different descriptor block (because the addresses will have changed) and so a simple bitwise comparison of two descriptor blocks will disagree even though the “format” matches. Additionally, many APIs will use the format descriptor only for external communication, and have an internal representation that is more concise and less flexible. In this case, it is likely that address information will need to be represented separately from the format anyway. For these reasons, it is an implementation choice whether to store this information in an extension block, and how to do so, rather than being specified in this standard.

9. Frequently Asked Questions

9.1. Why have a binary format rather than a human-readable one?

While it is not expected that every new container will have a unique data descriptor or that analysis of the data format descriptor will be on a critical path in an application, it is still expected that comparison between formats may be time-sensitive. The data format descriptor is designed to allow relatively efficient queries for subsets of properties, to allow a large number of format descriptors to be stored, and to be amenable to hardware interpretation or processing in shaders. These goals preclude a text-based representation such as an XML schema.

9.2. Why not use an existing representation such as those on FourCC.org?

Formats in FourCC.org are not described in sufficient detail for many APIs, and are sometimes inconsistent.

9.3. Why have a descriptive format?

Enumerations are fast and easy to process, but are limited in that any software can only be aware of the enumeration values in place when it was defined. Software often behaves differently according to properties of a format, and must perform a look-up on the enumeration — if it knows what it is — in order to change behaviors. A descriptive format allows for more flexible software which can support a wide range of formats without needing each to be listed, and simplifies the programming of conditional behavior based on format properties.

9.4. Why describe this standard within Khronos?

Khronos supports multiple standards that have a range of internal data representations. There is no requirement that this standard be used specifically with other Khronos standards, but it is hoped that multiple Khronos standards may use this specification as part of a consistent approach to inter-standard operation.

9.5. Why should I use this format if I don’t need most of the fields?

While a library may not use all the data provided in the data format descriptor that is described within this standard, it is common for users of data — particularly pixel-like data — to have additional requirements. Capturing these requirements portably reduces the need for additional metadata to be associated with a proprietary descriptor. It is also common for additional functionality to be added retrospectively to existing libraries — for example, Y’CBCR support is often an afterthought in rendering APIs. Having a consistent and flexible representation in place from the start can reduce the pain of retrofitting this functionality.

Note that there is no expectation that the format descriptor from this standard be used directly, although it can be. The impact of providing a mapping between internal formats and format descriptors is expected to be low, but offers the opportunity both for simplified access from software outside the proprietary library and for reducing the effort needed to provide a complete, unambiguous and accurate description of a format in human-readable terms.

9.6. Why not expand each field out to be an integer for ease of decoding?

There is a trade-off between size and decoding effort. It is assumed that data which occupies the same 32-bit word may need to be tested concurrently, reducing the cost of comparisons. When transferring data formats, the packing reduces the overhead. Within these constraints, it is intended that most data can be extracted with low-cost operations, typically being byte-aligned (other than sample flags) and with the natural alignment applied to multi-byte quantities.

9.7. Can this descriptor be used for text content?

For simple ASCII content, there is no reason that plain text could not be described in some way, and this may be useful for image formats that contain comment sections. However, since many multilingual text representations do not have a fixed character size, this use is not seen as an obvious match for this standard.

10. S3TC Compressed Texture Image Formats

This description is derived from the EXT_texture_compression_s3tc extension.

Compressed texture images stored using the S3TC compressed image formats are represented as a collection of $4\times 4$ texel blocks, where each block contains 64 or 128 bits of texel data. The image is encoded as a normal 2D raster image in which each $4\times 4$ block is treated as a single pixel. If an S3TC image has a width or height that is not a multiple of four, the data corresponding to texels outside the image are irrelevant and undefined.

When an S3TC image with a width of w, height of h, and block size of blocksize (8 or 16 bytes) is decoded, the corresponding image size (in bytes) is:

\begin{align*} \left\lceil { w \over 4 } \right\rceil \times \left\lceil { h \over 4 } \right\rceil \times blocksize \end{align*}

When decoding an S3TC image, the block containing the texel at offset $(x,y)$ begins at an offset (in bytes) relative to the base of the image of:

\begin{align*} blocksize \times \left( { \left\lceil { w \over 4 } \right\rceil \times \left\lfloor { y \over 4 } \right\rfloor + \left\lfloor { x \over 4 } \right\rfloor } \right) \end{align*}

The data corresponding to a specific texel $(x,y)$ are extracted from a $4\times 4$ texel block using a relative $(x,y)$ value of

\begin{align*} (x \bmod 4,y \bmod 4) \end{align*}
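The size and offset formulas above can be sketched in C as follows. The function names are illustrative and not part of this specification; the ceiling division is written as an integer add-and-divide.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 4x4 blocks needed to cover n texels: ceil(n / 4). */
size_t blocks_across(uint32_t n)
{
    return ((size_t)n + 3u) / 4u;
}

/* Total size in bytes of a w x h image; blocksize is 8 or 16. */
size_t s3tc_image_size(uint32_t w, uint32_t h, size_t blocksize)
{
    return blocks_across(w) * blocks_across(h) * blocksize;
}

/* Byte offset, from the base of the image, of the block holding texel (x, y). */
size_t s3tc_block_offset(uint32_t w, uint32_t x, uint32_t y, size_t blocksize)
{
    return blocksize * (blocks_across(w) * (y / 4u) + (x / 4u));
}
```

For example, a 10×6 image with 8-byte blocks occupies 3 × 2 blocks, and texel (5,4) lies in the block at byte offset 32.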

There are four distinct S3TC image formats:

10.1. BC1 with no alpha

Each $4 \times 4$ block of texels consists of 64 bits of RGB image data.

Each RGB image data block is encoded as a sequence of 8 bytes, called (in order of increasing address):

\begin{align*} c0_{lo}, c0_{hi}, c1_{lo}, c1_{hi}, bits_0, bits_1, bits_2, bits_3 \end{align*}

The 8 bytes of the block are decoded into three quantities:

\begin{align*} color_0 & = c0_{lo} + c0_{hi} \times 256 \\ color_1 & = c1_{lo} + c1_{hi} \times 256 \\ bits & = bits_0 + 256 \times (bits_1 + 256 \times (bits_2 + 256 \times bits_3)) \end{align*}

$color_0$ and $color_1$ are 16-bit unsigned integers that are unpacked to RGB colors $RGB_0$ and $RGB_1$ as though they were 16-bit packed pixels with the R channel in the high 5 bits, G in the next 6 bits and B in the low 5 bits.

bits is a 32-bit unsigned integer, from which a two-bit control code is extracted for a texel at location $(x,y)$ in the block using:

\begin{align*} code(x,y) & = bits[2\times (4\times y+x)+1\ \dots\ 2\times(4\times y+x)+0] \end{align*}

where bit 31 is the most significant and bit 0 is the least significant bit.

The RGB color for a texel at location $(x,y)$ in the block is given in Table 30, “Block decoding for BC1”.

Table 30. Block decoding for BC1

| Block value | Condition |
|---|---|
| $RGB_0$ | $color_0 > color_1$ and $code(x,y) = 0$ |
| $RGB_1$ | $color_0 > color_1$ and $code(x,y) = 1$ |
| $(2\times RGB_0 + RGB_1)\over 3$ | $color_0 > color_1$ and $code(x,y) = 2$ |
| $(RGB_0 + 2\times RGB_1)\over 3$ | $color_0 > color_1$ and $code(x,y) = 3$ |
| $RGB_0$ | $color_0 \le color_1$ and $code(x,y) = 0$ |
| $RGB_1$ | $color_0 \le color_1$ and $code(x,y) = 1$ |
| $(RGB_0+RGB_1)\over 2$ | $color_0 \le color_1$ and $code(x,y) = 2$ |
| BLACK | $color_0 \le color_1$ and $code(x,y) = 3$ |


Arithmetic operations are done per component, and BLACK refers to an RGB color where red, green, and blue are all zero.

Since this image has an RGB format, there is no alpha component and the image is considered fully opaque.
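The following is a minimal sketch of the BC1 (no alpha) texel decode described in this section, not a reference implementation. The names `rgb_t`, `unpack565`, `blend` and `bc1_decode_texel` are hypothetical, and normalizing each unpacked channel to the range [0, 1] is an assumption of the sketch.

```c
#include <stdint.h>

typedef struct { float r, g, b; } rgb_t;

/* Unpack a 5:6:5 packed color (R in the high bits). Normalizing each
 * channel to [0, 1] is an assumption of this sketch. */
rgb_t unpack565(uint16_t c)
{
    rgb_t out;
    out.r = (float)((c >> 11) & 31u) / 31.0f;
    out.g = (float)((c >> 5) & 63u) / 63.0f;
    out.b = (float)(c & 31u) / 31.0f;
    return out;
}

/* Weighted blend (w0 * a + w1 * b) / (w0 + w1), applied per component. */
rgb_t blend(rgb_t a, rgb_t b, float w0, float w1)
{
    rgb_t out;
    float d = w0 + w1;
    out.r = (w0 * a.r + w1 * b.r) / d;
    out.g = (w0 * a.g + w1 * b.g) / d;
    out.b = (w0 * a.b + w1 * b.b) / d;
    return out;
}

/* Decode texel (x, y), each in 0..3, from an 8-byte BC1 block. */
rgb_t bc1_decode_texel(const uint8_t block[8], unsigned x, unsigned y)
{
    uint16_t color0 = (uint16_t)(block[0] | (block[1] << 8));
    uint16_t color1 = (uint16_t)(block[2] | (block[3] << 8));
    uint32_t bits = (uint32_t)block[4] | ((uint32_t)block[5] << 8) |
                    ((uint32_t)block[6] << 16) | ((uint32_t)block[7] << 24);
    unsigned code = (bits >> (2u * (4u * y + x))) & 3u;
    rgb_t rgb0 = unpack565(color0), rgb1 = unpack565(color1);
    rgb_t black = { 0.0f, 0.0f, 0.0f };

    if (code == 0) return rgb0;
    if (code == 1) return rgb1;
    if (color0 > color1)    /* four-color block */
        return (code == 2) ? blend(rgb0, rgb1, 2.0f, 1.0f)
                           : blend(rgb0, rgb1, 1.0f, 2.0f);
    /* color0 <= color1: three colors plus BLACK */
    return (code == 2) ? blend(rgb0, rgb1, 1.0f, 1.0f) : black;
}
```

Note that the comparison between $color_0$ and $color_1$ uses the packed 16-bit values, not the unpacked colors.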

10.2. BC1 with alpha

Each $4\times 4$ block of texels consists of 64 bits of RGB image data and minimal alpha information. The RGB components of a texel are extracted in the same way as BC1 with no alpha.

The alpha component for a texel at location $(x,y)$ in the block is given by Table 31, “BC1 with alpha”.

Table 31. BC1 with alpha

| Alpha value | Condition |
|---|---|
| $0.0$ | $color_0 \le color_1$ and $code(x,y) = 3$ |
| $1.0$ | otherwise |


The red, green, and blue components of any texels with a final alpha of 0 should be encoded as zero (black).

10.3. BC2

Each $4\times 4$ block of texels consists of 64 bits of uncompressed alpha image data followed by 64 bits of RGB image data.

Each RGB image data block is encoded according to the BC1 formats, with the exception that the two code bits always use the non-transparent encodings. In other words, they are treated as though color0 > color1, regardless of the actual values of color0 and color1.

Each alpha image data block is encoded as a sequence of 8 bytes, called (in order of increasing address):

\begin{align*} a_0, a_1, a_2, a_3, a_4, a_5, a_6, a_7 \end{align*}

The 8 bytes of the block are decoded into one 64-bit integer:

\begin{align*} alpha & = a_0 + 256 \times (a_1 + 256 \times (a_2 + 256 \times (a_3 + 256 \times (a_4 + 256 \times (a_5 + 256 \times (a_6 + 256 \times a_7)))))) \end{align*}

alpha is a 64-bit unsigned integer, from which a four-bit alpha value is extracted for a texel at location $(x,y)$ in the block using:

\begin{align*} alpha(x,y) & = alpha[4\times(4\times y+x)+3 \dots 4\times(4\times y+x)+0] \end{align*}

where bit 63 is the most significant and bit 0 is the least significant bit.

The alpha component for a texel at location $(x,y)$ in the block is given by $alpha(x,y)\over 15$ .
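As an illustration, the BC2 alpha extraction can be written as the short C function below. The name `bc2_alpha` is hypothetical; the argument is assumed to point at the 8 alpha bytes $a_0 \dots a_7$ of the block.

```c
#include <stdint.h>

/* Alpha in [0, 1] for texel (x, y), each in 0..3, from the 8 alpha bytes. */
float bc2_alpha(const uint8_t a[8], unsigned x, unsigned y)
{
    uint64_t alpha = 0;
    for (int i = 7; i >= 0; --i)
        alpha = (alpha << 8) | a[i];    /* a[0] is the least significant byte */

    /* Four bits per texel, texel 0 in the lowest bits. */
    unsigned nibble = (unsigned)(alpha >> (4u * (4u * y + x))) & 15u;
    return (float)nibble / 15.0f;
}
```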

10.4. BC3

Each $4\times 4$ block of texels consists of 64 bits of compressed alpha image data followed by 64 bits of RGB image data.

Each RGB image data block is encoded according to the BC1 formats, with the exception that the two code bits always use the non-transparent encodings. In other words, they are treated as though color0 > color1, regardless of the actual values of color0 and color1.

Each alpha image data block is encoded as a sequence of 8 bytes, called (in order of increasing address):

\begin{align*} alpha_0, alpha_1, bits_0, bits_1, bits_2, bits_3, bits_4, bits_5 \end{align*}

The $alpha_0$ and $alpha_1$ are 8-bit unsigned bytes converted to alpha components by multiplying by $1\over 255$ .

The 6 $bits_*$ bytes of the block are decoded into one 48-bit integer:

\begin{align*} bits & = bits_0 + 256 \times (bits_1 + 256 \times (bits_2 + 256 \times (bits_3 + 256 \times (bits_4 + 256 \times bits_5)))) \end{align*}

bits is a 48-bit unsigned integer, from which a three-bit control code is extracted for a texel at location $(x,y)$ in the block using:

\begin{align*} code(x,y) & = bits[3\times(4\times y+x)+2 \dots 3\times(4\times y+x)+0] \end{align*}

where bit 47 is the most-significant and bit 0 is the least-significant bit.

The alpha component for a texel at location $(x,y)$ in the block is given by Table 32, “Alpha encoding for BC3 blocks”.

Table 32. Alpha encoding for BC3 blocks

| Alpha value | Condition |
|---|---|
| $alpha_0$ | $code(x,y) = 0$ |
| $alpha_1$ | $code(x,y) = 1$ |
| $(6\times alpha_0 + 1\times alpha_1)\over 7$ | $alpha_0 > alpha_1$ and $code(x,y) = 2$ |
| $(5\times alpha_0 + 2\times alpha_1)\over 7$ | $alpha_0 > alpha_1$ and $code(x,y) = 3$ |
| $(4\times alpha_0 + 3\times alpha_1)\over 7$ | $alpha_0 > alpha_1$ and $code(x,y) = 4$ |
| $(3\times alpha_0 + 4\times alpha_1)\over 7$ | $alpha_0 > alpha_1$ and $code(x,y) = 5$ |
| $(2\times alpha_0 + 5\times alpha_1)\over 7$ | $alpha_0 > alpha_1$ and $code(x,y) = 6$ |
| $(1\times alpha_0 + 6\times alpha_1)\over 7$ | $alpha_0 > alpha_1$ and $code(x,y) = 7$ |
| $(4\times alpha_0 + 1\times alpha_1)\over 5$ | $alpha_0 \le alpha_1$ and $code(x,y) = 2$ |
| $(3\times alpha_0 + 2\times alpha_1)\over 5$ | $alpha_0 \le alpha_1$ and $code(x,y) = 3$ |
| $(2\times alpha_0 + 3\times alpha_1)\over 5$ | $alpha_0 \le alpha_1$ and $code(x,y) = 4$ |
| $(1\times alpha_0 + 4\times alpha_1)\over 5$ | $alpha_0 \le alpha_1$ and $code(x,y) = 5$ |
| $0.0$ | $alpha_0 \le alpha_1$ and $code(x,y) = 6$ |
| $1.0$ | $alpha_0 \le alpha_1$ and $code(x,y) = 7$ |
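The BC3 alpha table can be expressed compactly in code, because the weights in each group follow a regular pattern. The sketch below is illustrative only: `bc3_alpha` is a hypothetical name, and its argument is assumed to point at the 8-byte alpha half of the block.

```c
#include <stdint.h>

/* Alpha in [0, 1] for texel (x, y), each in 0..3, from the 8-byte alpha block. */
float bc3_alpha(const uint8_t block[8], unsigned x, unsigned y)
{
    unsigned a0 = block[0], a1 = block[1];
    uint64_t bits = 0;
    for (int i = 7; i >= 2; --i)
        bits = (bits << 8) | block[i];  /* block[2] is bits_0, the low byte */

    unsigned code = (unsigned)(bits >> (3u * (4u * y + x))) & 7u;
    float alpha0 = (float)a0 / 255.0f;
    float alpha1 = (float)a1 / 255.0f;

    if (code == 0) return alpha0;
    if (code == 1) return alpha1;
    if (a0 > a1)    /* six interpolated values, weights (8 - code) and (code - 1) */
        return ((float)(8u - code) * alpha0 + (float)(code - 1u) * alpha1) / 7.0f;
    if (code == 6) return 0.0f;
    if (code == 7) return 1.0f;
    /* four interpolated values, weights (6 - code) and (code - 1) */
    return ((float)(6u - code) * alpha0 + (float)(code - 1u) * alpha1) / 5.0f;
}
```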


11. RGTC Compressed Texture Image Formats

This description is derived from the “RGTC Compressed Texture Image Formats” section of the OpenGL 4.5 specification.

Compressed texture images stored using the RGTC compressed image encodings are represented as a collection of 4×4 texel blocks, where each block contains 64 or 128 bits of texel data. The image is encoded as a normal 2D raster image in which each 4×4 block is treated as a single pixel. If an RGTC image has a width or height that is not a multiple of four, the data corresponding to texels outside the image are irrelevant and undefined.

When an RGTC image with a width of w, height of h, and block size of blocksize (8 or 16 bytes) is decoded, the corresponding image size (in bytes) is:

\begin{align*} \left\lceil { w \over 4 } \right\rceil \times \left\lceil { h \over 4 } \right\rceil \times blocksize \end{align*}

When decoding an RGTC image, the block containing the texel at offset $(x,y)$ begins at an offset (in bytes) relative to the base of the image of:

\begin{align*} blocksize \times \left( { \left\lceil { w \over 4 } \right\rceil \times \left\lfloor { y \over 4 } \right\rfloor + \left\lfloor { x \over 4 } \right\rfloor } \right) \end{align*}

The data corresponding to a specific texel $(x,y)$ are extracted from a $4 \times 4$ texel block using a relative $(x,y)$ value of

\begin{align*} (x \bmod 4,y \bmod 4). \end{align*}

There are four distinct RGTC image formats:

11.1. BC4 unsigned

Each $4 \times 4$ block of texels consists of 64 bits of unsigned red image data.

Each red image data block is encoded as a sequence of 8 bytes, called (in order of increasing address):

\begin{align*} red_0, red_1, bits_0, bits_1, bits_2, bits_3, bits_4, bits_5 \end{align*}

The 6 $bits_*$ bytes of the block are decoded into a 48-bit bit vector:

\begin{align*} bits & = bits_0 + 256 \times \left( { bits_1 + 256 \times \left( { bits_2 + 256 \times \left( { bits_3 + 256 \times \left( { bits_4 + 256 \times bits_5 } \right) } \right) } \right) } \right) \end{align*}

$red_0$ and $red_1$ are 8-bit unsigned integers that are unpacked to red values $RED_0$ and $RED_1$ .

$bits$ is a 48-bit unsigned integer, from which a three-bit control code is extracted for a texel at location $(x,y)$ in the block using:

\begin{align*} code(x,y) & = bits \left[ 3 \times (4 \times y + x) + 2 \dots 3 \times (4 \times y + x) + 0 \right] \end{align*}

where bit 47 is the most significant and bit 0 is the least significant bit.

The red value $R$ for a texel at location $(x,y)$ in the block is given by Table 33, “Block decoding for BC4”.

Table 33. Block decoding for BC4

| R value | Condition |
|---|---|
| $RED_0$ | $red_0 > red_1, code(x,y) = 0$ |
| $RED_1$ | $red_0 > red_1, code(x,y) = 1$ |
| ${ 6 RED_0 + RED_1 } \over 7$ | $red_0 > red_1, code(x,y) = 2$ |
| ${ 5 RED_0 + 2 RED_1 } \over 7$ | $red_0 > red_1, code(x,y) = 3$ |
| ${ 4 RED_0 + 3 RED_1 } \over 7$ | $red_0 > red_1, code(x,y) = 4$ |
| ${ 3 RED_0 + 4 RED_1 } \over 7$ | $red_0 > red_1, code(x,y) = 5$ |
| ${ 2 RED_0 + 5 RED_1 } \over 7$ | $red_0 > red_1, code(x,y) = 6$ |
| ${ RED_0 + 6 RED_1 } \over 7$ | $red_0 > red_1, code(x,y) = 7$ |
| $RED_0$ | $red_0 \leq red_1, code(x,y) = 0$ |
| $RED_1$ | $red_0 \leq red_1, code(x,y) = 1$ |
| ${ 4 RED_0 + RED_1 } \over 5$ | $red_0 \leq red_1, code(x,y) = 2$ |
| ${ 3 RED_0 + 2 RED_1 } \over 5$ | $red_0 \leq red_1, code(x,y) = 3$ |
| ${ 2 RED_0 + 3 RED_1 } \over 5$ | $red_0 \leq red_1, code(x,y) = 4$ |
| ${ RED_0 + 4 RED_1 } \over 5$ | $red_0 \leq red_1, code(x,y) = 5$ |
| $RED_{min}$ | $red_0 \leq red_1, code(x,y) = 6$ |
| $RED_{max}$ | $red_0 \leq red_1, code(x,y) = 7$ |


$RED_{min}$ and $RED_{max}$ are 0.0 and 1.0 respectively.

Since the decoded texel has a red format, the resulting RGBA value for the texel is $(R,0,0,1)$ .

11.2. BC4 signed

Each $4 \times 4$ block of texels consists of 64 bits of signed red image data. The red values of a texel are extracted in the same way as BC4 unsigned except $red_0$ , $red_1$ , $RED_0$ , $RED_1$ , $RED_{min}$ , and $RED_{max}$ are signed values defined as follows:

\begin{align*} RED_0 & = \begin{cases} red_0 \over 127.0, & red_0 > -128 \\ -1.0, & red_0 = -128 \end{cases} \\ RED_1 & = \begin{cases} red_1 \over 127.0, & red_1 > -128 \\ -1.0, & red_1 = -128 \end{cases} \\ RED_{min} & = -1.0 \\ RED_{max} & = 1.0 \end{align*}

$red_0$ and $red_1$ are 8-bit signed (twos complement) integers.

CAVEAT for signed $red_0$ and $red_1$ values: the expressions $red_0 > red_1$ and $red_0 \leq red_1$ above are considered undefined (read: may vary by implementation) when $red_0 = -127$ and $red_1 = -128$. This is because if $red_1$ were remapped to -127 prior to the comparison to reduce the latency of a hardware decompressor, the two endpoints would compare equal and the expressions would reverse their logic. Encoders for the signed red-green formats should avoid encoding blocks where $red_0 = -127$ and $red_1 = -128$.
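The unsigned and signed BC4 decodes differ only in how the endpoints are interpreted, which the following sketch tries to make explicit. The names `to_signed8` and `bc4_decode_texel` and the `is_signed` flag are hypothetical. Dividing unsigned endpoints by 255 is an assumption consistent with $RED_{min} = 0.0$ and $RED_{max} = 1.0$, since the text above only names the unpacked values.

```c
#include <stdint.h>

/* Interpret a byte as an 8-bit two's complement value. */
int to_signed8(uint8_t raw)
{
    return raw >= 128 ? (int)raw - 256 : (int)raw;
}

/* Decode texel (x, y), each in 0..3, from an 8-byte BC4 block. */
float bc4_decode_texel(const uint8_t block[8], unsigned x, unsigned y, int is_signed)
{
    int red0 = is_signed ? to_signed8(block[0]) : (int)block[0];
    int red1 = is_signed ? to_signed8(block[1]) : (int)block[1];
    float v0, v1;   /* RED_0 and RED_1 */

    if (is_signed) {
        /* -128 maps to -1.0, the same value as -127. */
        v0 = (red0 == -128) ? -1.0f : (float)red0 / 127.0f;
        v1 = (red1 == -128) ? -1.0f : (float)red1 / 127.0f;
    } else {
        v0 = (float)red0 / 255.0f;  /* assumed unorm8 conversion */
        v1 = (float)red1 / 255.0f;
    }

    uint64_t bits = 0;
    for (int i = 7; i >= 2; --i)
        bits = (bits << 8) | block[i];  /* block[2] is bits_0, the low byte */
    unsigned code = (unsigned)(bits >> (3u * (4u * y + x))) & 7u;

    if (code == 0) return v0;
    if (code == 1) return v1;
    if (red0 > red1)
        return ((float)(8u - code) * v0 + (float)(code - 1u) * v1) / 7.0f;
    if (code == 6) return is_signed ? -1.0f : 0.0f;    /* RED_min */
    if (code == 7) return 1.0f;                         /* RED_max */
    return ((float)(6u - code) * v0 + (float)(code - 1u) * v1) / 5.0f;
}
```

The comparison is made on the raw endpoint values, so this sketch takes one particular side of the caveat above; a decoder that clamps -128 to -127 before comparing would behave differently for that one endpoint pair.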

11.3. BC5 unsigned

Each $4 \times 4$ block of texels consists of 64 bits of compressed unsigned red image data followed by 64 bits of compressed unsigned green image data.

The first 64 bits of compressed red are decoded exactly like BC4 unsigned above.

The second 64 bits of compressed green are decoded exactly like BC4 unsigned above except the decoded value $R$ for this second block is considered the resulting green value $G$ .

Since the decoded texel has a red-green format, the resulting RGBA value for the texel is $(R,G,0,1)$ .

11.4. BC5 signed

Each $4 \times 4$ block of texels consists of 64 bits of compressed signed red image data followed by 64 bits of compressed signed green image data.

The first 64 bits of compressed red are decoded exactly like BC4 signed above.

The second 64 bits of compressed green are decoded exactly like BC4 signed above except the decoded value $R$ for this second block is considered the resulting green value $G$ .

Since this image has a red-green format, the resulting RGBA value is $(R,G,0,1)$ .

12. BPTC Compressed Texture Image Formats

This description is derived from the “BPTC Compressed Texture Image Formats” section of the OpenGL 4.5 specification.

Compressed texture images stored using the BPTC compressed image formats are represented as a collection of $4 \times 4$ texel blocks, where each block contains 128 bits of texel data. The image is encoded as a normal 2D raster image in which each $4 \times 4$ block is treated as a single pixel. If a BPTC image has a width or height that is not a multiple of four, the data corresponding to texels outside the image are irrelevant and undefined.

When a BPTC image with a width of w, height of h, and block size of blocksize (16 bytes) is decoded, the corresponding image size (in bytes) is:

\begin{align*} \left\lceil { w \over 4 } \right\rceil \times \left\lceil { h \over 4 } \right\rceil \times blocksize \end{align*}

When decoding a BPTC image, the block containing the texel at offset $(x,y)$ begins at an offset (in bytes) relative to the base of the image of:

\begin{align*} blocksize \times \left( { \left\lceil { w \over 4 } \right\rceil \times \left\lfloor { y \over 4 } \right\rfloor + \left\lfloor { x \over 4 } \right\rfloor } \right) \end{align*}

The data corresponding to a specific texel $(x,y)$ are extracted from a $4 \times 4$ texel block using a relative $(x,y)$ value of:

\begin{align*} (x \bmod 4,y \bmod 4) \end{align*}

There are two distinct BPTC image formats, each of which has two variants. BC7, with or without an sRGB transfer function used in the encoding of the RGB channels, compresses 8-bit unsigned, normalized fixed-point data. BC6H, in signed or unsigned form, compresses high dynamic range floating-point values. The formats are similar, so the description of the BC6H format will reference significant sections of the BC7 description.

12.1. BC7

Each $4 \times 4$ block of texels consists of 128 bits of RGBA image data, of which the RGB components may be encoded linearly or with the sRGB transfer function.

Each block contains enough information to select and decode a pair of colors called endpoints, interpolate between those endpoints in a variety of ways, then remap the result into the final output.

Each block can contain data in one of eight modes. The mode is identified by the lowest bits of the lowest byte. It is encoded as zero or more zeros followed by a one. For example, using x to indicate a bit not included in the mode number, mode 0 is encoded as xxxxxxx1 in the low byte in binary, mode 5 is xx100000, and mode 7 is 10000000. Encoding the low byte as zero is reserved and should not be used when encoding a BPTC texture.
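A minimal sketch of the mode lookup: the mode number is the position of the lowest set bit in the first byte of the block. The function name `bc7_mode` and the use of -1 for the reserved encoding are choices made for this example only.

```c
#include <stdint.h>

/* Returns the BC7 mode (0..7), or -1 if the low byte is zero (reserved). */
int bc7_mode(const uint8_t block[16])
{
    uint8_t low = block[0];
    for (int mode = 0; mode < 8; ++mode)
        if (low & (1u << mode))
            return mode;    /* `mode` zeros followed by a one */
    return -1;
}
```

Bits above the terminating one belong to the fields that follow the mode, so a low byte of 11100000 in binary still selects mode 5.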

All further decoding is driven by the values derived from the mode listed in Table 36, “Mode-dependent BPTC parameters.” and Table 37, “The full descriptions of the BPTC mode columns are as follows”. The fields in the block are always in the same order for all modes. Starting at the lowest bit after the mode and going up, these fields are: partition number, rotation, index selection, color, alpha, per-endpoint P-bit, shared P-bit, primary indices, and secondary indices. The number of bits to be read in each field is determined directly from the table.

Each block can be divided into between 1 and 3 groups of pixels with independent compression parameters called subsets. A texel in a block with one subset is always considered to be in subset zero. Otherwise, a number determined by the number of partition bits is used to look up in Table 38, “Partition table for BPTC 2 subset, with the 4×4 block of values for each partition index value” or Table 39, “Partition table for BPTC 3 subset, with the 4×4 block of values for each partition index value” for 2 and 3 subsets respectively. This partitioning is indexed by the X and Y within the block to generate the subset index.

Each block has two colors for each subset, stored first by endpoint, then by subset, then by color. For example, a format with two subsets and five color bits would have five bits of red for endpoint 0 of the first subset, then five bits of red for endpoint 1, then the two ends of the second subset, then green and blue stored similarly. If a block has non-zero alpha bits, the alpha data follows the color data with the same organization. If not, alpha is overridden to 1.0. These bits are treated as the high bits of a fixed-point value in a byte. If the format has a shared P-bit, there are two bits for endpoints 0 and 1 from low to high. If the format has per-endpoint P-bits, then there are $2 \times subsets$ P-bits stored in the same order as color and alpha. Both kinds of P-bits are added as a bit below the color data stored in the byte. So, for a format with 5 red bits, the P-bit ends up in bit 2. For final scaling, the top bits of the value are replicated into any remaining bits in the byte. For the preceding example, bits 6 and 7 would be written to bits 0 and 1.
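The placement of the P-bit and the final bit replication can be sketched as below. `bc7_expand` is a hypothetical helper; passing a negative `pbit` to mean that the mode has no P-bit is a convention of this example, not of the format.

```c
#include <stdint.h>

/* Expand one endpoint component of `color_bits` bits to a full byte.
 * pbit is 0 or 1, or negative if the mode has no P-bit. */
uint8_t bc7_expand(unsigned value, unsigned color_bits, int pbit)
{
    unsigned bits = color_bits;
    if (pbit >= 0) {
        /* The P-bit becomes the bit directly below the color data. */
        value = (value << 1) | (unsigned)(pbit & 1);
        bits += 1;
    }
    value <<= (8u - bits);      /* color data occupies the high bits of the byte */
    value |= value >> bits;     /* replicate the top bits into the remaining low bits */
    return (uint8_t)value;
}
```

With 5 color bits and a P-bit, the value is shifted left by 2, so the P-bit lands in bit 2 and bits 7 and 6 are copied into bits 1 and 0, as in the example above. A single replication step suffices here because every mode supplies at least 5 bits per component.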

The endpoint colors are interpolated using index values stored in the block. The index bits are stored in x-major order. Each index has the number of bits indicated by the mode except for one special index per subset called the anchor index. Since the ordering of the endpoints is unimportant, we can save one bit on one index per subset by ordering the endpoints such that the highest bit is guaranteed to be zero. In partition zero, the anchor index is always index zero. In other partitions, the anchor index is specified by Table 40, “BPTC anchor index values for the second subset of two-subset partitioning. Values run right, then down.”, Table 41, “BPTC anchor index values for the second subset of three-subset partitioning. Values run right, then down.”, and Table 42, “BPTC anchor index values for the third subset of three-subset partitioning. Values run right, then down.”. If secondary index bits are present, they are read in the same manner. The anchor index information is only used to determine the number of bits each index has when it’s read from the block data.

The endpoint color and alpha values used for final interpolation are the decoded values corresponding to the applicable subset as selected above. The index value for interpolating color comes from the secondary index for the texel if the format has an index selection bit and its value is one, and from the primary index otherwise. The alpha index comes from the secondary index if the block has a secondary index and either has no index selection bit or has that bit set to zero, and from the primary index otherwise.

Interpolation is always performed using a 6-bit interpolation factor. The effective interpolation factors for 2-, 3-, and 4-bit indices are given in Table 34, “BPTC interpolation factors”.

Table 34. BPTC interpolation factors

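| Index bits | Interpolation factors, in order of increasing index value |
|---|---|
| 2 | 0, 21, 43, 64 |
| 3 | 0, 9, 18, 27, 37, 46, 55, 64 |
| 4 | 0, 4, 9, 13, 17, 21, 26, 30, 34, 38, 43, 47, 51, 55, 60, 64 |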


The interpolation results in an RGBA color. If rotation bits are present, this color is remapped according to Table 35, “BPTC Rotation bits”.

Table 35. BPTC Rotation bits

| Rotation bits | Remapping |
|---|---|
| 0 | no change |
| 1 | swap(a,r) |
| 2 | swap(a,g) |
| 3 | swap(a,b) |


These 8-bit values should be interpreted as RGBA 8-bit normalized channels, either linearly encoded or with the sRGB transfer function.
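The interpolation and rotation steps might be implemented as in the sketch below. The text above gives only the interpolation factors; the rounding expression `((64 - f) * e0 + f * e1 + 32) >> 6` is an assumption of this sketch, taken from the usual formulation of BPTC decoding, and the function names are illustrative.

```c
#include <stdint.h>

/* Interpolation factors for 2-, 3- and 4-bit indices. */
static const uint8_t weights2[4]  = { 0, 21, 43, 64 };
static const uint8_t weights3[8]  = { 0, 9, 18, 27, 37, 46, 55, 64 };
static const uint8_t weights4[16] = { 0, 4, 9, 13, 17, 21, 26, 30,
                                      34, 38, 43, 47, 51, 55, 60, 64 };

/* Interpolate one 8-bit channel between endpoints e0 and e1.
 * index_bits must be 2, 3 or 4. */
uint8_t bptc_interpolate(uint8_t e0, uint8_t e1, unsigned index, unsigned index_bits)
{
    const uint8_t *w = (index_bits == 2) ? weights2
                     : (index_bits == 3) ? weights3 : weights4;
    unsigned f = w[index];
    /* Assumed rounding: add half of 64 before the 6-bit shift. */
    return (uint8_t)(((64u - f) * e0 + f * e1 + 32u) >> 6);
}

/* Apply the rotation field to an {r, g, b, a} texel in place. */
void bptc_rotate(uint8_t rgba[4], unsigned rotation)
{
    if (rotation >= 1 && rotation <= 3) {
        uint8_t t = rgba[3];            /* alpha */
        rgba[3] = rgba[rotation - 1];   /* 1: r, 2: g, 3: b */
        rgba[rotation - 1] = t;
    }
}
```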

Table 36. Mode-dependent BPTC parameters.

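| Mode | NS | PB | RB | ISB | CB | AB | EPB | SPB | IB | IB2 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 4 | 0 | 0 | 4 | 0 | 1 | 0 | 3 | 0 |
| 1 | 2 | 6 | 0 | 0 | 6 | 0 | 0 | 1 | 3 | 0 |
| 2 | 3 | 6 | 0 | 0 | 5 | 0 | 0 | 0 | 2 | 0 |
| 3 | 2 | 6 | 0 | 0 | 7 | 0 | 1 | 0 | 2 | 0 |
| 4 | 1 | 0 | 2 | 1 | 5 | 6 | 0 | 0 | 2 | 3 |
| 5 | 1 | 0 | 2 | 0 | 7 | 8 | 0 | 0 | 2 | 2 |
| 6 | 1 | 0 | 0 | 0 | 7 | 7 | 1 | 0 | 4 | 0 |
| 7 | 2 | 6 | 0 | 0 | 5 | 5 | 1 | 0 | 2 | 0 |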


Table 37. The full descriptions of the BPTC mode columns are as follows

| Column | Description |
|---|---|
| Mode | As described previously |
| NS | Number of subsets in each partition |
| PB | Partition bits |
| RB | Rotation bits |
| ISB | Index selection bits |
| CB | Color bits |
| AB | Alpha bits |
| EPB | Endpoint P-bits |
| SPB | Shared P-bits |
| IB | Index bits per element |
| IB2 | Secondary index bits per element |


Table 38. Partition table for BPTC 2 subset, with the 4×4 block of values for each partition index value

0

1

2

3

4

5

6

7

0

0

1

1

0

0

0

1

0

1

1

1

0

0

0

1

0

0

0

0

0

0

1

1

0

0

0

1

0

0

0

0

0

0

1

1

0

0

0

1

0

1

1

1

0

0

1

1

0

0

0

1

0

1

1

1

0

0

1

1

0

0

0

1

0

0

1

1

0

0

0

1

0

1

1

1

0

0

1

1

0

0

0

1

0

1

1

1

0

1

1

1

0

0

1

1

0

0

1

1

0

0

0

1

0

1

1

1

0

1

1

1

0

0

1

1

1

1

1

1

1

1

1

1

0

1

1

1

8

9

10

11

12

13

14

15

0

0

0

0

0

0

1

1

0

0

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

1

1

0

0

0

1

0

0

0

0

0

1

1

1

0

0

0

0

1

1

1

1

0

0

0

0

0

0

0

1

1

1

1

1

0

1

1

1

0

0

0

1

1

1

1

1

1

1

1

1

1

1

1

1

0

0

0

0

0

0

1

1

1

1

1

1

1

1

1

1

0

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

16

17

18

19

20

21

22

23

0

0

0

0

0

1

1

1

0

0

0

0

0

1

1

1

0

0

1

1

0

0

0

0

0

0

0

0

0

1

1

1

1

0

0

0

0

0

0

1

0

0

0

0

0

0

1

1

0

0

0

1

1

0

0

0

0

0

0

0

0

0

1

1

1

1

1

0

0

0

0

0

1

0

0

0

0

0

0

1

0

0

0

0

1

1

0

0

1

0

0

0

0

0

1

1

1

1

1

1

0

0

0

0

1

1

1

0

0

0

0

0

0

0

0

0

1

1

1

0

1

1

0

0

0

0

0

1

24

25

26

27

28

29

30

31

0

0

1

1

0

0

0

0

0

1

1

0

0

0

1

1

0

0

0

1

0

0

0

0

0

1

1

1

0

0

1

1

0

0

0

1

1

0

0

0

0

1

1

0

0

1

1

0

0

1

1

1

1

1

1

1

0

0

0

1

1

0

0

1

0

0

0

1

1

0

0

0

0

1

1

0

0

1

1

0

1

1

1

0

1

1

1

1

1

0

0

0

1

0

0

1

0

0

0

0

1

1

0

0

0

1

1

0

1

1

0

0

1

0

0

0

0

0

0

0

1

1

1

0

1

1

0

0

32

33

34

35

36

37

38

39

0

1

0

1

0

0

0

0

0

1

0

1

0

0

1

1

0

0

1

1

0

1

0

1

0

1

1

0

0

1

0

1

0

1

0

1

1

1

1

1

1

0

1

0

0

0

1

1

1

1

0

0

0

1

0

1

1

0

0

1

1

0

1

0

0

1

0

1

0

0

0

0

0

1

0

1

1

1

0

0

0

0

1

1

1

0

1

0

0

1

1

0

1

0

1

0

0

1

0

1

1

1

1

1

1

0

1

0

1

1

0

0

1

1

0

0

1

0

1

0

1

0

0

1

0

1

0

1

40

41

42

43

44

45

46

47

0

1

1

1

0

0

0

1

0

0

1

1

0

0

1

1

0

1

1

0

0

0

1

1

0

1

1

0

0

0

0

0

0

0

1

1

0

0

1

1

0

0

1

0

1

0

1

1

1

0

0

1

1

1

0

0

0

1

1

0

0

1

1

0

1

1

0

0

1

1

0

0

0

1

0

0

1

1

0

1

1

0

0

1

1

1

0

0

1

0

0

1

0

1

1

0

1

1

1

0

1

0

0

0

1

1

0

0

1

1

0

0

0

1

1

0

0

0

1

1

1

0

0

1

0

0

0

0

48

49

50

51

52

53

54

55

0

1

0

0

0

0

1

0

0

0

0

0

0

0

0

0

0

1

1

0

0

0

1

1

0

1

1

0

0

0

1

1

1

1

1

0

0

1

1

1

0

0

1

0

0

1

0

0

1

1

0

0

0

1

1

0

0

0

1

1

1

0

0

1

0

1

0

0

0

0

1

0

0

1

1

1

1

1

1

0

1

0

0

1

1

1

0

0

1

0

0

1

1

1

0

0

0

0

0

0

0

0

0

0

0

0

1

0

0

1

0

0

0

0

1

1

1

0

0

1

1

1

0

0

0

1

1

0

56

57

58

59

60

61

62

63

0

1

1

0

0

1

1

0

0

1

1

1

0

0

0

1

0

0

0

0

0

0

1

1

0

0

1

0

0

1

0

0

1

1

0

0

0

0

1

1

1

1

1

0

1

0

0

0

1

1

1

1

0

0

1

1

0

0

1

0

0

1

0

0

1

1

0

0

0

0

1

1

1

0

0

0

1

1

1

0

0

0

1

1

1

1

1

1

1

1

1

0

0

1

1

1

1

0

0

1

1

0

0

1

0

0

0

1

0

1

1

1

0

0

1

1

0

0

0

0

1

1

1

0

0

1

1

1


Table 39. Partition table for BPTC 3 subset, with the 4×4 block of values for each partition index value

0

1

2

3

4

5

6

7

0

0

1

1

0

0

0

1

0

0

0

0

0

2

2

2

0

0

0

0

0

0

1

1

0

0

2

2

0

0

1

1

0

0

1

1

0

0

1

1

2

0

0

1

0

0

2

2

0

0

0

0

0

0

1

1

0

0

2

2

0

0

1

1

0

2

2

1

2

2

1

1

2

2

1

1

0

0

1

1

1

1

2

2

0

0

2

2

1

1

1

1

2

2

1

1

2

2

2

2

2

2

2

1

2

2

1

1

0

1

1

1

1

1

2

2

0

0

2

2

1

1

1

1

2

2

1

1

8

9

10

11

12

13

14

15

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

2

0

1

1

2

0

1

2

2

0

0

1

1

0

0

1

1

0

0

0

0

1

1

1

1

1

1

1

1

0

0

1

2

0

1

1

2

0

1

2

2

0

1

1

2

2

0

0

1

1

1

1

1

1

1

1

1

2

2

2

2

0

0

1

2

0

1

1

2

0

1

2

2

1

1

2

2

2

2

0

0

2

2

2

2

2

2

2

2

2

2

2

2

0

0

1

2

0

1

1

2

0

1

2

2

1

2

2

2

2

2

2

0

16

17

18

19

20

21

22

23

0

0

0

1

0

1

1

1

0

0

0

0

0

0

2

2

0

1

1

1

0

0

0

1

0

0

0

0

0

0

0

0

0

0

1

1

0

0

1

1

1

1

2

2

0

0

2

2

0

1

1

1

0

0

0

1

0

0

1

1

1

1

0

0

0

1

1

2

2

0

0

1

1

1

2

2

0

0

2

2

0

2

2

2

2

2

2

1

0

1

2

2

2

2

1

0

1

1

2

2

2

2

0

0

1

1

2

2

1

1

1

1

0

2

2

2

2

2

2

1

0

1

2

2

2

2

1

0

24

25

26

27

28

29

30

31

0

1

2

2

0

0

1

2

0

1

1

0

0

0

0

0

0

0

2

2

0

1

1

0

0

0

1

1

0

0

0

0

0

1

2

2

0

0

1

2

1

2

2

1

0

1

1

0

1

1

0

2

0

1

1

0

0

1

2

2

2

0

0

0

0

0

1

1

1

1

2

2

1

2

2

1

1

2

2

1

1

1

0

2

2

0

0

2

0

1

2

2

2

2

1

1

0

0

0

0

2

2

2

2

0

1

1

0

1

2

2

1

0

0

2

2

2

2

2

2

0

0

1

1

2

2

2

1

32

33

34

35

36

37

38

39

0

0

0

0

0

2

2

2

0

0

1

1

0

1

2

0

0

0

0

0

0

1

2

0

0

1

2

0

0

0

1

1

0

0

0

2

0

0

2

2

0

0

1

2

0

1

2

0

1

1

1

1

1

2

0

1

2

0

1

2

2

2

0

0

1

1

2

2

0

0

1

2

0

0

2

2

0

1

2

0

2

2

2

2

2

0

1

2

1

2

0

1

1

1

2

2

1

2

2

2

0

0

1

1

0

2

2

2

0

1

2

0

0

0

0

0

0

1

2

0

0

1

2

0

0

0

1

1

40

41

42

43

44

45

46

47

0

0

1

1

0

1

0

1

0

0

0

0

0

0

2

2

0

0

2

2

0

2

2

0

0

1

0

1

0

0

0

0

1

1

2

2

0

1

0

1

0

0

0

0

1

1

2

2

0

0

1

1

1

2

2

1

2

2

2

2

2

1

2

1

2

2

0

0

2

2

2

2

2

1

2

1

0

0

2

2

0

0

2

2

0

2

2

0

2

2

2

2

2

1

2

1

0

0

1

1

2

2

2

2

2

1

2

1

1

1

2

2

0

0

1

1

1

2

2

1

0

1

0

1

2

1

2

1

48

49

50

51

52

53

54

55

0

1

0

1

0

2

2

2

0

0

0

2

0

0

0

0

0

2

2

2

0

0

0

2

0

1

1

0

0

0

0

0

0

1

0

1

0

1

1

1

1

1

1

2

2

1

1

2

0

1

1

1

1

1

1

2

0

1

1

0

0

0

0

0

0

1

0

1

0

2

2

2

0

0

0

2

2

1

1

2

0

1

1

1

1

1

1

2

0

1

1

0

2

1

1

2

2

2

2

2

0

1

1

1

1

1

1

2

2

1

1

2

0

2

2

2

0

0

0

2

2

2

2

2

2

1

1

2

56

57

58

59

60

61

62

63

0

1

1

0

0

0

2

2

0

0

2

2

0

0

0

0

0

0

0

2

0

2

2

2

0

1

0

1

0

1

1

1

0

1

1

0

0

0

1

1

1

1

2

2

0

0

0

0

0

0

0

1

1

2

2

2

2

2

2

2

2

0

1

1

2

2

2

2

0

0

1

1

1

1

2

2

0

0

0

0

0

0

0

2

0

2

2

2

2

2

2

2

2

2

0

1

2

2

2

2

0

0

2

2

0

0

2

2

2

1

1

2

0

0

0

1

1

2

2

2

2

2

2

2

2

2

2

0


Table 40. BPTC anchor index values for the second subset of two-subset partitioning. Values run right, then down.

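| Partition index | +0 | +1 | +2 | +3 | +4 | +5 | +6 | +7 |
|---|---|---|---|---|---|---|---|---|
| 0 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 |
| 8 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 |
| 16 | 15 | 2 | 8 | 2 | 2 | 8 | 8 | 15 |
| 24 | 2 | 8 | 2 | 2 | 8 | 8 | 2 | 2 |
| 32 | 15 | 15 | 6 | 8 | 2 | 8 | 15 | 15 |
| 40 | 2 | 8 | 2 | 2 | 2 | 15 | 15 | 6 |
| 48 | 6 | 2 | 6 | 8 | 15 | 15 | 2 | 2 |
| 56 | 15 | 15 | 15 | 15 | 15 | 2 | 2 | 15 |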


Table 41. BPTC anchor index values for the second subset of three-subset partitioning. Values run right, then down.

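| Partition index | +0 | +1 | +2 | +3 | +4 | +5 | +6 | +7 |
|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 3 | 15 | 15 | 8 | 3 | 15 | 15 |
| 8 | 8 | 8 | 6 | 6 | 6 | 5 | 3 | 3 |
| 16 | 3 | 3 | 8 | 15 | 3 | 3 | 6 | 10 |
| 24 | 5 | 8 | 8 | 6 | 8 | 5 | 15 | 15 |
| 32 | 8 | 15 | 3 | 5 | 6 | 10 | 8 | 15 |
| 40 | 15 | 3 | 15 | 5 | 15 | 15 | 15 | 15 |
| 48 | 3 | 15 | 5 | 5 | 5 | 8 | 5 | 10 |
| 56 | 5 | 10 | 8 | 13 | 15 | 12 | 3 | 3 |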


Table 42. BPTC anchor index values for the third subset of three-subset partitioning. Values run right, then down.

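| Partition index | +0 | +1 | +2 | +3 | +4 | +5 | +6 | +7 |
|---|---|---|---|---|---|---|---|---|
| 0 | 15 | 8 | 8 | 3 | 15 | 15 | 3 | 8 |
| 8 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 8 |
| 16 | 15 | 8 | 15 | 3 | 15 | 8 | 15 | 8 |
| 24 | 3 | 15 | 6 | 10 | 15 | 15 | 10 | 8 |
| 32 | 15 | 3 | 15 | 10 | 10 | 8 | 9 | 10 |
| 40 | 6 | 15 | 8 | 15 | 3 | 6 | 6 | 8 |
| 48 | 15 | 3 | 15 | 15 | 15 | 15 | 15 | 15 |
| 56 | 15 | 15 | 15 | 15 | 3 | 15 | 15 | 8 |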


12.2. BC6H

Each $4 \times 4$ block of texels consists of 128 bits of RGB data. These formats are very similar and will be described together. In the description and pseudocode below, signed will be used as a condition which is true for the signed version of the format and false for the unsigned version of the format. Both formats only contain RGB data, so the returned alpha value is 1.0. If a block uses a reserved or invalid encoding, the return value is $(0,0,0,1)$ .

Each block can contain data in one of 14 modes. The mode number is encoded in either the low two bits or the low five bits. If the low two bits are less than two, that is the mode number; otherwise the low five bits are the mode number. Mode numbers not listed in Table 43, “Endpoint and partition parameters for BPTC block modes” are reserved (19, 23, 27, and 31).
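The mode selection rule can be sketched as follows. The function name `bc6h_mode` and the use of -1 to signal a reserved mode are assumptions made for this example.

```c
#include <stdint.h>

/* Returns the BC6H mode number, or -1 for a reserved mode. */
int bc6h_mode(const uint8_t block[16])
{
    unsigned low2 = block[0] & 3u;
    unsigned mode = (low2 < 2u) ? low2 : (block[0] & 31u);

    /* Five-bit mode numbers that are not listed in the mode table. */
    if (mode == 19u || mode == 23u || mode == 27u || mode == 31u)
        return -1;
    return (int)mode;
}
```

A decoder would return $(0,0,0,1)$ for every texel of a block for which this sketch reports -1.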

The data for the compressed blocks is stored in a different format for each mode. The formats are specified in Table 44, “Block formats for BC6H block modes”. The format strings are intended to be read from left to right with the LSB on the left. Each element is of the form $v[a \colon b]$. If $a \ge b$, this indicates extracting $a-b+1$ bits from the block at that location and putting them in the corresponding bits of the variable $v$. If $a < b$, then the bits are reversed. $v[a]$ is used as a shorthand for the one bit $v[a \colon a]$. As an example, $m[1 \colon 0], g2[4]$ would move the low two bits from the block into the low two bits of $m$, then the next bit of the block into bit 4 of $g2$. The variable names given in the table will be referred to in the language below.
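One possible way to apply a single format element is sketched below. The helpers `next_bit` and `read_field` are hypothetical. The sketch interprets $v[a \colon b]$ as filling $v$ from bit $b$ towards bit $a$ with successive bits of the block, which reproduces the example above and stores the bits in reverse order when $a < b$; that reading of the reversed case is an assumption.

```c
#include <stdint.h>

/* Read the next bit of a 16-byte block; bit 0 is the LSB of byte 0. */
unsigned next_bit(const uint8_t block[16], unsigned *pos)
{
    unsigned bit = (block[*pos >> 3] >> (*pos & 7u)) & 1u;
    ++*pos;
    return bit;
}

/* Apply one element v[a:b]. Successive block bits are written to v starting
 * at bit b and moving towards bit a, so a < b stores them in reverse order. */
void read_field(const uint8_t block[16], unsigned *pos, unsigned *v, int a, int b)
{
    int step = (a >= b) ? 1 : -1;
    for (int i = b; ; i += step) {
        *v |= next_bit(block, pos) << i;
        if (i == a)
            break;
    }
}
```

Reading $m[1 \colon 0], g2[4]$ then amounts to `read_field(block, &pos, &m, 1, 0)` followed by `read_field(block, &pos, &g2, 4, 4)`, starting with `pos`, `m` and `g2` all zero.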

Subsets and indices work in much the same way as described for the fixed-point formats above. If a float block has no partition bits, then it is a single-subset block. If it has partition bits, then it is a 2 subset block. The partition index references the first half of Table 38, “Partition table for BPTC 2 subset, with the 4×4 block of values for each partition index value”. Indices are read in the same way as the fixed-point formats including obeying the anchor values for index 0 and as needed by Table 40, “BPTC anchor index values for the second subset of two-subset partitioning. Values run right, then down.”.

In a single-subset block, the two endpoints are contained in $r_0,g_0,b_0$ (hence $e_0$) and $r_1,g_1,b_1$ (hence $e_1$). In a two-subset block, the endpoints for the second subset are in $r_2,g_2,b_2$ and $r_3,g_3,b_3$. The value in $e_0$ is sign-extended if the format of the texture is signed. The values in $e_1$ (and $e_2$ and $e_3$ if the block is two-subset) are sign-extended if the format of the texture is signed or if the block mode has transformed endpoints. If the mode has transformed endpoints, the values from $e_0$ are used as a base to offset all other endpoints, wrapped at the number of endpoint bits. For example, $r_1 = (r_0+r_1) \mathbin{\&} ((1 \ll EPB)-1)$.
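The sign extension and endpoint transform might look like this in C. Both function names are hypothetical, and the sketch assumes the delta has already been sign-extended from its stored width before being added to the base.

```c
/* Sign-extend the low `bits` bits of value (which must fit in `bits` bits). */
int sign_extend(int value, unsigned bits)
{
    int sign = 1 << (bits - 1);
    return (value ^ sign) - sign;
}

/* Offset a transformed endpoint by the base e0 component, wrapping at
 * epb bits: (base + delta) & ((1 << epb) - 1). */
int apply_transform(int base, int delta, unsigned epb)
{
    /* Converting to unsigned first keeps the masking well defined when
     * base + delta is negative. */
    return (int)((unsigned)(base + delta) & ((1u << epb) - 1u));
}
```

For instance, a 5-bit stored delta of 31 sign-extends to -1, and applying it to a base of 0 with 10 endpoint bits wraps to 1023.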

Next, the endpoints are unquantized to maximize the usage of the bits and to ensure that the negative ranges are oriented properly to interpolate as a two’s complement value. The following pseudocode assumes the computation is being done using sufficiently large intermediate values to avoid overflow. For the unsigned float format, we unquantize a value $x$ to $unq$ by:

if (EPB >= 15)
    unq = x;
else if (x == 0)
    unq = 0;
else if (x == ((1 << EPB)-1))
    unq = 0xFFFF;
else
    unq = ((x << 15) + 0x4000) >> (EPB-1);

The signed float unquantization is similar, but needs to worry about orienting the negative range:

s = 0;
if (EPB >= 16) {
    unq = x;
} else {
    if (x < 0) {
        s = 1;
        x = -x;
    }

    if (x == 0)
        unq = 0;
    else if (x >= ((1 << (EPB-1))-1))
        unq = 0x7FFF;
    else
        unq = ((x << 15) + 0x4000) >> (EPB-1);

    if (s)
        unq = -unq;
}

After the endpoints are unquantized, interpolation proceeds as in the fixed-point formats above including the interpolation weight table.

The interpolated values are passed through a final unquantization step. For the unsigned format, this step simply multiplies by $31 \over 64$ . The signed format negates negative components, multiplies by $31 \over 32$ , then ORs in the sign bit if the original value was negative.
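A minimal sketch of this final step is shown below. It assumes truncating integer arithmetic (the rounding behaviour is not stated in the text above) and that the result is the bit pattern of the 16-bit half float; the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned format: scale the interpolated value by 31/64. */
static uint16_t finish_unsigned(int32_t v)
{
    assert(v >= 0 && v <= 0xFFFF);
    return (uint16_t)((v * 31) >> 6);
}

/* Signed format: scale the magnitude by 31/32, then OR in the sign bit
 * if the original value was negative. */
static uint16_t finish_signed(int32_t v)
{
    assert(v >= -0x7FFF && v <= 0x7FFF);
    if (v < 0)
        return (uint16_t)((((-v) * 31) >> 5) | 0x8000);
    return (uint16_t)((v * 31) >> 5);
}
```

Under these assumptions the largest unquantized inputs, 0xFFFF and 0x7FFF, both map to 0x7BFF, the largest finite half-float bit pattern, so the scaling keeps results out of the infinity and NaN encodings.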

The resultant value should be a legal 16-bit half float.

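Table 43. Endpoint and partition parameters for BPTC block modes

| Mode Number | Transformed Endpoints | Partition Bits (PB) | Endpoint Bits (EPB) | Delta Bits |
| --- | --- | --- | --- | --- |
| 0 | 1 | 5 | 10 | {5, 5, 5} |
| 1 | 1 | 5 | 7 | {6, 6, 6} |
| 2 | 1 | 5 | 11 | {5, 4, 4} |
| 6 | 1 | 5 | 11 | {4, 5, 4} |
| 10 | 1 | 5 | 11 | {4, 4, 5} |
| 14 | 1 | 5 | 9 | {5, 5, 5} |
| 18 | 1 | 5 | 8 | {6, 5, 5} |
| 22 | 1 | 5 | 8 | {5, 6, 5} |
| 26 | 1 | 5 | 8 | {5, 5, 6} |
| 30 | 0 | 5 | 6 | {6, 6, 6} |
| 3 | 0 | 0 | 10 | {10, 10, 10} |
| 7 | 1 | 0 | 11 | {9, 9, 9} |
| 11 | 1 | 0 | 12 | {8, 8, 8} |
| 15 | 1 | 0 | 16 | {4, 4, 4} |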


Table 44. Block formats for BC6H block modes

| Mode Number | Block Format |
| --- | --- |
| 0 | m[1:0], g2[4], b2[4], b3[4], r0[9:0], g0[9:0], b0[9:0], r1[4:0], g3[4], g2[3:0], g1[4:0], b3[0], g3[3:0], b1[4:0], b3[1], b2[3:0], r2[4:0], b3[2], r3[4:0], b3[3] |
| 1 | m[1:0], g2[5], g3[4], g3[5], r0[6:0], b3[0], b3[1], b2[4], g0[6:0], b2[5], b3[2], g2[4], b0[6:0], b3[3], b3[5], b3[4], r1[5:0], g2[3:0], g1[5:0], g3[3:0], b1[5:0], b2[3:0], r2[5:0], r3[5:0] |
| 2 | m[4:0], r0[9:0], g0[9:0], b0[9:0], r1[4:0], r0[10], g2[3:0], g1[3:0], g0[10], b3[0], g3[3:0], b1[3:0], b0[10], b3[1], b2[3:0], r2[4:0], b3[2], r3[4:0], b3[3] |
| 6 | m[4:0], r0[9:0], g0[9:0], b0[9:0], r1[3:0], r0[10], g3[4], g2[3:0], g1[4:0], g0[10], g3[3:0], b1[3:0], b0[10], b3[1], b2[3:0], r2[3:0], b3[0], b3[2], r3[3:0], g2[4], b3[3] |
| 10 | m[4:0], r0[9:0], g0[9:0], b0[9:0], r1[3:0], r0[10], b2[4], g2[3:0], g1[3:0], g0[10], b3[0], g3[3:0], b1[4:0], b0[10], b2[3:0], r2[3:0], b3[1], b3[2], r3[3:0], b3[4], b3[3] |
| 14 | m[4:0], r0[8:0], b2[4], g0[8:0], g2[4], b0[8:0], b3[4], r1[4:0], g3[4], g2[3:0], g1[4:0], b3[0], g3[3:0], b1[4:0], b3[1], b2[3:0], r2[4:0], b3[2], r3[4:0], b3[3] |
| 18 | m[4:0], r0[7:0], g3[4], b2[4], g0[7:0], b3[2], g2[4], b0[7:0], b3[3], b3[4], r1[5:0], g2[3:0], g1[4:0], b3[0], g3[3:0], b1[4:0], b3[1], b2[3:0], r2[5:0], r3[5:0] |
| 22 | m[4:0], r0[7:0], b3[0], b2[4], g0[7:0], g2[5], g2[4], b0[7:0], g3[5], b3[4], r1[4:0], g3[4], g2[3:0], g1[5:0], g3[3:0], b1[4:0], b3[1], b2[3:0], r2[4:0], b3[2], r3[4:0], b3[3] |
| 26 | m[4:0], r0[7:0], b3[1], b2[4], g0[7:0], b2[5], g2[4], b0[7:0], b3[5], b3[4], r1[4:0], g3[4], g2[3:0], g1[4:0], b3[0], g3[3:0], b1[5:0], b2[3:0], r2[4:0], b3[2], r3[4:0], b3[3] |
| 30 | m[4:0], r0[5:0], g3[4], b3[0], b3[1], b2[4], g0[5:0], g2[5], b2[5], b3[2], g2[4], b0[5:0], g3[5], b3[3], b3[5], b3[4], r1[5:0], g2[3:0], g1[5:0], g3[3:0], b1[5:0], b2[3:0], r2[5:0], r3[5:0] |
| 3 | m[4:0], r0[9:0], g0[9:0], b0[9:0], r1[9:0], g1[9:0], b1[9:0] |
| 7 | m[4:0], r0[9:0], g0[9:0], b0[9:0], r1[8:0], r0[10], g1[8:0], g0[10], b1[8:0], b0[10] |
| 11 | m[4:0], r0[9:0], g0[9:0], b0[9:0], r1[7:0], r0[10:11], g1[7:0], g0[10:11], b1[7:0], b0[10:11] |
| 15 | m[4:0], r0[9:0], g0[9:0], b0[9:0], r1[3:0], r0[10:15], g1[3:0], g0[10:15], b1[3:0], b0[10:15] |


13. ETC1 Compressed Texture Image Formats

This description is derived from the OES_compressed_ETC1_RGB8_texture OpenGL extension.

The texture is described as a number of $4\times 4$ pixel blocks. If the texture (or a particular mip-level) is smaller than 4 pixels in any dimension (such as a $2\times 2$ or an $8\times 1$ texture), the texture is found in the upper left part of the block(s), and the rest of the pixels are not used. For instance, a texture of size $4\times 2$ will be placed in the upper half of a $4\times 4$ block, and the lower half of the pixels in the block will not be accessed.

Pixel $a_1$ (see Table 48, “Pixel layout for an 8×8 texture using four ETC1 compressed blocks. Note how pixel a2 in the second block is adjacent to pixel m1 in the first block.”) of the first block in memory will represent the texture coordinate $(u=0, v=0)$. Pixel $a_2$ in the second block in memory will be adjacent to pixel $m_1$ in the first block, and so on until the width of the texture. Then pixel $a_3$ in the following block (third block in memory for an $8\times 8$ texture) will be adjacent to pixel $d_1$ in the first block, and so on until the height of the texture.

The data storage for an $8\times 8$ texture using the first, second, third and fourth block, if stored in that order in memory, would have the texels encoded in the same order as a simple linear format, as if the bytes describing the pixels came in the following memory order: $a_1$ $e_1$ $i_1$ $m_1$ $a_2$ $e_2$ $i_2$ $m_2$ $b_1$ $f_1$ $j_1$ $n_1$ $b_2$ $f_2$ $j_2$ $n_2$ $c_1$ $g_1$ $k_1$ $o_1$ $c_2$ $g_2$ $k_2$ $o_2$ $d_1$ $h_1$ $l_1$ $p_1$ $d_2$ $h_2$ $l_2$ $p_2$ $a_3$ $e_3$ $i_3$ $m_3$ $a_4$ $e_4$ $i_4$ $m_4$ $b_3$ $f_3$ $j_3$ $n_3$ $b_4$ $f_4$ $j_4$ $n_4$ $c_3$ $g_3$ $k_3$ $o_3$ $c_4$ $g_4$ $k_4$ $o_4$ $d_3$ $h_3$ $l_3$ $p_3$ $d_4$ $h_4$ $l_4$ $p_4$.

The number of bits that represent a $4\times 4$ texel block is 64 bits.

The data for a block is stored as a number of bytes, $\{q_0, q_1, q_2, q_3, q_4, q_5, q_6, q_7\}$ , where byte $q_0$ is located at the lowest memory address and $q_7$ at the highest. The 64 bits specifying the block are then represented by the following 64 bit integer:

\begin{align*} int64bit & = 256\times(256\times(256\times(256\times(256\times(256\times(256\times q_0+q_1)+q_2)+q_3)+q_4)+q_5)+q_6)+q_7 \end{align*}
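The nested multiplication above is a big-endian read of the eight bytes. As a minimal sketch (the function names are illustrative and not part of any API), the word can be assembled and individual fields extracted using the bit numbering of the layout tables:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine q[0]..q[7] so that q[0] becomes the most significant byte. */
static uint64_t etc_block_to_u64(const uint8_t *q)
{
    assert(q != NULL);
    uint64_t word = 0;
    for (int i = 0; i < 8; i++)
        word = (word << 8) | q[i];
    return word;
}

/* Extract bits hi..lo (inclusive), where bit 63 is the top bit of q[0]. */
static unsigned etc_bits(uint64_t word, int hi, int lo)
{
    assert(hi >= lo && hi <= 63 && lo >= 0 && hi - lo < 32);
    return (unsigned)((word >> lo) & ((1ull << (hi - lo + 1)) - 1ull));
}
```

With this numbering, bits 39 to 32 all come from byte $q_3$, and bits 31 to 0 come from bytes $q_4$ to $q_7$.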

Each 64-bit word contains information about a $4\times 4$ pixel block as shown in Table 49, “Pixel layout for an ETC1 compressed block”. There are two modes in ETC1: the ‘individual’ mode and the ‘differential’ mode. Which mode is active for a particular $4\times 4$ block is controlled by bit 33, which we call ‘diffbit’. If diffbit = 0, the ‘individual’ mode is chosen, and if diffbit = 1, then the ‘differential’ mode is chosen. The bit layout is different for the two modes: the bit layout for the individual mode is shown in Table 45, “Texel Data format for ETC1 compressed textures” part a and part c, and the bit layout for the differential mode is shown in Table 45, “Texel Data format for ETC1 compressed textures” part b and part c.

In both modes, the $4\times 4$ block is divided into two subblocks of either size $2\times 4$ or $4\times 2$ . This is controlled by bit 32, which we call ‘flipbit’. If flipbit=0, the block is divided into two $2\times 4$ subblocks side-by-side, as shown in Table 50, “Two 2×4-pixel ETC1 subblocks side-by-side”. If flipbit=1, the block is divided into two $4\times 2$ subblocks on top of each other, as shown in Table 51, “Two 4×2-pixel ETC1 subblocks on top of each other”.

In both individual and differential mode, a ‘base color’ for each subblock is stored, but the way they are stored is different in the two modes:

In the ‘individual’ mode (diffbit = 0), the base color for subblock 1 is derived from the codewords R1 (bits 63-60), G1 (bits 55-52) and B1 (bits 47-44), see section a of Table 45, “Texel Data format for ETC1 compressed textures”. These four-bit values are extended to RGB888 by replicating the four higher order bits in the four lower order bits. For instance, if R1 = 14 = 1110b, G1 = 3 = 0011b and B1 = 8 = 1000b, then the red component of the base color of subblock 1 becomes 11101110b = 238, and the green and blue components become 00110011b = 51 and 10001000b = 136. The base color for subblock 2 is decoded the same way, but using the 4-bit codewords R2 (bits 59-56), G2 (bits 51-48) and B2 (bits 43-40) instead. In summary, the base colors for the subblocks in the individual mode are:

\begin{align*} base\ col\ subblock1 & = extend\_4to8bits(R1, G1, B1) \\ base\ col\ subblock2 & = extend\_4to8bits(R2, G2, B2) \end{align*}

In the ‘differential’ mode (diffbit = 1), the base color for subblock 1 is derived from the five-bit codewords $R1'$, $G1'$ and $B1'$. These five-bit codewords are extended to eight bits by replicating the three highest order bits to the three lowest order bits. For instance, if $R1'$ = 28 = 11100b, the resulting eight-bit red color component becomes 11100111b = 231. Likewise, if $G1'$ = 4 = 00100b and $B1'$ = 3 = 00011b, the green and blue components become 00100001b = 33 and 00011000b = 24 respectively. Thus, in this example, the base color for subblock 1 is (231, 33, 24). The five-bit representation for the base color of subblock 2 is obtained by modifying the five-bit codewords $R1'$, $G1'$ and $B1'$ by the codewords dR2, dG2 and dB2. Each of dR2, dG2 and dB2 is a 3-bit two’s-complement number that can hold values between $-4$ and $+3$. For instance, if $R1'$ = 28 as above, and dR2 = 100b = $-4$, then the five-bit representation for the red color component is $28+(-4)=24$ = 11000b, which is then extended to eight bits to 11000110b = 198. Likewise, if $G1'$ = 4, dG2 = 2, $B1'$ = 3 and dB2 = 0, the base color of subblock 2 will be RGB = (198, 49, 24). In summary, the base colors for the subblocks in the differential mode are:

\begin{align*} base\ col\ subblock1 & = extend\_5to8bits(R1', G1', B1') \\ base\ col\ subblock2 & = extend\_5to8bits(R1'+dR2, G1'+dG2, B1'+dB2) \end{align*}

Note that these additions are not allowed to under- or overflow (go below zero or above 31). (The compression scheme can easily make sure they don’t.) For over- or underflowing values, the behavior is undefined for all pixels in the $4\times 4$ block. Note also that the extension to eight bits is performed after the addition.
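The two bit-replication helpers and the differential addition can be sketched per component as below. `etc_delta3` and `etc1_diff_component` are hypothetical names; returning $-1$ for an out-of-range sum is only this sketch's way of flagging the undefined case.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate the four bits into the low nibble: 1110b -> 11101110b. */
static uint8_t extend_4to8bits(unsigned c)
{
    assert(c < 16);
    return (uint8_t)((c << 4) | c);
}

/* Replicate the top three bits into the low three: 11100b -> 11100111b. */
static uint8_t extend_5to8bits(unsigned c)
{
    assert(c < 32);
    return (uint8_t)((c << 3) | (c >> 2));
}

/* Interpret a 3-bit two's-complement codeword as an integer in -4..3. */
static int etc_delta3(unsigned d)
{
    assert(d < 8);
    return d >= 4 ? (int)d - 8 : (int)d;
}

/* Subblock 2 component in differential mode: add first, then extend.
 * Returns -1 when the sum leaves 0..31 (behaviour undefined in ETC1). */
static int etc1_diff_component(unsigned base5, unsigned delta3)
{
    assert(base5 < 32);
    int sum = (int)base5 + etc_delta3(delta3);
    if (sum < 0 || sum > 31)
        return -1;
    return extend_5to8bits((unsigned)sum);
}
```

Applied to the worked example, the components (28, 4, 3) with deltas (100b, 010b, 000b) give (198, 49, 24).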

After obtaining the base color, the operations are the same for the two modes ‘individual’ and ‘differential’. First a table is chosen using the table codewords: For subblock 1, table codeword 1 is used (bits 39-37), and for subblock 2, table codeword 2 is used (bits 36-34), see Table 45, “Texel Data format for ETC1 compressed textures”. The table codeword is used to select one of eight modifier tables, see Table 46, “Intensity modifier sets for ETC1 compressed textures”. For instance, if the table code word is 010b = 2, then the modifier table [ $-29$ , $-9$ , $9$ , $29$ ] is selected. Note that the values in Table 46, “Intensity modifier sets for ETC1 compressed textures” are valid for all textures and can therefore be hardcoded into the decompression unit.

Next, we identify which modifier value to use from the modifier table using the two ‘pixel index’ bits. The pixel index bits are unique for each pixel. For instance, the pixel index for pixel d (see Table 49, “Pixel layout for an ETC1 compressed block”) can be found in bits 19 (most significant bit, MSB), and 3 (least significant bit, LSB), see section c of Table 45, “Texel Data format for ETC1 compressed textures”. Note that the pixel index for a particular texel is always stored in the same bit position, irrespectively of bits diffbit and flipbit. The pixel index bits are decoded using Table 47, “Mapping from pixel index values to modifier values for ETC1 compressed textures”. If, for instance, the pixel index bits are 01b = 1, and the modifier table [ $-29$ , $-9$ , $9$ , $29$ ] is used, then the modifier value selected for that pixel is 29 (see Table 47, “Mapping from pixel index values to modifier values for ETC1 compressed textures”). This modifier value is now used to additively modify the base color. For example, if we have the base color ( $231$ , $8$ , $16$ ), we should add the modifier value 29 to all three components: ( $231+29$ , $8+29$ , $16+29$ ) resulting in ( $260$ , $37$ , $45$ ). These values are then clamped to [ $0$ , $255$ ], resulting in the color ( $255$ , $37$ , $45$ ), and we are finished decoding the texel.
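The modifier lookup and the clamped addition can be sketched as follows. The names are illustrative; the sketch assumes pixels are numbered 0 for a up to 15 for p, so that pixel d uses bits 19 and 3 of the low 32 bits, and it takes an already decoded base color.

```c
#include <assert.h>
#include <stdint.h>

/* Magnitudes {a, b} of each modifier table, indexed by table codeword. */
static const int etc1_magnitudes[8][2] = {
    {2, 8}, {5, 17}, {9, 29}, {13, 42}, {18, 60}, {24, 80}, {33, 106}, {47, 183}
};

/* Pixel index mapping: lsb picks a (0) or b (1); msb set means negative. */
static int etc1_modifier(unsigned table, unsigned msb, unsigned lsb)
{
    assert(table < 8 && msb < 2 && lsb < 2);
    int magnitude = etc1_magnitudes[table][lsb];
    return msb ? -magnitude : magnitude;
}

static uint8_t clamp_to_byte(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Decode one texel. `low` holds bits 31..0 of the block;
 * `pixel` is 0 for a, 1 for b, ... 15 for p. */
static void etc1_decode_texel(const uint8_t base[3], unsigned table,
                              uint32_t low, unsigned pixel, uint8_t out[3])
{
    assert(pixel < 16);
    unsigned msb = (low >> (16 + pixel)) & 1u;
    unsigned lsb = (low >> pixel) & 1u;
    int modifier = etc1_modifier(table, msb, lsb);
    for (int c = 0; c < 3; c++)
        out[c] = clamp_to_byte(base[c] + modifier);
}
```

With base color (231, 8, 16), table codeword 2 and pixel index 01b, this reproduces the result (255, 37, 45) from the example above.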

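Table 45. Texel Data format for ETC1 compressed textures

a) bit layout in bits 63 through 32 if diffbit = 0

| Bits | 63-60 | 59-56 | 55-52 | 51-48 | 47-44 | 43-40 | 39-37 | 36-34 | 33 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | base col1 R1 (4 bits) | base col2 R2 (4 bits) | base col1 G1 (4 bits) | base col2 G2 (4 bits) | base col1 B1 (4 bits) | base col2 B2 (4 bits) | table cw 1 | table cw 2 | diff bit | flip bit |

b) bit layout in bits 63 through 32 if diffbit = 1

| Bits | 63-59 | 58-56 | 55-51 | 50-48 | 47-43 | 42-40 | 39-37 | 36-34 | 33 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | base col1 R1' (5 bits) | dcol 2 dR2 | base col1 G1' (5 bits) | dcol 2 dG2 | base col1 B1' (5 bits) | dcol 2 dB2 | table cw 1 | table cw 2 | diff bit | flip bit |

c) bit layout in bits 31 through 0 (in both cases)

Most significant pixel index bits:

| Bit | 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pixel | p | o | n | m | l | k | j | i | h | g | f | e | d | c | b | a |

Least significant pixel index bits:

| Bit | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pixel | p | o | n | m | l | k | j | i | h | g | f | e | d | c | b | a |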


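Table 46. Intensity modifier sets for ETC1 compressed textures

| Table codeword | Modifier table | | | |
| --- | --- | --- | --- | --- |
| 0 | -8 | -2 | 2 | 8 |
| 1 | -17 | -5 | 5 | 17 |
| 2 | -29 | -9 | 9 | 29 |
| 3 | -42 | -13 | 13 | 42 |
| 4 | -60 | -18 | 18 | 60 |
| 5 | -80 | -24 | 24 | 80 |
| 6 | -106 | -33 | 33 | 106 |
| 7 | -183 | -47 | 47 | 183 |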


Table 47. Mapping from pixel index values to modifier values for ETC1 compressed textures

| Pixel index value (msb) | Pixel index value (lsb) | Resulting modifier value |
| --- | --- | --- |
| 1 | 1 | -b (large negative value) |
| 1 | 0 | -a (small negative value) |
| 0 | 0 | a (small positive value) |
| 0 | 1 | b (large positive value) |


Table 48. Pixel layout for an 8×8 texture using four ETC1 compressed blocks. Note how pixel a2 in the second block is adjacent to pixel m1 in the first block.

| First block in mem | | | | Second block in mem | | | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| $a_1$ | $e_1$ | $i_1$ | $m_1$ | $a_2$ | $e_2$ | $i_2$ | $m_2$ |
| $b_1$ | $f_1$ | $j_1$ | $n_1$ | $b_2$ | $f_2$ | $j_2$ | $n_2$ |
| $c_1$ | $g_1$ | $k_1$ | $o_1$ | $c_2$ | $g_2$ | $k_2$ | $o_2$ |
| $d_1$ | $h_1$ | $l_1$ | $p_1$ | $d_2$ | $h_2$ | $l_2$ | $p_2$ |
| $a_3$ | $e_3$ | $i_3$ | $m_3$ | $a_4$ | $e_4$ | $i_4$ | $m_4$ |
| $b_3$ | $f_3$ | $j_3$ | $n_3$ | $b_4$ | $f_4$ | $j_4$ | $n_4$ |
| $c_3$ | $g_3$ | $k_3$ | $o_3$ | $c_4$ | $g_4$ | $k_4$ | $o_4$ |
| $d_3$ | $h_3$ | $l_3$ | $p_3$ | $d_4$ | $h_4$ | $l_4$ | $p_4$ |
| Third block in mem | | | | Fourth block in mem | | | |

The $\rightarrow u$ direction runs along each row; the $\downarrow v$ direction runs down each column.


Table 49. Pixel layout for an ETC1 compressed block

| | | | |
| --- | --- | --- | --- |
| $a$ | $e$ | $i$ | $m$ |
| $b$ | $f$ | $j$ | $n$ |
| $c$ | $g$ | $k$ | $o$ |
| $d$ | $h$ | $l$ | $p$ |


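Table 50. Two 2×4-pixel ETC1 subblocks side-by-side

| Subblock 1 | Subblock 1 | Subblock 2 | Subblock 2 |
| --- | --- | --- | --- |
| $a$ | $e$ | $i$ | $m$ |
| $b$ | $f$ | $j$ | $n$ |
| $c$ | $g$ | $k$ | $o$ |
| $d$ | $h$ | $l$ | $p$ |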


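Table 51. Two 4×2-pixel ETC1 subblocks on top of each other

| | | | | |
| --- | --- | --- | --- | --- |
| Subblock 1 | $a$ | $e$ | $i$ | $m$ |
| Subblock 1 | $b$ | $f$ | $j$ | $n$ |
| Subblock 2 | $c$ | $g$ | $k$ | $o$ |
| Subblock 2 | $d$ | $h$ | $l$ | $p$ |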


14. ETC2 Compressed Texture Image Formats

This description is derived from the “ETC Compressed Texture Image Formats” section of the OpenGL 4.5 specification.

The ETC formats form a family of related compressed texture image formats. They are designed to do different tasks, but also to be similar enough that hardware can be reused between them. Each one is described in detail below, but we will first give an overview of each format and describe how it is similar to others and the main differences.

RGB ETC2 is a format for compressing RGB data. It is a superset of the older ETC1 format. This means that an older ETC1 texture can be decoded using an ETC2-compliant decoder. The main difference is that the newer version contains three new modes: the ‘T-mode’ and the ‘H-mode’, which are good for sharp chrominance blocks, and the ‘Planar’ mode, which is good for smooth blocks.

RGB ETC2 with sRGB encoding is the same as linear RGB ETC2 with the difference that the values should be interpreted as being encoded with the sRGB transfer function instead of linear RGB-values.

RGBA ETC2 encodes RGBA 8-bit data. The RGB part is encoded exactly the same way as RGB ETC2. The alpha part is encoded separately.

RGBA ETC2 with sRGB encoding is the same as RGBA ETC2 but here the RGB-values (but not the alpha value) should be interpreted as being encoded with the sRGB transfer function.

Unsigned R11 EAC is a one-channel unsigned format. It is similar to the alpha part of RGBA ETC2 but not exactly the same; it delivers higher precision. It is possible to make hardware that can decode both formats with minimal overhead.

Unsigned RG11 EAC is a two-channel unsigned format. Each channel is decoded exactly as R11 EAC.

Signed R11 EAC is a one-channel signed format. This is good in situations when it is important to be able to preserve zero exactly, and still use both positive and negative values. It is designed to be similar enough to Unsigned R11 EAC so that hardware can decode both with minimal overhead, but it is not exactly the same. For example, the signed version does not add 0.5 to the base codeword, and the extension from 11 bits differs. For all details, see the corresponding sections.

Signed RG11 EAC is a two-channel signed format. Each channel is decoded exactly as signed R11 EAC.

RGB ETC2 with “punchthrough” alpha is very similar to RGB ETC2, but has the ability to represent “punchthrough” alpha (completely opaque or transparent). Each block can select to be completely opaque using one bit. To fit this bit, there is no individual mode in RGB ETC2 with punchthrough alpha. In other respects, the opaque blocks are decoded as in RGB ETC2. For the transparent blocks, one index is reserved to represent transparency, and the decoding of the RGB channels is also affected. For details, see the corresponding sections.

RGB ETC2 with punchthrough alpha and sRGB encoding is the same as linear RGB ETC2 with punchthrough alpha but the RGB channel values should be interpreted as being encoded with the sRGB transfer function.

A texture compressed using any of the ETC texture image formats is described as a number of $4\times 4$ pixel blocks.

Pixel $a_1$ (see Table 52, “Pixel layout for an 8×8 texture using four ETC2 compressed blocks. Note how pixel a3 in the third block is adjacent to pixel d1 in the first block.”) of the first block in memory will represent the texture coordinate $(u=0, v=0)$. Pixel $a_2$ in the second block in memory will be adjacent to pixel $m_1$ in the first block, etc. until the width of the texture. Then pixel $a_3$ in the following block (third block in memory for an $8\times 8$ texture) will be adjacent to pixel $d_1$ in the first block, etc. until the height of the texture.

The data storage for an $8\times 8$ texture using the first, second, third and fourth block if stored in that order in memory would have the texels encoded in the same order as a simple linear format as if the bytes describing the pixels came in the following memory order: $a_1$ $e_1$ $i_1$ $m_1$ $a_2$ $e_2$ $i_2$ $m_2$ $b_1$ $f_1$ $j_1$ $n_1$ $b_2$ $f_2$ $j_2$ $n_2$ $c_1$ $g_1$ $k_1$ $o_1$ $c_2$ $g_2$ $k_2$ $o_2$ $d_1$ $h_1$ $l_1$ $p_1$ $d_2$ $h_2$ $l_2$ $p_2$ $a_3$ $e_3$ $i_3$ $m_3$ $a_4$ $e_4$ $i_4$ $m_4$ $b_3$ $f_3$ $j_3$ $n_3$ $b_4$ $f_4$ $j_4$ $n_4$ $c_3$ $g_3$ $k_3$ $o_3$ $c_4$ $g_4$ $k_4$ $o_4$ $d_3$ $h_3$ $l_3$ $p_3$ $d_4$ $h_4$ $l_4$ $p_4$ .

Table 52. Pixel layout for an 8×8 texture using four ETC2 compressed blocks. Note how pixel a3 in the third block is adjacent to pixel d1 in the first block.

| First block in mem | | | | Second block in mem | | | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| $a_1$ | $e_1$ | $i_1$ | $m_1$ | $a_2$ | $e_2$ | $i_2$ | $m_2$ |
| $b_1$ | $f_1$ | $j_1$ | $n_1$ | $b_2$ | $f_2$ | $j_2$ | $n_2$ |
| $c_1$ | $g_1$ | $k_1$ | $o_1$ | $c_2$ | $g_2$ | $k_2$ | $o_2$ |
| $d_1$ | $h_1$ | $l_1$ | $p_1$ | $d_2$ | $h_2$ | $l_2$ | $p_2$ |
| $a_3$ | $e_3$ | $i_3$ | $m_3$ | $a_4$ | $e_4$ | $i_4$ | $m_4$ |
| $b_3$ | $f_3$ | $j_3$ | $n_3$ | $b_4$ | $f_4$ | $j_4$ | $n_4$ |
| $c_3$ | $g_3$ | $k_3$ | $o_3$ | $c_4$ | $g_4$ | $k_4$ | $o_4$ |
| $d_3$ | $h_3$ | $l_3$ | $p_3$ | $d_4$ | $h_4$ | $l_4$ | $p_4$ |
| Third block in mem | | | | Fourth block in mem | | | |

The $\rightarrow u$ direction runs along each row; the $\downarrow v$ direction runs down each column.


If the width or height of the texture (or a particular mip-level) is not a multiple of four, then padding is added to ensure that the texture contains a whole number of $4\times 4$ blocks in each dimension. The padding does not affect the texel coordinates. For example, the texel shown as $a_1$ in Table 52, “Pixel layout for an 8×8 texture using four ETC2 compressed blocks. Note how pixel a3 in the third block is adjacent to pixel d1 in the first block.” always has coordinates $i = 0, j = 0$ . The values of padding texels are irrelevant, e.g., in a $3\times 3$ texture, the texels marked as $m_1$ , $n_1$ , $o_1$ , $d_1$ , $h_1$ , $l_1$ and $p_1$ form padding and have no effect on the final texture image.
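The effect of padding on storage can be sketched as a block count per dimension. The function names are illustrative, and `block_bytes` is a parameter of this sketch standing for the size in bytes of one compressed block in the format in use.

```c
#include <assert.h>
#include <stddef.h>

/* Number of 4-texel blocks needed to cover `extent` texels, padding included. */
static size_t etc_blocks_along(unsigned extent)
{
    assert(extent > 0);
    return ((size_t)extent + 3) / 4;
}

/* Bytes occupied by one mip level, given the size of one compressed block. */
static size_t etc_level_bytes(unsigned width, unsigned height, size_t block_bytes)
{
    assert(block_bytes > 0);
    return etc_blocks_along(width) * etc_blocks_along(height) * block_bytes;
}
```

A $3\times 3$ texture therefore occupies a single block, and a $9\times 4$ texture occupies three blocks in the $u$ direction by one in the $v$ direction.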

The number of bits that represent a $4\times 4$ texel block is 64 bits if the format is RGB ETC2, RGB ETC2 with sRGB encoding, RGB ETC2 with punchthrough alpha, or RGB ETC2 with punchthrough alpha and sRGB encoding.

In those cases the data for a block is stored as a number of bytes, { $q_0$ , $q_1$ , $q_2$ , $q_3$ , $q_4$ , $q_5$ , $q_6$ , $q_7$ }, where byte $q_0$ is located at the lowest memory address and $q_7$ at the highest. The 64 bits specifying the block are then represented by the following 64 bit integer:

\begin{align*} int64bit & = 256\times(256\times(256\times(256\times(256\times(256\times(256\times q_0+q_1)+q_2)+q_3)+q_4)+q_5)+q_6)+q_7 \end{align*}

The number of bits that represent a $4\times 4$ texel block is 128 bits if the format is RGBA ETC2 with a linear or sRGB transfer function. In those cases the data for a block is stored as a number of bytes: { $q_0$ , $q_1$ , $q_2$ , $q_3$ , $q_4$ , $q_5$ , $q_6$ , $q_7$ , $q_8$ , $q_9$ , $q_{10}$ , $q_{11}$ , $q_{12}$ , $q_{13}$ , $q_{14}$ , $q_{15}$ }, where byte $q_0$ is located at the lowest memory address and $q_{15}$ at the highest. This is split into two 64-bit integers, one used for color channel decompression and one for alpha channel decompression:

\begin{align*} int64bitAlpha & = 256\times(256\times(256\times(256\times(256\times(256\times(256\times q_0+q_1)+q_2)+q_3)+q_4)+q_5)+q_6)+q_7 \\ int64bitColor & = 256\times(256\times(256\times(256\times(256\times(256\times(256\times q_8+q_9)+q_{10})+q_{11})+q_{12})+q_{13})+q_{14})+q_{15} \end{align*}
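As a minimal sketch of this split, the first eight bytes form the alpha word and the last eight form the color word, each read with $q_0$ (respectively $q_8$) as the most significant byte. The `Etc2RgbaBlock` type and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t alpha; /* int64bitAlpha, from q0..q7  */
    uint64_t color; /* int64bitColor, from q8..q15 */
} Etc2RgbaBlock;

/* Read eight bytes with the first byte as the most significant. */
static uint64_t etc2_read_be64(const uint8_t *q)
{
    assert(q != NULL);
    uint64_t word = 0;
    for (int i = 0; i < 8; i++)
        word = (word << 8) | q[i];
    return word;
}

/* Split a 16-byte RGBA ETC2 block into its alpha and color words. */
static Etc2RgbaBlock etc2_split_rgba_block(const uint8_t q[16])
{
    Etc2RgbaBlock block;
    block.alpha = etc2_read_be64(q);
    block.color = etc2_read_be64(q + 8);
    return block;
}
```

The color word can then be decoded exactly like a 64-bit RGB ETC2 block.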

14.1. Format RGB ETC2

For RGB ETC2, each 64-bit word contains information about a three-channel $4 \times 4$ pixel block as shown in Table 53, “Pixel layout for an ETC2 compressed block”.

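Table 53. Pixel layout for an ETC2 compressed block

| | | | |
| --- | --- | --- | --- |
| $a$ | $e$ | $i$ | $m$ |
| $b$ | $f$ | $j$ | $n$ |
| $c$ | $g$ | $k$ | $o$ |
| $d$ | $h$ | $l$ | $p$ |

The $\rightarrow u$ direction runs along each row; the $\downarrow v$ direction runs down each column.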


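Table 54. Texel Data format for ETC2 compressed texture formats

a) location of bits for mode selection

| Bits | 63-59 | 58-56 | 55-51 | 50-48 | 47-43 | 42-40 | 39-34 | 33 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | R | dR | G | dG | B | dB | …… | D | . |

b) bit layout for bits 63 through 32 for ‘individual’ mode:

| Bits | 63-60 | 59-56 | 55-52 | 51-48 | 47-44 | 43-40 | 39-37 | 36-34 | 33 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | R1 | R2 | G1 | G2 | B1 | B2 | table1 | table2 | 0 | $F_B$ |

c) bit layout for bits 63 through 32 for ‘differential’ mode:

| Bits | 63-59 | 58-56 | 55-51 | 50-48 | 47-43 | 42-40 | 39-37 | 36-34 | 33 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | R | dR | G | dG | B | dB | table1 | table2 | 1 | $F_B$ |

d) bit layout for bits 63 through 32 for ‘T’ mode:

| Bits | 63-61 | 60-59 | 58 | 57-56 | 55-52 | 51-48 | 47-44 | 43-40 | 39-36 | 35-34 | 33 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | . | R1a | . | R1b | G1 | B1 | R2 | G2 | B2 | da | 1 | db |

e) bit layout for bits 63 through 32 for ‘H’ mode:

| Bits | 63 | 62-59 | 58-56 | 55-53 | 52 | 51 | 50 | 49-47 | 46-43 | 42-39 | 38-35 | 34 | 33 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | . | R1 | G1a | . | G1b | B1a | . | B1b | R2 | G2 | B2 | da | 1 | db |

f) bit layout for bits 31 through 0 for ‘individual’, ‘diff’, ‘T’ and ‘H’ modes

| Bit | 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | p0 | o0 | n0 | m0 | l0 | k0 | j0 | i0 | h0 | g0 | f0 | e0 | d0 | c0 | b0 | a0 |

| Bit | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | p1 | o1 | n1 | m1 | l1 | k1 | j1 | i1 | h1 | g1 | f1 | e1 | d1 | c1 | b1 | a1 |

g) bit layout for bits 63 through 0 for ‘planar’ mode:

| Bits | 63 | 62-57 | 56 | 55 | 54-49 | 48 | 47-45 | 44-43 | 42 | 41-39 | 38-34 | 33 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | . | RO | GO1 | . | GO2 | BO1 | . | BO2 | . | BO3 | RH1 | 1 | RH2 |

| Bits | 31-25 | 24-19 | 18-13 | 12-6 | 5-0 |
| --- | --- | --- | --- | --- | --- |
| Field | GH | BH | RV | GV | BV |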

The blocks are compressed using one of five different ‘modes’. Section a of Table 54, “Texel Data format for ETC2 compressed texture formats” shows the bits used for determining the mode used in a given block. First, if the bit marked ‘D’ is set to 0, the ‘individual’ mode is used. Otherwise, the three 5-bit values R, G and B, and the three 3-bit values dR, dG and dB are examined. R, G and B are treated as integers between $0$ and $31$ and dR, dG and dB as two’s-complement integers between $-4$ and $+3$ . First, R and dR are added, and if the sum is not within the interval $[0,31]$ , the ‘T’ mode is selected. Otherwise, if the sum of G and dG is outside the interval $[0,31]$ , the ‘H’ mode is selected. Otherwise, if the sum of B and dB is outside of the interval $[0,31]$ , the ‘planar’ mode is selected. Finally, if the ‘D’ bit is set to 1 and all of the aforementioned sums lie between $0$ and $31$ , the ‘differential’ mode is selected.
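The selection order described above can be sketched as follows. The enum and function names are illustrative. The sketch assumes the 64-bit word has been assembled as described earlier, that ‘D’ is bit 33, and that R, dR, G, dG, B and dB occupy bits 63-59, 58-56, 55-51, 50-48, 47-43 and 42-40 respectively, matching the differential layout.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    ETC2_MODE_INDIVIDUAL,
    ETC2_MODE_DIFFERENTIAL,
    ETC2_MODE_T,
    ETC2_MODE_H,
    ETC2_MODE_PLANAR
} Etc2Mode;

/* Sum of a 5-bit base and the 3-bit two's-complement delta stored just
 * below it. `delta_lsb` is the position of the delta's lowest bit. */
static int etc2_base_plus_delta(uint64_t word, int delta_lsb)
{
    assert(delta_lsb >= 0 && delta_lsb <= 56);
    int base = (int)((word >> (delta_lsb + 3)) & 31u);
    int delta = (int)((word >> delta_lsb) & 7u);
    if (delta >= 4)
        delta -= 8;
    return base + delta;
}

static int etc2_in_range(int sum)
{
    return sum >= 0 && sum <= 31;
}

/* Mode selection in the order given in the text: D bit, then R, G, B sums. */
static Etc2Mode etc2_select_mode(uint64_t word)
{
    if (((word >> 33) & 1u) == 0)
        return ETC2_MODE_INDIVIDUAL;
    if (!etc2_in_range(etc2_base_plus_delta(word, 56)))
        return ETC2_MODE_T;
    if (!etc2_in_range(etc2_base_plus_delta(word, 48)))
        return ETC2_MODE_H;
    if (!etc2_in_range(etc2_base_plus_delta(word, 40)))
        return ETC2_MODE_PLANAR;
    return ETC2_MODE_DIFFERENTIAL;
}
```

Because the red test comes first, a block whose red and green sums both overflow is still decoded in ‘T’ mode.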

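Table 55. Two 2-by-4-pixel ETC2 subblocks side-by-side.

| Subblock 1 | Subblock 1 | Subblock 2 | Subblock 2 |
| --- | --- | --- | --- |
| $a$ | $e$ | $i$ | $m$ |
| $b$ | $f$ | $j$ | $n$ |
| $c$ | $g$ | $k$ | $o$ |
| $d$ | $h$ | $l$ | $p$ |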


The layout of the bits used to decode the ‘individual’ and ‘differential’ modes is shown in section b and section c of Table 54, “Texel Data format for ETC2 compressed texture formats”, respectively. Both of these modes share several characteristics. In both modes, the $4 \times 4$ block is split into two subblocks of either size $2\times 4$ or $4\times 2$. This is controlled by bit 32, which we dub the ‘flip bit’. If the ‘flip bit’ is 0, the block is divided into two $2\times 4$ subblocks side-by-side, as shown in Table 55, “Two 2-by-4-pixel ETC2 subblocks side-by-side.”. If the ‘flip bit’ is 1, the block is divided into two $4\times 2$ subblocks on top of each other, as shown in Table 56, “Two 4-by-2-pixel ETC2 subblocks on top of each other.”. In both modes, a ‘base color’ for each subblock is stored, but the way they are stored is different in the two modes:

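Table 56. Two 4-by-2-pixel ETC2 subblocks on top of each other.

| | | | | |
| --- | --- | --- | --- | --- |
| Subblock 1 | $a$ | $e$ | $i$ | $m$ |
| Subblock 1 | $b$ | $f$ | $j$ | $n$ |
| Subblock 2 | $c$ | $g$ | $k$ | $o$ |
| Subblock 2 | $d$ | $h$ | $l$ | $p$ |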


In the ‘individual’ mode, following the layout shown in section b of Table 54, “Texel Data format for ETC2 compressed texture formats”, the base color for subblock 1 is derived from the codewords R1 (bits 63-60), G1 (bits 55-52) and B1 (bits 47-44). These four-bit values are extended to RGB888 by replicating the four higher order bits in the four lower order bits. For instance, if R1 = $14$ = 1110 binary (1110b for short), G1 = $3$ = 0011b and B1 = $8$ = 1000b, then the red component of the base color of subblock 1 becomes 11101110b = $238$, and the green and blue components become 00110011b = $51$ and 10001000b = $136$. The base color for subblock 2 is decoded the same way, but using the 4-bit codewords R2 (bits 59-56), G2 (bits 51-48) and B2 (bits 43-40) instead. In summary, the base colors for the subblocks in the individual mode are:

\begin{align*} base\ col\ subblock1 & = extend\_4to8bits(R1, G1, B1) \\ base\ col\ subblock2 & = extend\_4to8bits(R2, G2, B2) \end{align*}

In the ‘differential’ mode, following the layout shown in section c of Table 54, “Texel Data format for ETC2 compressed texture formats”, the base color for subblock 1 is derived from the five-bit codewords R, G and B. These five-bit codewords are extended to eight bits by replicating the three highest order bits to the three lowest order bits. For instance, if R = $28$ = 11100b, the resulting eight-bit red color component becomes 11100111b = $231$. Likewise, if G = $4$ = 00100b and B = $3$ = 00011b, the green and blue components become 00100001b = $33$ and 00011000b = $24$ respectively. Thus, in this example, the base color for subblock 1 is ($231$, $33$, $24$). The five-bit representation for the base color of subblock 2 is obtained by modifying the five-bit codewords R, G and B by the codewords dR, dG and dB. Each of dR, dG and dB is a 3-bit two’s-complement number that can hold values between $-4$ and $+3$. For instance, if R = $28$ as above, and dR = 100b = $-4$, then the five-bit representation for the red color component is $28+(-4)=24$ = 11000b, which is then extended to eight bits to 11000110b = $198$. Likewise, if G = $4$, dG = $2$, B = $3$ and dB = $0$, the base color of subblock 2 will be RGB = ($198$, $49$, $24$). In summary, the base colors for the subblocks in the differential mode are:

\begin{align*} base\ col\ subblock1 & = extend\_5to8bits(R, G, B) \\ base\ col\ subblock2 & = extend\_5to8bits(R+dR, G+dG, B+dB) \end{align*}

Note that these additions will not under- or overflow, or one of the alternative decompression modes would have been chosen instead of the ‘differential’ mode.

After obtaining the base color, the operations are the same for the two modes ‘individual’ and ‘differential’. First a table is chosen using the table codewords: for subblock 1, table codeword 1 is used (bits 39-37), and for subblock 2, table codeword 2 is used (bits 36-34), see section b or section c of Table 54, “Texel Data format for ETC2 compressed texture formats”. The table codeword is used to select one of eight modifier tables, see Table 57, “ETC2 intensity modifier sets for ‘individual’ and ‘differential’ modes”. For instance, if the table codeword is 010 binary = 2, then the modifier table [$-29$, $-9$, $9$, $29$] is selected for the corresponding subblock. Note that the values in Table 57, “ETC2 intensity modifier sets for ‘individual’ and ‘differential’ modes” are valid for all textures and can therefore be hardcoded into the decompression unit.

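Table 57. ETC2 intensity modifier sets for ‘individual’ and ‘differential’ modes

| Table codeword | Modifier table | | | |
| --- | --- | --- | --- | --- |
| 0 | -8 | -2 | 2 | 8 |
| 1 | -17 | -5 | 5 | 17 |
| 2 | -29 | -9 | 9 | 29 |
| 3 | -42 | -13 | 13 | 42 |
| 4 | -60 | -18 | 18 | 60 |
| 5 | -80 | -24 | 24 | 80 |
| 6 | -106 | -33 | 33 | 106 |
| 7 | -183 | -47 | 47 | 183 |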


Table 58. Mapping from pixel index values to modifier values for RGB ETC2 compressed textures.

| Pixel index value (msb) | Pixel index value (lsb) | Resulting modifier value |
| --- | --- | --- |
| 1 | 1 | -b (large negative value) |
| 1 | 0 | -a (small negative value) |
| 0 | 0 | a (small positive value) |
| 0 | 1 | b (large positive value) |


Next, we identify which modifier value to use from the modifier table using the two ‘pixel index’ bits. The pixel index bits are unique for each pixel. For instance, the pixel index for pixel d (see Table 53, “Pixel layout for an ETC2 compressed block”) can be found in bits 19 (most significant bit, MSB), and 3 (least significant bit, LSB), see section f of Table 54, “Texel Data format for ETC2 compressed texture formats”. Note that the pixel index for a particular texel is always stored in the same bit position, irrespectively of bits ‘diffbit’ and ‘flipbit’. The pixel index bits are decoded using Table 58, “Mapping from pixel index values to modifier values for RGB ETC2 compressed textures.”. If, for instance, the pixel index bits are 01 binary = 1, and the modifier table $-29$ , $-9$ , $9$ , $29$ is used, then the modifier value selected for that pixel is 29 (see Table 58, “Mapping from pixel index values to modifier values for RGB ETC2 compressed textures.”). This modifier value is now used to additively modify the base color. For example, if we have the base color ( $231$ , $8$ , $16$ ), we should add the modifier value $29$ to all three components: ( $231+29$ , $8+29$ , $16+29$ ) resulting in ( $260$ , $37$ , $45$ ). These values are then clamped to [ $0$ , $255$ ], resulting in the color ( $255$ , $37$ , $45$ ), and we are finished decoding the texel.

The ‘T’ and ‘H’ compression modes also share some characteristics: both use two base colors stored using 4 bits per channel decoded as in the individual mode. Unlike the ‘individual’ mode however, these bits are not stored sequentially, but in the layout shown in section d and section e of Table 54, “Texel Data format for ETC2 compressed texture formats”. To clarify, in the ‘T’ mode, the two colors are constructed as follows:

\begin{align*} base\ col\ 1 & = extend\_4to8bits((R1a \ll 2)\ |\ R1b, G1, B1) \\ base\ col\ 2 & = extend\_4to8bits(R2, G2, B2) \end{align*}

Here, $\ll$ denotes bit-wise left shift and $|$ denotes bit-wise OR. In the ‘H’ mode, the two colors are constructed as follows:

\begin{align*} base\ col\ 1 & = extend\_4to8bits(R1, (G1a \ll 1)\ |\ G1b, (B1a \ll 3)\ |\ B1b) \\ base\ col\ 2 & = extend\_4to8bits(R2, G2, B2) \end{align*}

Both the ‘T’ and ‘H’ modes have four ‘paint colors’ which are the colors that will be used in the decompressed block, but they are assigned in a different manner. In the ‘T’ mode, ‘paint color 0’ is simply the first base color, and ‘paint color 2’ is the second base color. To obtain the other ‘paint colors’, a ‘distance’ is first determined, which will be used to modify the luminance of one of the base colors. This is done by combining the values ‘da’ and ‘db’ shown in section d of Table 54, “Texel Data format for ETC2 compressed texture formats” by $(da\ll 1)\ |\ db$ , and then using this value as an index into the small look-up table shown in Table 59, “Distance table for ETC2 ‘T’ and ‘H’ modes.”. For example, if ‘da’ is 10 binary and ‘db’ is 1 binary, the index is 101 binary and the selected distance will be $32$ . ‘Paint color 1’ is then equal to the second base color with the ‘distance’ added to each channel, and ‘paint color 3’ is the second base color with the ‘distance’ subtracted.

Table 59. Distance table for ETC2 ‘T’ and ‘H’ modes.

| Distance index | Distance |
| --- | --- |
| 0 | 3 |
| 1 | 6 |
| 2 | 11 |
| 3 | 16 |
| 4 | 23 |
| 5 | 32 |
| 6 | 41 |
| 7 | 64 |


In summary, to determine the four ‘paint colors’ for a ‘T’ block:

\begin{align*} paint\ color\ 0 & = base\ col\ 1 \\ paint\ color\ 1 & = base\ col\ 2 + (d, d, d) \\ paint\ color\ 2 & = base\ col\ 2 \\ paint\ color\ 3 & = base\ col\ 2 - (d, d, d) \end{align*}

In both cases, the value of each channel is clamped to within [ $0$ , $255$ ].
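As an illustration, the ‘T’ mode paint colors could be computed along the following lines. This is a sketch under stated assumptions: the names `etc2_t_paint_colors` and `etc2_distance_table` are hypothetical, both base colors are assumed to be already extended to 8 bits per channel, and ‘da’ and ‘db’ are passed as the 2-bit and 1-bit fields respectively.

```c
#include <stdint.h>

/* Table 59: distance table shared by the 'T' and 'H' modes. */
static const uint8_t etc2_distance_table[8] = { 3, 6, 11, 16, 23, 32, 41, 64 };

static uint8_t clamp255(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical helper: fills paint[0..3] for a 'T' block. */
static void etc2_t_paint_colors(const uint8_t base1[3], const uint8_t base2[3],
                                unsigned da, unsigned db, uint8_t paint[4][3])
{
    /* Distance index is (da << 1) | db, as described in the text. */
    int d = etc2_distance_table[((da << 1) | db) & 7];
    for (int c = 0; c < 3; ++c) {
        paint[0][c] = base1[c];               /* first base color, unmodified  */
        paint[1][c] = clamp255(base2[c] + d); /* second base color + distance  */
        paint[2][c] = base2[c];               /* second base color, unmodified */
        paint[3][c] = clamp255(base2[c] - d); /* second base color - distance  */
    }
}
```

With ‘da’ = 10 binary and ‘db’ = 1 binary the index is 5, so the sketch uses the distance 32, matching the example above.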

A ‘distance’ value is computed for the ‘H’ mode as well, but doing so is slightly more complex. In order to construct the three-bit index into the distance table shown in Table 59, “Distance table for ETC2 ‘T’ and ‘H’ modes.”, ‘da’ and ‘db’ shown in section e of Table 54, “Texel Data format for ETC2 compressed texture formats” are used as the most significant bit and middle bit, respectively, but the least significant bit is computed as (base col 1 value $\geq$ base col 2 value), the ‘value’ of a color for the comparison being equal to $(R\ll 16)+(G\ll 8)+B$ . Once the ‘distance’ d has been determined for an ‘H’ block, the four ‘paint colors’ will be:

\begin{align*} paint\ color\ 0 & = base\ col\ 1 + (d, d, d) \\ paint\ color\ 1 & = base\ col\ 1 - (d, d, d) \\ paint\ color\ 2 & = base\ col\ 2 + (d, d, d) \\ paint\ color\ 3 & = base\ col\ 2 - (d, d, d) \end{align*}

Again, all color components are clamped to within [ $0$ , $255$ ]. Finally, in both the ‘T’ and ‘H’ modes, every pixel is assigned one of the four ‘paint colors’ in the same way the four modifier values are distributed in ‘individual’ or ‘differential’ blocks. For example, to choose a paint color for pixel d, an index is constructed using bit 19 as most significant bit and bit 3 as least significant bit. Then, if a pixel has index 2, for example, it will be assigned paint color 2.
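The ‘H’ mode differs from the ‘T’ mode mainly in how the distance index is built. A minimal sketch, assuming 8-bit base colors and single-bit ‘da’ and ‘db’ fields (the name `etc2_h_paint_colors` is hypothetical), might look like this:

```c
#include <stdint.h>

/* Table 59: distance table shared by the 'T' and 'H' modes. */
static const uint8_t etc2_distance_table[8] = { 3, 6, 11, 16, 23, 32, 41, 64 };

static uint8_t clamp255(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* 'Value' of a color used for the comparison: (R << 16) + (G << 8) + B. */
static uint32_t color_value(const uint8_t c[3])
{
    return ((uint32_t)c[0] << 16) + ((uint32_t)c[1] << 8) + c[2];
}

/* Hypothetical helper: fills paint[0..3] for an 'H' block. */
static void etc2_h_paint_colors(const uint8_t base1[3], const uint8_t base2[3],
                                unsigned da, unsigned db, uint8_t paint[4][3])
{
    /* da is the most significant bit, db the middle bit, and the least
     * significant bit is the result of (base col 1 value >= base col 2 value). */
    unsigned lsb = color_value(base1) >= color_value(base2) ? 1u : 0u;
    int d = etc2_distance_table[((da & 1) << 2) | ((db & 1) << 1) | lsb];
    for (int c = 0; c < 3; ++c) {
        paint[0][c] = clamp255(base1[c] + d);
        paint[1][c] = clamp255(base1[c] - d);
        paint[2][c] = clamp255(base2[c] + d);
        paint[3][c] = clamp255(base2[c] - d);
    }
}
```

Because the ordering of the two base colors contributes one bit of the index, swapping the base colors in this sketch generally selects a different distance.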

The final mode possible in an RGB ETC2-compressed block is the ‘planar’ mode. Here, three base colors are supplied and used to form a color plane used to determine the color of the individual pixels in the block.

All three base colors are stored in RGB 676 format, and stored in the manner shown in section g of Table 54, “Texel Data format for ETC2 compressed texture formats”. The three colors are there labelled ‘O’, ‘H’ and ‘V’, so that the three components of color ‘V’ are RV, GV and BV, for example. Some color channels are split into non-consecutive bit-ranges, for example BO is reconstructed using BO1 as the most significant bit, BO2 as the two following bits, and BO3 as the three least significant bits.

Once the bits for the base colors have been extracted, they must be extended to 8 bits per channel in a manner analogous to the method used for the base colors in other modes. For example, the 6-bit blue and red channels are extended by replicating the two most significant of the six bits to the two least significant of the final 8 bits.

With three base colors in RGB888 format, the color of each pixel can then be determined as:

\begin{align*} R(x,y) & = {x\times (RH-RO)\over 4.0} + {y\times (RV-RO)\over 4.0} + RO \\ G(x,y) & = {x\times (GH-GO)\over 4.0} + {y\times (GV-GO)\over 4.0} + GO \\ B(x,y) & = {x\times (BH-BO)\over 4.0} + {y\times (BV-BO)\over 4.0} + BO \end{align*}

where $x$ and $y$ are values from $0$ to $3$ corresponding to the pixel's coordinates within the block, $x$ being in the $u$ direction and $y$ in the $v$ direction. For example, the pixel $g$ in Table 53, “Pixel layout for an ETC2 compressed block” would have $x=1$ and $y=2$ .

These values are then rounded to the nearest integer (to the larger integer if there is a tie) and then clamped to a value between $0$ and $255$ . Note that this is equivalent to

\begin{align*} R(x,y) & = clamp255((x\times (RH-RO) + y\times (RV-RO) + 4\times RO + 2) \gg 2) \\ G(x,y) & = clamp255((x\times (GH-GO) + y\times (GV-GO) + 4\times GO + 2) \gg 2) \\ B(x,y) & = clamp255((x\times (BH-BO) + y\times (BV-BO) + 4\times BO + 2) \gg 2) \end{align*}

where $clamp255$ clamps the value to a number in the range [ $0$ , $255$ ] and where $\gg$ performs bit-wise right shift.
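The integer form of the ‘planar’ interpolation can be sketched per channel as below. The helper names are hypothetical. The 6-bit extension follows the rule given in the text; the 7-bit extension for the green channel is an assumption that applies the analogous bit replication. Negative intermediate sums are clamped before shifting, because right-shifting a negative `int` is implementation-defined in C and the result would be clamped to 0 in any case.

```c
#include <stdint.h>

/* Extend a 6-bit channel to 8 bits by replicating its two most
 * significant bits into the two least significant bits. */
static uint8_t extend_6to8(unsigned v)
{
    return (uint8_t)((v << 2) | (v >> 4));
}

/* Assumption: the 7-bit green channel is extended analogously, by
 * replicating its most significant bit into the least significant bit. */
static uint8_t extend_7to8(unsigned v)
{
    return (uint8_t)((v << 1) | (v >> 6));
}

/* Hypothetical helper: one channel of the planar mode, where o, h and v are
 * the 8-bit 'O', 'H' and 'V' values and x, y are in the range 0..3. */
static uint8_t etc2_planar_channel(int o, int h, int v, int x, int y)
{
    int t = x * (h - o) + y * (v - o) + 4 * o + 2;
    if (t < 0)
        return 0;          /* any negative sum clamps to 0 */
    t >>= 2;               /* divide by 4; the +2 above rounds to nearest */
    return (uint8_t)(t > 255 ? 255 : t);
}
```

For instance, with O = 0, H = 255 and V = 0, pixel g ( $x=1$ , $y=2$ ) evaluates to $(255 + 2) \gg 2 = 64$ , which is $63.75$ rounded to the nearest integer.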

This specification gives the output for each compression mode in 8-bit integer colors between $0$ and $255$ , and these values all need to be divided by $255$ for the final floating point representation.

14.2. Format RGB ETC2 with sRGB encoding

Decompression of floating point sRGB values in RGB ETC2 with sRGB encoding follows that of floating point RGB values of linear RGB ETC2. The result is sRGB-encoded values between $0.0$ and $1.0$ . The further conversion from an sRGB encoded component, $cs$ , to a linear component, $cl$ , is done according to the formulae in Section 18.3, “sRGB transfer functions”. Assume $cs$ is the sRGB component in the range [ $0$ , $1$ ].

14.3. Format RGBA ETC2

Each $4 \times 4$ block of RGBA8888 information is compressed to 128 bits. To decode a block, the two 64-bit integers int64bitAlpha and int64bitColor are calculated as described in Section 14.1, “Format RGB ETC2”. The RGB component is then decoded the same way as for RGB ETC2 (see Section 14.1, “Format RGB ETC2”), using int64bitColor as the int64bit codeword.

Table 60. Texel Data format for alpha part of RGBA ETC2 compressed textures

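a) bit layout in bits 63 through 48

| Bits | 63 to 56 | 55 to 52 | 51 to 48 |
| --- | --- | --- | --- |
| Field | base_codeword | multiplier | table index |

b) bit layout in bits 47 through 0, with pixels as named in Table 53, “Pixel layout for an ETC2 compressed block”, bits labelled from 0 being the LSB to 47 being the MSB.

| Bit | 47 | 46 | 45 | 44 | 43 | 42 | 41 | 40 | 39 | 38 | 37 | 36 | 35 | 34 | 33 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Index bit | a0 | a1 | a2 | b0 | b1 | b2 | c0 | c1 | c2 | d0 | d1 | d2 | e0 | e1 | e2 | f0 |

| Bit | 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Index bit | f1 | f2 | g0 | g1 | g2 | h0 | h1 | h2 | i0 | i1 | i2 | j0 | j1 | j2 | k0 | k1 |

| Bit | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Index bit | k2 | l0 | l1 | l2 | m0 | m1 | m2 | n0 | n1 | n2 | o0 | o1 | o2 | p0 | p1 | p2 |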

The 64 bits in int64bitAlpha used to decompress the alpha channel are laid out as shown in Table 60, “Texel Data format for alpha part of RGBA ETC2 compressed textures”. The information is split into two parts. The first 16 bits comprise a base codeword, a multiplier and a table codeword (the ‘table index’), which are used together to compute 8 pixel values to be used in the block. The remaining 48 bits are divided into 16 3-bit indices, which are used to select one of these 8 possible values for each pixel in the block.

The decoded value of a pixel is a value between $0$ and $255$ and is calculated the following way:

Equation 1. ETC2-base

\begin{align*} clamp255((base\_codeword) + modifier\times multiplier) \end{align*}

where $clamp255(\cdot)$ maps values outside the range [ $0$ , $255$ ] to $0.0$ or $255.0$ .

The $base\_codeword$ is stored in the first 8 bits (bits 63—56) as shown in Table 60, “Texel Data format for alpha part of RGBA ETC2 compressed textures” part (a). This is the first term in Equation ETC2-base.

Next, we want to obtain the modifier. Bits 51—48 in Table 60, “Texel Data format for alpha part of RGBA ETC2 compressed textures” part (a) form a 4-bit index used to select one of 16 pre-determined ‘modifier tables’, shown in Table 61, “Intensity modifier sets for RGBA ETC2 alpha component”.

Table 61. Intensity modifier sets for RGBA ETC2 alpha component

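| table index | modifier table |
| --- | --- |
| 0 | -3, -6, -9, -15, 2, 5, 8, 14 |
| 1 | -3, -7, -10, -13, 2, 6, 9, 12 |
| 2 | -2, -5, -8, -13, 1, 4, 7, 12 |
| 3 | -2, -4, -6, -13, 1, 3, 5, 12 |
| 4 | -3, -6, -8, -12, 2, 5, 7, 11 |
| 5 | -3, -7, -9, -11, 2, 6, 8, 10 |
| 6 | -4, -7, -8, -11, 3, 6, 7, 10 |
| 7 | -3, -5, -8, -11, 2, 4, 7, 10 |
| 8 | -2, -6, -8, -10, 1, 5, 7, 9 |
| 9 | -2, -5, -8, -10, 1, 4, 7, 9 |
| 10 | -2, -4, -8, -10, 1, 3, 7, 9 |
| 11 | -2, -5, -7, -10, 1, 4, 6, 9 |
| 12 | -3, -4, -7, -10, 2, 3, 6, 9 |
| 13 | -1, -2, -3, -10, 0, 1, 2, 9 |
| 14 | -4, -6, -8, -9, 3, 5, 7, 8 |
| 15 | -3, -5, -7, -9, 2, 4, 6, 8 |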


For example, a table index of $13$ (1101 binary) means that we should use table [ $-1$ , $-2$ , $-3$ , $-10$ , $0$ , $1$ , $2$ , $9$ ]. To select which of these values we should use, we consult the pixel index of the pixel we want to decode. As shown in Table 60, “Texel Data format for alpha part of RGBA ETC2 compressed textures” part (b), bits 47 to 0 are used to store a 3-bit index for each pixel in the block, selecting one of the 8 possible values. Assume we are interested in pixel $b$ . Its pixel index is stored in bits 44 to 42, with the most significant bit stored in bit 44 and the least significant bit stored in bit 42. If the pixel index is 011 binary = $3$ , this means we should take the entry at position $3$ in the table, counting from zero at the left, which is $-10$ . This is now our modifier, which is the starting point of our second term in the addition.

In the next step we obtain the multiplier value; bits 55 to 52 form a four-bit ‘multiplier’ between $0$ and $15$ . This value should be multiplied with the modifier. An encoder is not allowed to produce a multiplier of zero, but the decoder should still be able to handle this case as well (and produce $0\times$ modifier $= 0$ in that case).

The modifier times the multiplier now provides the second and final term in the sum in Equation ETC2-base. The sum is calculated and the value is clamped to the interval [ $0$ , $255$ ]. The resulting value is the 8-bit output value.

For example, assume a base_codeword of 103, a ‘table index’ of $13$ , a pixel index of 3 and a multiplier of 2. We will then start with the base codeword $103$ (01100111 binary). Next, a ‘table index’ of $13$ selects table $-1$ , $-2$ , $-3$ , $-10$ , $0$ , $1$ , $2$ , $9$ , and using a pixel index of 3 will result in a modifier of $-10$ . The multiplier is $2$ , forming $-10\times 2 = -20$ . We now add this to the base value and get $103-20 = 83$ . After clamping we still get $83$ = 01010011 binary. This is our 8-bit output value.
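Putting the steps together, the alpha decode for one pixel could be sketched as follows. The names `etc2_alpha_decode` and `eac_modifier_table` are hypothetical, and pixels are assumed to be numbered 0 (pixel a) through 15 (pixel p) so that pixel $n$ has its index in bits $47-3n$ to $45-3n$ .

```c
#include <stdint.h>

/* Table 61: intensity modifier sets, one row per table index. */
static const int8_t eac_modifier_table[16][8] = {
    { -3, -6,  -9, -15, 2, 5, 8, 14 }, { -3, -7, -10, -13, 2, 6, 9, 12 },
    { -2, -5,  -8, -13, 1, 4, 7, 12 }, { -2, -4,  -6, -13, 1, 3, 5, 12 },
    { -3, -6,  -8, -12, 2, 5, 7, 11 }, { -3, -7,  -9, -11, 2, 6, 8, 10 },
    { -4, -7,  -8, -11, 3, 6, 7, 10 }, { -3, -5,  -8, -11, 2, 4, 7, 10 },
    { -2, -6,  -8, -10, 1, 5, 7,  9 }, { -2, -5,  -8, -10, 1, 4, 7,  9 },
    { -2, -4,  -8, -10, 1, 3, 7,  9 }, { -2, -5,  -7, -10, 1, 4, 6,  9 },
    { -3, -4,  -7, -10, 2, 3, 6,  9 }, { -1, -2,  -3, -10, 0, 1, 2,  9 },
    { -4, -6,  -8,  -9, 3, 5, 7,  8 }, { -3, -5,  -7,  -9, 2, 4, 6,  8 },
};

/* Hypothetical helper: decodes the 8-bit alpha of one pixel (0 = a, 15 = p)
 * from the 64-bit alpha word laid out as in Table 60. */
static uint8_t etc2_alpha_decode(uint64_t int64bitAlpha, unsigned pixel)
{
    int base_codeword = (int)((int64bitAlpha >> 56) & 0xFF); /* bits 63..56 */
    int multiplier    = (int)((int64bitAlpha >> 52) & 0xF);  /* bits 55..52 */
    unsigned table    = (unsigned)((int64bitAlpha >> 48) & 0xF); /* bits 51..48 */
    unsigned shift    = 45u - 3u * (pixel & 15u);
    unsigned index    = (unsigned)((int64bitAlpha >> shift) & 0x7);

    int value = base_codeword + eac_modifier_table[table][index] * multiplier;
    return (uint8_t)(value < 0 ? 0 : (value > 255 ? 255 : value));
}
```

For a word with base_codeword 103, multiplier 2, table index 13 and pixel index 3 stored for pixel b, the sketch returns 83, in line with the worked example.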

This specification gives the output for each channel in 8-bit integer values between $0$ and $255$ , and these values all need to be divided by $255$ to obtain the final floating point representation.

Note that hardware can be effectively shared between the alpha decoding part of this format and that of R11 EAC texture. For details on how to reuse hardware, see Section 14.5, “Format Unsigned R11 EAC”.

14.4. Format RGBA ETC2 with sRGB encoding

Decompression of floating point sRGB values in RGBA ETC2 with sRGB encoding follows that of floating point RGB values of linear RGBA ETC2. The result is sRGB-encoded values between $0.0$ and $1.0$ . The further conversion from an sRGB encoded component, $cs$ , to a linear component, $cl$ , is done according to the formulae in Section 18.3, “sRGB transfer functions”. Assume $cs$ is the sRGB component in the range [ $0$ , $1$ ].

The alpha component of RGBA ETC2 with sRGB encoding is decoded in the same way as for linear RGBA ETC2.

14.5. Format Unsigned R11 EAC

The number of bits to represent a $4\times 4$ texel block is 64 bits if the format is R11 EAC. In that case the data for a block is stored as a number of bytes, $\{q_0, q_1, q_2, q_3, q_4, q_5, q_6, q_7\}$ , where byte $q_0$ is located at the lowest memory address and $q_7$ at the highest. The red component of the $4\times 4$ block is then represented by the following 64 bit integer:

\begin{align*} int64bit & = 256\times(256\times(256\times(256\times(256\times(256\times(256\times q_0+q_1)+q_2)+q_3)+q_4)+q_5)+q_6)+q_7 \end{align*}

This 64-bit word contains information about a single-channel $4\times 4$ pixel block as shown in Table 53, “Pixel layout for an ETC2 compressed block”. The 64-bit word is split into two parts. The first 16 bits comprise a base codeword, a table codeword and a multiplier. The remaining 48 bits are divided into 16 3-bit indices, which are used to select one of the 8 possible values for each pixel in the block, as shown in Table 60, “Texel Data format for alpha part of RGBA ETC2 compressed textures”.
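The byte ordering above is a big-endian load, and the fields can then be picked out with shifts and masks. A minimal sketch, with the hypothetical helper names `eac_load_block` and `eac_pixel_index` and pixels numbered 0 (pixel a) through 15 (pixel p):

```c
#include <stdint.h>

/* Assemble the 64-bit word from bytes q[0] (lowest address) to q[7]. */
static uint64_t eac_load_block(const uint8_t q[8])
{
    uint64_t word = 0;
    for (int i = 0; i < 8; ++i)
        word = 256u * word + q[i];
    return word;
}

/* Extract the 3-bit index of a pixel; pixel n occupies bits 47-3n .. 45-3n. */
static unsigned eac_pixel_index(uint64_t word, unsigned pixel)
{
    return (unsigned)((word >> (45u - 3u * (pixel & 15u))) & 0x7);
}
```

With this layout, the base codeword is `word >> 56`, the multiplier is `(word >> 52) & 0xF` and the table index is `(word >> 48) & 0xF`.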

The decoded value is calculated as:

\begin{align*} clamp1((base\_codeword+0.5)\times \frac{1}{255.875} + modifier\times multiplier \times \frac{1}{255.875}) \end{align*}

where $clamp1(\cdot)$ maps values outside the range [0.0, 1.0] to 0.0 or 1.0.

We will now go into detail how the decoding is done. The result will be an 11-bit fixed point number where $0$ represents $0.0$ and $2047$ represents $1.0$ . This is the exact representation for the decoded value. However, some implementations may use, e.g., 16-bits of accuracy for filtering. In such a case the 11-bit value will be extended to 16 bits in a predefined way, which we will describe later.

To get a value between $0$ and $2047$ we must multiply Equation R11 EAC-start by $2047.0$ :

\begin{align*} clamp2((base\_codeword+0.5)\times \frac{2047.0}{255.875} + modifier\times multiplier\times\frac{2047.0}{255.875}) \end{align*}

where $clamp2(\cdot)$ clamps to the range [ $0.0$ , $2047.0$ ]. Since $2047.0 \over 255.875$ is exactly $8.0$ , the above equation can be written as

Equation 2. Equation R11 EAC simple

\begin{align*} clamp2(base\_codeword \times 8 + 4 + modifier \times multiplier \times 8) \end{align*}

The base_codeword is stored in the first 8 bits as shown in Table 60, “Texel Data format for alpha part of RGBA ETC2 compressed textures” part (a). Bits 63—56 in each block represent an eight-bit integer (base_codeword) which is multiplied by 8 by shifting three steps to the left. We can add $4$ to this value without addition logic by just inserting 100 binary in the last three bits after the shift. For example, if base_codeword is $129$ = 10000001 binary (or 10000001b for short), the shifted value is 10000001000b and the shifted value including the $+4$ term is 10000001100b $= 1036 = 129 \times 8+4$ . Hence we have summed together the first two terms of the sum in Equation R11 EAC simple.

Next, we want to obtain the modifier. Bits 51-48 form a 4-bit index used to select one of 16 pre-determined ‘modifier tables’, shown in Table 61, “Intensity modifier sets for RGBA ETC2 alpha component”. For example, a table index of $13$ (1101 binary) means that we should use table [ $-1$ , $-2$ , $-3$ , $-10$ , $0$ , $1$ , $2$ , $9$ ]. To select which of these values we should use, we consult the pixel index of the pixel we want to decode. Bits 47—0 are used to store a 3-bit index for each pixel in the block, selecting one of the 8 possible values. Assume we are interested in pixel $b$ . Its pixel indices are stored in bit 44—42, with the most significant bit stored in 44 and the least significant bit stored in 42. If the pixel index is 011 binary = $3$ , this means we should take the value $3$ from the left in the table, which is $-10$ . This is now our modifier, which is the starting point of our second term in the sum.

In the next step we obtain the multiplier value; bits 55 to 52 form a four-bit ‘multiplier’ between $0$ and $15$ . We will later treat what happens if the multiplier value is zero, but if it is nonzero, it should be multiplied with the modifier. This product should then be shifted three steps to the left to implement the $\times 8$ multiplication. The result now provides the third and final term in the sum in Equation R11 EAC simple. The sum is calculated and the result is clamped to a value in the interval [ $0$ , $2047$ ]. The resulting value is the 11-bit output value.

For example, assume a base_codeword of $103$ , a ‘table index’ of $13$ , a pixel index of $3$ and a multiplier of $2$ . We will then first multiply the base_codeword $103$ (01100111b) by $8$ by left-shifting it (01100111000b) and then add $4$ resulting in 01100111100b $= 828 = 103\times 8+4$ . Next, a ‘table index’ of $13$ selects table [ $-1$ , $-2$ , $-3$ , $-10$ , $0$ , $1$ , $2$ , $9$ ], and using a pixel index of 3 will result in a modifier of $-10$ . The multiplier is nonzero, which means that we should multiply it with the modifier, forming $-10\times 2 = -20 =\:$ 111111101100b. This value should in turn be multiplied by 8 by left-shifting it three steps: 111101100000b $\:= -160$ . We now add this to the base value and get $828-160 = 668$ . After clamping we still get $668 =$ 01010011100b. This is our 11-bit output value, which represents the value ${668 \over 2047} = 0.32633121 \ldots$

If the multiplier_value is zero (i.e., the multiplier bits 55 to 52 are all zero), we should set the multiplier to $1.0\over 8.0$ . Equation R11 EAC simple can then be simplified to

\begin{align*} clamp2(base\_codeword\times 8 + 4 + modifier) \end{align*}

As an example, assume a base_codeword of 103, a ‘table index’ of 13, a pixel index of 3 and a multiplier_value of 0. We treat the base_codeword the same way, getting $828 = 103\times 8+4$ . The modifier is still -10. But the multiplier should now be $1 \over 8$ , which means that third term becomes $-10\times (1\over 8)\times 8 = -10$ . The sum therefore becomes $828-10 = 818$ . After clamping we still get $818 =\:$ 01100110010b, and this is our 11-bit output value, and it represents ${818 \over 2047} = 0.39960918 \ldots$
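Both cases, nonzero and zero multiplier, can be folded into one small routine. The following is a sketch rather than a reference implementation; the name `r11_unsigned_decode` is hypothetical, and the modifier is assumed to have been looked up already from Table 61 using the table index and pixel index.

```c
#include <stdint.h>

/* Hypothetical helper: returns the 11-bit unsigned R11 EAC value (0..2047)
 * for an 8-bit base codeword, a modifier from Table 61 and the 4-bit
 * multiplier field. */
static uint16_t r11_unsigned_decode(unsigned base_codeword, int modifier,
                                    unsigned multiplier)
{
    /* base_codeword * 8 + 4: shift left by three and insert 100 binary. */
    int value = (int)(((base_codeword & 0xFFu) << 3) | 4u);

    if (multiplier != 0)
        value += modifier * (int)multiplier * 8;
    else
        value += modifier;   /* multiplier of 1/8 cancels the factor of 8 */

    if (value < 0)
        value = 0;
    if (value > 2047)
        value = 2047;
    return (uint16_t)value;
}
```

The two worked examples in the text correspond to `r11_unsigned_decode(103, -10, 2)` and `r11_unsigned_decode(103, -10, 0)`, which give 668 and 818 respectively.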

Some OpenGL ES implementations may find it convenient to use 16-bit values for further processing. In this case, the 11-bit value should be extended using bit replication. An 11-bit value x is extended to 16 bits through $(x\ll 5) + (x \gg 6)$ . For example, the value $668 =\:$ 01010011100b should be extended to 0101001110001010b $\: = 21386$ .

In general, the implementation may extend the value to any number of bits that is convenient for further processing, e.g., 32 bits. In these cases, bit replication should be used. On the other hand, an implementation is not allowed to truncate the 11-bit value to less than 11 bits.
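The 16-bit extension given above, $(x\ll 5) + (x \gg 6)$ , can be written directly in C. The function name `r11_extend_11to16` is hypothetical.

```c
#include <stdint.h>

/* Extend an 11-bit unsigned value (0..2047) to 16 bits by bit replication:
 * the 11 bits become the top bits of the result and the five most
 * significant bits of x are repeated in the five low bits. */
static uint16_t r11_extend_11to16(uint16_t x)
{
    x &= 0x7FF;
    return (uint16_t)((x << 5) + (x >> 6));
}
```

With this mapping, 0 stays 0 and 2047 becomes 65535, so the end points of the 11-bit range map onto the end points of the 16-bit range.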

Note that the method does not have the same reconstruction levels as the alpha part in the RGBA ETC2 format. For instance, for a base_value of $255$ and a table_value of $0$ , the alpha part of the RGBA ETC2 format will represent a value of ${(255+0)\over 255.0} = 1.0$ exactly. In R11 EAC the same base_value and table_value will instead represent ${(255.5+0)\over 255.875} = 0.99853444 \ldots$ That said, it is still possible to decode the alpha part of the RGBA ETC2 format using R11 EAC hardware. This is done by truncating the 11-bit number to 8 bits. As an example, if base_value = $255$ and table_value = $0$ , we get the 11-bit value $(255\times 8+4+0)$ = $2044$ = 11111111100b, which after truncation becomes the 8-bit value 11111111b = $255$ , which is exactly the correct value according to RGBA ETC2. Clamping has to be done to [ $0$ , $255$ ] after truncation for RGBA ETC2 decoding. Care must also be taken to handle the case when the multiplier value is zero. In the 11-bit version, this means multiplying by $1 \over 8$ , but in the 8-bit version, it really means multiplication by $0$ . Thus, the decoder will have to know if it is an RGBA ETC2 texture or an R11 EAC texture to decode correctly, but the hardware can be 100% shared.

As stated above, a base_value of $255$ and a table_value of $0$ will represent a value of ${(255.5+0) \over 255.875} = 0.99853444 \ldots$ , and this does not reach $1.0$ even though $255$ is the highest possible base_codeword. However, it is still possible to reach a pixel value of $1.0$ since a modifier other than $0$ can be used. Indeed, half of the modifiers will often produce a value of $1.0$ . As an example, assume we choose the base_value $255$ , a multiplier of $1$ and the modifier table [ $-3$ $-5$ $-7$ $-9$ $2$ $4$ $6$ $8$ ]. Starting with Equation R11 EAC-start,

\begin{align*} clamp1((base\_codeword+0.5)\times \frac{1}{255.875} + table\_value \times multiplier \times \frac{1}{255.875}) \end{align*}

we get

\begin{align*} clamp1((255+0.5)\times \frac{1}{255.875} + \left[ \begin{array}{cccccccc} -3 & -5 & -7 &-9 & 2 & 4 & 6 & 8 \end{array}\right] \times \frac{1}{255.875}) \end{align*}

which equals

\begin{align*} clamp1(\left[ \begin{array}{cccccccc} 0.987 & 0.979 & 0.971 & 0.963 & 1.01 & 1.01 & 1.02 & 1.03 \end{array}\right]) \end{align*}

or after clamping

\begin{align*} \left[ \begin{array}{cccccccc} 0.987 & 0.979 & 0.971 & 0.963 & 1.00 & 1.00 & 1.00 & 1.00\end{array}\right] \end{align*}

which shows that several values can be $1.0$ , even though the base value does not reach $1.0$ . The same reasoning goes for $0.0$ .

14.6. Format Unsigned RG11 EAC

The number of bits to represent a $4\times 4$ texel block is 128 bits if the format is RG11 EAC. In that case the data for a block is stored as a number of bytes, $\{q_0, q_1, q_2, q_3, q_4, q_5, q_6, q_7, p_0, p_1, p_2, p_3, p_4, p_5, p_6, p_7\}$ where byte $q_0$ is located at the lowest memory address and $p_7$ at the highest. The 128 bits specifying the block are then represented by the following two 64 bit integers:

\begin{align*} \mathrm{int64bit0} & = 256\times (256\times (256\times (256\times (256\times (256\times (256\times q_0+q_1)+q_2)+q_3)+q_4)+q_5)+q_6)+q_7 \\ \mathrm{int64bit1} & = 256\times (256\times (256\times (256\times (256\times (256\times (256\times p_0+p_1)+p_2)+p_3)+p_4)+p_5)+p_6)+p_7 \end{align*}

The 64-bit word int64bit0 contains information about the red component of a two-channel $4\times 4$ pixel block as shown in Table 53, “Pixel layout for an ETC2 compressed block”, and the word int64bit1 contains information about the green component. Both 64-bit integers are decoded in the same way as R11 EAC described in Section 14.5, “Format Unsigned R11 EAC”.

14.7. Format Signed R11 EAC

The number of bits to represent a $4\times 4$ texel block is 64 bits if the format is signed R11 EAC. In that case the data for a block is stored as a number of bytes, $\{q_0, q_1, q_2, q_3, q_4, q_5, q_6, q_7\}$ , where byte $q_0$ is located at the lowest memory address and $q_7$ at the highest. The red component of the $4\times 4$ block is then represented by the following 64 bit integer:

\begin{align*} \mathrm{int64bit} & = 256\times(256\times(256\times(256\times(256\times(256\times(256\times q_0+q_1)+q_2)+q_3)+q_4)+q_5)+q_6)+q_7 \end{align*}

This 64-bit word contains information about a single-channel $4\times 4$ pixel block as shown in Table 53, “Pixel layout for an ETC2 compressed block”. The 64-bit word is split into two parts. The first 16 bits comprise a base codeword, a table codeword and a multiplier. The remaining 48 bits are divided into 16 3-bit indices, which are used to select one of the 8 possible values for each pixel in the block, as shown in Table 60, “Texel Data format for alpha part of RGBA ETC2 compressed textures”.

The decoded value is calculated as

Equation 3. R11-start

\begin{align*} clamp1(base\_codeword\times \frac{1}{127.875} + \textit{modifier}\times \textit{multiplier}\times \frac{1}{127.875}) \end{align*}

where $clamp1(\cdot)$ maps values outside the range [ $-1.0$ , $1.0$ ] to $-1.0$ or $1.0$ . We will now go into detail how the decoding is done. The result will be an 11-bit two’s-complement fixed point number where $-1023$ represents $-1.0$ and $1023$ represents $1.0$ . This is the exact representation for the decoded value. However, some implementations may use, e.g., 16-bits of accuracy for filtering. In such a case the 11-bit value will be extended to 16 bits in a predefined way, which we will describe later.

To get a value between $-1023$ and $1023$ we must multiply Equation R11-start by $1023.0$ :

\begin{align*} clamp2(base\_codeword\times \frac{1023.0}{127.875} + modifier\times multiplier\times \frac{1023.0}{127.875}) \end{align*}

where clamp2(.) clamps to the range [ $-1023.0$ , $1023.0$ ]. Since $1023.0\over 127.875$ is exactly $8$ , the above formula can be written as:

\begin{align*} clamp2(base\_codeword\times 8 + modifier\times multiplier \times 8) \end{align*}

The base_codeword is stored in the first 8 bits as shown in Table 60, “Texel Data format for alpha part of RGBA ETC2 compressed textures” part (a). It is a two’s-complement value in the range [ $-127$ , $127$ ]; the value $-128$ is not allowed, but if it occurs anyway it must be treated as $-127$ . The base_codeword is then multiplied by $8$ by shifting it left three steps. For example the value $65$ = 01000001 binary (or 01000001b for short) is shifted to 01000001000b $\:= 520 = 65\times 8$ .

Next, we want to obtain the modifier. Bits 51—48 form a 4-bit index used to select one of 16 pre-determined ‘modifier tables’, shown in Table 61, “Intensity modifier sets for RGBA ETC2 alpha component”. For example, a table index of $13$ (1101 binary) means that we should use table $-1$ , $-2$ , $-3$ , $-10$ , $0$ , $1$ , $2$ , $9$ . To select which of these values we should use, we consult the pixel index of the pixel we want to decode. Bits 47—0 are used to store a 3-bit index for each pixel in the block, selecting one of the 8 possible values. Assume we are interested in pixel $b$ . Its pixel indices are stored in bit 44—42, with the most significant bit stored in 44 and the least significant bit stored in 42. If the pixel index is 011 binary = $3$ , this means we should take the value $3$ from the left in the table, which is $-10$ . This is now our modifier, which is the starting point of our second term in the sum.

In the next step we obtain the multiplier value; bits 55 to 52 form a four-bit ‘multiplier’ between $0$ and $15$ . We will later treat what happens if the multiplier value is zero, but if it is nonzero, it should be multiplied with the modifier. This product should then be shifted three steps to the left to implement the $\times 8$ multiplication. The result now provides the second and final term in the sum in the simplified equation above. The sum is calculated and the result is clamped to a value in the interval [ $-1023$ , $1023$ ]. The resulting value is the 11-bit output value.

For example, assume a base_codeword of $60$ , a ‘table index’ of $13$ , a pixel index of $3$ and a multiplier of $2$ . We start by multiplying the base_codeword (00111100b) by $8$ using bit shift, resulting in (00111100000b) $\:= 480 = 60\times 8$ . Next, a ‘table index’ of $13$ selects table [ $-1$ , $-2$ , $-3$ , $-10$ , $0$ , $1$ , $2$ , $9$ ], and using a pixel index of $3$ will result in a modifier of $-10$ . The multiplier is nonzero, which means that we should multiply it with the modifier, forming $-10\times 2$ = $-20$ = 111111101100b. This value should in turn be multiplied by $8$ by left-shifting it three steps: 111101100000b = $-160$ . We now add this to the base value and get $480-160 = 320$ . After clamping we still get $320$ = 00101000000b. This is our 11-bit output value, which represents the value ${320\over 1023} = 0.31280547 \ldots$ .

If the multiplier_value is zero (i.e., the multiplier bits 55 to 52 are all zero), we should set the multiplier to $1.0 \over 8.0$ . The simplified equation above can then be further simplified to:

\begin{align*} clamp2(base\_codeword \times 8 + modifier) \end{align*}

As an example, assume a base_codeword of $60$ , a ‘table index’ of $13$ , a pixel index of $3$ and a multiplier_value of $0$ . We treat the base_codeword the same way, getting $480 = 60\times 8$ . The modifier is still $-10$ . But the multiplier should now be $1 \over 8$ , which means that the second term becomes $-10\times({1 \over 8})\times 8 = -10$ . The sum therefore becomes $480-10 = 470$ . Clamping does not affect the value since it is already in the range [ $-1023$ , $1023$ ], and the 11-bit output value is therefore $470$ = 00111010110b. This represents ${470\over 1023} = 0.45943304 \dots$
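The signed decode, including the special handling of a base codeword of $-128$ and of a zero multiplier, can be sketched as below. The name `r11_signed_decode` is hypothetical, and the modifier is assumed to have been selected already from Table 61.

```c
#include <stdint.h>

/* Hypothetical helper: returns the 11-bit signed R11 EAC value in the
 * range -1023..1023 for a two's-complement base codeword, a modifier from
 * Table 61 and the 4-bit multiplier field. */
static int16_t r11_signed_decode(int8_t base_codeword, int modifier,
                                 unsigned multiplier)
{
    /* -128 is not allowed; if it occurs it is treated as -127. */
    int base = (base_codeword == -128) ? -127 : base_codeword;
    int value = base * 8;

    if (multiplier != 0)
        value += modifier * (int)multiplier * 8;
    else
        value += modifier;   /* multiplier of 1/8 cancels the factor of 8 */

    if (value < -1023)
        value = -1023;
    if (value > 1023)
        value = 1023;
    return (int16_t)value;
}
```

For a base_codeword of 60 and a modifier of $-10$ , the sketch yields 320 with a multiplier of 2 and 470 with a multiplier field of 0.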

Some OpenGL ES implementations may find it convenient to use two’s-complement 16-bit values for further processing. In this case, a positive 11-bit value should be extended using bit replication on all the bits except the sign bit. An 11-bit value x is extended to 16 bits through $(x \ll 5) + (x \gg 5)$ . Since the sign bit is zero for a positive value, no addition logic is needed for the bit replication in this case. For example, the value $470$ = 00111010110b in the above example should be expanded to 0011101011001110b = $15054$ . A negative 11-bit value must first be made positive before bit replication, and then made negative again:

if (result11bit >= 0) {
  result16bit = (result11bit << 5) + (result11bit >> 5);
} else {
  result11bit = -result11bit;
  result16bit = (result11bit << 5) + (result11bit >> 5);
  result16bit = -result16bit;
}

Simply bit replicating a negative number without first making it positive will not give a correct result.

In general, the implementation may extend the value to any number of bits that is convenient for further processing, e.g., 32 bits. In these cases, bit replication according to the above should be used. On the other hand, an implementation is not allowed to truncate the 11-bit value to less than 11 bits.

Note that it is not possible to specify a base value of $1.0$ or $-1.0$ . The largest possible base_codeword is $+127$ , which represents ${127 \over 127.875} = 0.993\ldots$ However, it is still possible to reach a pixel value of $1.0$ or $-1.0$ , since the base value is modified by the table before the pixel value is calculated. Indeed, half of the modifiers will often produce a value of $1.0$ . As an example, assume the base_codeword is $+127$ , the modifier table is [ $-3$ $-5$ $-7$ $-9$ $2$ $4$ $6$ $8$ ] and the multiplier is one. Starting with Equation R11-start,

\begin{align*} base\_codeword\times \frac{1}{127.875} + modifier\times multiplier\times \frac{1}{127.875} \end{align*}

we get

\begin{align*} \frac{127}{127.875} + \left[\begin{array}{cccccccc} -3 & -5 & -7 & -9 & 2 & 4 & 6 & 8 \end{array}\right] \times \frac{1}{127.875} \end{align*}

which equals

\begin{align*} \left[ \begin{array}{cccccccc} 0.970 & 0.954 & 0.938 & 0.923 & 1.01 & 1.02 & 1.04 &1.06\end{array}\right] \end{align*}

or after clamping

\begin{align*} \left[ \begin{array}{cccccccc} 0.970 & 0.954 & 0.938 & 0.923 & 1.00 & 1.00 & 1.00 & 1.00 \end{array}\right] \end{align*}

This shows that it is indeed possible to arrive at the value $1.0$ . The same reasoning goes for $-1.0$ .

Note also that the two simplified equations for signed R11 EAC above (the general form and its zero-multiplier variant) are very similar to their counterparts in the unsigned version, R11 EAC (Equation R11 EAC simple and its zero-multiplier variant). Apart from the $+4$ , the clamping and the extension to bit sizes other than 11, the same decoding hardware can be shared between the two codecs.

14.8. Format Signed RG11 EAC

The number of bits to represent a $4\times 4$ texel block is 128 bits if the format is signed RG11 EAC. In that case the data for a block is stored as a number of bytes, $\{q_0, q_1, q_2, q_3, q_4, q_5, q_6, q_7, p_0, p_1, p_2, p_3, p_4, p_5, p_6, p_7\}$ where byte $q_0$ is located at the lowest memory address and $p_7$ at the highest. The 128 bits specifying the block are then represented by the following two 64 bit integers:

\begin{align*} \mathrm{int64bit0} & = 256\times (256\times (256\times (256\times (256\times (256\times (256\times q_0+q_1)+q_2)+q_3)+q_4)+q_5)+q_6)+q_7 \\ \mathrm{int64bit1} & = 256\times (256\times (256\times (256\times (256\times (256\times (256\times p_0+p_1)+p_2)+p_3)+p_4)+p_5)+p_6)+p_7 \end{align*}

The 64-bit word int64bit0 contains information about the red component of a two-channel $4\times 4$ pixel block as shown in Table 53, “Pixel layout for an ETC2 compressed block”, and the word int64bit1 contains information about the green component. Both 64-bit integers are decoded in the same way as signed R11 EAC described in Section 14.7, “Format Signed R11 EAC”.
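The byte ordering described above can be sketched as follows. The function names `eac_pack_word` and `eac_rg11_split` are hypothetical; the code only assembles the two 64-bit words and does not decode them.

```c
#include <stdint.h>

/* Combine eight bytes, lowest memory address first, into one 64-bit word.
 * This is the nested 256 * (...) + byte expression written as a loop. */
uint64_t eac_pack_word(const uint8_t bytes[8])
{
    uint64_t word = 0;

    for (int i = 0; i < 8; ++i)
        word = 256 * word + bytes[i];
    return word;
}

/* Split a 16-byte signed RG11 EAC block into int64bit0 (red, bytes q0..q7)
 * and int64bit1 (green, bytes p0..p7). */
void eac_rg11_split(const uint8_t block[16], uint64_t *red, uint64_t *green)
{
    *red = eac_pack_word(block);
    *green = eac_pack_word(block + 8);
}
```

Because the words are built arithmetically from individual bytes, the result does not depend on the endianness of the host.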

14.9. Format RGB ETC2 with punchthrough alpha

For RGB ETC2 with punchthrough alpha, each 64-bit word contains information about a four-channel $4 \times 4$ pixel block as shown in Table 53, “Pixel layout for an ETC2 compressed block”.

The blocks are compressed using one of four different ‘modes’. Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (a) shows the bits used for determining the mode used in a given block.

Table 62. Texel Data format for punchthrough alpha ETC2 compressed texture formats

a) location of bits for mode selection

Bits    63..59  58..56  55..51  50..48  47..43  42..40  39..34  33  32
Field   R       dR      G       dG      B       dB      ……      Op  .

b) bit layout for bits 63 through 32 for ‘differential’ mode:

Bits    63..59  58..56  55..51  50..48  47..43  42..40  39..37  36..34  33  32
Field   R       dR      G       dG      B       dB      table1  table2  Op  $F_B$

c) bit layout for bits 63 through 32 for ‘T’ mode:

Bits    63..61  60..59  58  57..56  55..52  51..48  47..44  43..40  39..36  35..34  33  32
Field   ...     R1a     .   R1b     G1      B1      R2      G2      B2      da      Op  db

d) bit layout for bits 63 through 32 for ‘H’ mode:

Bits    63  62..59  58..56  55..53  52   51   50  49..47  46..43  42..39  38..35  34  33  32
Field   .   R1      G1a     ...     G1b  B1a  .   B1b     R2      G2      B2      da  Op  db

e) bit layout for bits 31 through 0 for ‘diff’, ‘T’ and ‘H’ modes

Bits    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
Field   p0  o0  n0  m0  l0  k0  j0  i0  h0  g0  f0  e0  d0  c0  b0  a0

Bits    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
Field   p1  o1  n1  m1  l1  k1  j1  i1  h1  g1  f1  e1  d1  c1  b1  a1

f) bit layout for bits 63 through 0 for ‘planar’ mode:

Bits    63  62..57  56   55  54..49  48   47..45  44..43  42  41..39  38..34  33  32
Field   .   RO      GO1  .   GO2     BO1  ...     BO2     .   BO3     RH1     1   RH2

Bits    31..25  24..19  18..13  12..6  5..0
Field   GH      BH      RV      GV     BV

To determine the mode, the three 5-bit values R, G and B, and the three 3-bit values dR, dG and dB are examined. R, G and B are treated as integers between 0 and 31 and dR, dG and dB as two’s-complement integers between $-4$ and $+3$ . First, R and dR are added, and if the sum is not within the interval [ $0$ , $31$ ], the ‘T’ mode is selected. Otherwise, if the sum of G and dG is outside the interval [ $0$ , $31$ ], the ‘H’ mode is selected. Otherwise, if the sum of B and dB is outside of the interval [ $0$ , $31$ ], the ‘planar’ mode is selected. Finally, if all of the aforementioned sums lie between $0$ and $31$ , the ‘differential’ mode is selected.
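The mode selection rule can be sketched in C as follows. The sketch assumes that `hi` holds bits 63..32 of the block (so bit 63 of the block is bit 31 of `hi`) and that the fields sit at the positions given in part (a) of Table 62; the enum and function names are hypothetical.

```c
#include <stdint.h>

enum etc2_pt_mode {
    ETC2_PT_DIFFERENTIAL,
    ETC2_PT_T,
    ETC2_PT_H,
    ETC2_PT_PLANAR
};

/* Interpret the low three bits of v as a two's-complement value in [-4, +3]. */
static int sign_extend_3bit(uint32_t v)
{
    v &= 7u;
    return (v & 4u) ? (int)v - 8 : (int)v;
}

/* hi holds bits 63..32 of the 64-bit block. */
enum etc2_pt_mode etc2_pt_select_mode(uint32_t hi)
{
    int r = (int)((hi >> 27) & 31u), dr = sign_extend_3bit(hi >> 24);
    int g = (int)((hi >> 19) & 31u), dg = sign_extend_3bit(hi >> 16);
    int b = (int)((hi >> 11) & 31u), db = sign_extend_3bit(hi >> 8);

    /* The tests are ordered: red first, then green, then blue. */
    if (r + dr < 0 || r + dr > 31)
        return ETC2_PT_T;
    if (g + dg < 0 || g + dg > 31)
        return ETC2_PT_H;
    if (b + db < 0 || b + db > 31)
        return ETC2_PT_PLANAR;
    return ETC2_PT_DIFFERENTIAL;
}
```

The order of the tests matters: a block whose red and green sums both fall outside [0, 31] is a ‘T’ block, not an ‘H’ block.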

The layout of the bits used to decode the ‘differential’ mode is shown in Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (b). In this mode, the $4 \times 4$ block is split into two subblocks of either size $2\times 4$ or $4\times 2$ . This is controlled by bit 32, which we dub the ‘flip bit’. If the ‘flip bit’ is 0, the block is divided into two $2\times 4$ subblocks side-by-side, as shown in Table 55, “Two 2-by-4-pixel ETC2 subblocks side-by-side.”. If the ‘flip bit’ is 1, the block is divided into two $4\times 2$ subblocks on top of each other, as shown in Table 56, “Two 4-by-2-pixel ETC2 subblocks on top of each other.”. For each subblock, a ‘base color’ is stored.

In the ‘differential’ mode, following the layout shown in Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (b), the base color for subblock 1 is derived from the five-bit codewords R, G and B. These five-bit codewords are extended to eight bits by replicating the three highest-order bits into the three lowest-order bits. For instance, if R = $28$ = 11100 binary (11100b for short), the resulting eight-bit red color component becomes 11100111b = $231$. Likewise, if G = $4$ = 00100b and B = $3$ = 00011b, the green and blue components become 00100001b = $33$ and 00011000b = $24$ respectively. Thus, in this example, the base color for subblock 1 is ($231$, $33$, $24$). The five-bit representation for the base color of subblock 2 is obtained by modifying the five-bit codewords R, G and B by the codewords dR, dG and dB. Each of dR, dG and dB is a 3-bit two’s-complement number that can hold values between $-4$ and $+3$. For instance, if R = $28$ as above, and dR = 100b = $-4$, then the five-bit representation for the red color component is $28+(-4)=24$ = 11000b, which is then extended to eight bits as 11000110b = $198$. Likewise, if G = $4$, dG = $2$, B = $3$ and dB = $0$, the base color of subblock 2 will be RGB = ($198$, $49$, $24$). In summary, the base colors for the subblocks in the differential mode are:

\begin{align*} \mathit{base}\:\mathit{col}\:\mathit{subblock1} & = \mathit{extend\_5to8bits}(R, G, B) \\ \mathit{base}\:\mathit{col}\:\mathit{subblock2} & = \mathit{extend\_5to8bits}(R+dR, G+dG, B+dB) \end{align*}

Note that these additions will not under- or overflow, or one of the alternative decompression modes would have been chosen instead of the ‘differential’ mode.
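A minimal sketch of extend_5to8bits() and of the two base colors follows. It assumes the caller has already sign-extended dR, dG and dB and has verified that the block is in ‘differential’ mode; the name `etc2_differential_base_colors` is hypothetical.

```c
#include <stdint.h>

/* Extend a 5-bit value to 8 bits by copying its three highest-order bits
 * into the three lowest-order bits of the result. */
uint8_t extend_5to8bits(unsigned value5)
{
    value5 &= 31u;
    return (uint8_t)((value5 << 3) | (value5 >> 2));
}

/* rgb5 holds the codewords R, G, B; delta holds dR, dG, dB as signed values.
 * Every sum rgb5[i] + delta[i] is assumed to lie in [0, 31]. */
void etc2_differential_base_colors(const unsigned rgb5[3], const int delta[3],
                                   uint8_t base1[3], uint8_t base2[3])
{
    for (int i = 0; i < 3; ++i) {
        base1[i] = extend_5to8bits(rgb5[i]);
        base2[i] = extend_5to8bits((unsigned)((int)rgb5[i] + delta[i]));
    }
}
```

Called with (R, G, B) = (28, 4, 3) and (dR, dG, dB) = (-4, 2, 0), this reproduces the base colors (231, 33, 24) and (198, 49, 24) from the example above.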

Table 63. ETC2 intensity modifier sets for the ‘differential’ if ‘opaque’ is set.

table codeword    modifier table
0                   -8    -2     2     8
1                  -17    -5     5    17
2                  -29    -9     9    29
3                  -42   -13    13    42
4                  -60   -18    18    60
5                  -80   -24    24    80
6                 -106   -33    33   106
7                 -183   -47    47   183


Table 64. ETC2 intensity modifier sets for the ‘differential’ if ‘opaque’ is unset.

table codeword    modifier table
0                   -8     0     0     8
1                  -17     0     0    17
2                  -29     0     0    29
3                  -42     0     0    42
4                  -60     0     0    60
5                  -80     0     0    80
6                 -106     0     0   106
7                 -183     0     0   183


After obtaining the base color, a table is chosen using the table codewords: for subblock 1, table codeword 1 is used (bits 39..37), and for subblock 2, table codeword 2 is used (bits 36..34); see Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (b). The table codeword is used to select one of eight modifier tables. If the ‘opaque’-bit (bit 33) is set, Table 63, “ETC2 intensity modifier sets for the ‘differential’ if ‘opaque’ is set.” is used. If it is unset, Table 64, “ETC2 intensity modifier sets for the ‘differential’ if ‘opaque’ is unset.” is used. For instance, if the ‘opaque’-bit is 1 and the table codeword is 010 binary = 2, then the modifier table [$-29$, $-9$, $9$, $29$] is selected for the corresponding subblock. Note that the values in Table 63, “ETC2 intensity modifier sets for the ‘differential’ if ‘opaque’ is set.” and Table 64, “ETC2 intensity modifier sets for the ‘differential’ if ‘opaque’ is unset.” are valid for all textures and can therefore be hardcoded into the decompression unit.

Next, we identify which modifier value to use from the modifier table using the two ‘pixel index’ bits. The pixel index bits are unique for each pixel. For instance, the pixel index for pixel d (see Table 53, “Pixel layout for an ETC2 compressed block”) can be found in bits 19 (most significant bit, MSB) and 3 (least significant bit, LSB); see Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (e). Note that the pixel index for a particular texel is always stored in the same bit position, irrespective of the ‘flip bit’.

If the ‘opaque’-bit (bit 33) is set, the pixel index bits are decoded using Table 65, “ETC2 mapping from pixel index values to modifier values when ‘opaque’-bit is set.”. If the ‘opaque’-bit is unset, Table 66, “ETC2 mapping from pixel index values to modifier values when ‘opaque’-bit is unset.” will be used instead. If, for instance, the ‘opaque’-bit is 1, and the pixel index bits are 01 binary = $1$ , and the modifier table [ $-29$ , $-9$ , $9$ , $29$ ] is used, then the modifier value selected for that pixel is $29$ (see Table 65, “ETC2 mapping from pixel index values to modifier values when ‘opaque’-bit is set.”). This modifier value is now used to additively modify the base color. For example, if we have the base color ( $231$ , $8$ , $16$ ), we should add the modifier value $29$ to all three components: ( $231+29$ , $8+29$ , $16+29$ ) resulting in ( $260$ , $37$ , $45$ ). These values are then clamped to [ $0$ , $255$ ], resulting in the color ( $255$ , $37$ , $45$ ).

Table 65. ETC2 mapping from pixel index values to modifier values when ‘opaque’-bit is set.

Pixel index value    Resulting modifier value
msb   lsb
1     1              -b (large negative value)
1     0              -a (small negative value)
0     0              a (small positive value)
0     1              b (large positive value)


Table 66. ETC2 mapping from pixel index values to modifier values when ‘opaque’-bit is unset.

Pixel index value    Resulting modifier value
msb   lsb
1     1              -b (large negative value)
1     0              0 (zero)
0     0              0 (zero)
0     1              b (large positive value)


The alpha component is decoded using the ‘opaque’-bit, which is positioned in bit 33 (see Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (b)). If the ‘opaque’-bit is set, alpha is always $255$ . However, if the ‘opaque’-bit is zero, the alpha-value depends on the pixel indices; if MSB==1 and LSB==0, the alpha value will be zero, otherwise it will be $255$ . Finally, if the alpha value equals $0$ , the red-, green- and blue components will also be zero.

if (opaque == 0 && MSB == 1 && LSB == 0) {
  red = 0;
  green = 0;
  blue = 0;
  alpha = 0;
} else {
  alpha = 255;
}

Hence a pixel with pixel index 2 (MSB == 1, LSB == 0) will equal RGBA = ($0$, $0$, $0$, $0$) if opaque == 0.

In the example above, assume that the ‘opaque’-bit was instead 0. Then, since MSB = 0 and LSB = 1, alpha will be $255$, and the final decoded RGBA-tuple will be ($255$, $37$, $45$, $255$).
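The per-pixel steps for the ‘differential’ mode (table selection, pixel index mapping, clamping and alpha) can be combined into one sketch. The function and array names are hypothetical; the arrays transcribe Table 63 and Table 64, and `pixel_index` is assumed to be (MSB << 1) | LSB.

```c
#include <stdint.h>

/* Table 63: modifier sets used when the 'opaque'-bit is set. */
static const int etc2_modifiers_opaque[8][4] = {
    {   -8,  -2,  2,   8 }, {  -17,  -5,  5,  17 },
    {  -29,  -9,  9,  29 }, {  -42, -13, 13,  42 },
    {  -60, -18, 18,  60 }, {  -80, -24, 24,  80 },
    { -106, -33, 33, 106 }, { -183, -47, 47, 183 }
};

/* Table 64: modifier sets used when the 'opaque'-bit is unset. */
static const int etc2_modifiers_punchthrough[8][4] = {
    {   -8, 0, 0,   8 }, {  -17, 0, 0,  17 },
    {  -29, 0, 0,  29 }, {  -42, 0, 0,  42 },
    {  -60, 0, 0,  60 }, {  -80, 0, 0,  80 },
    { -106, 0, 0, 106 }, { -183, 0, 0, 183 }
};

static uint8_t clamp255(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Decode one pixel of a 'differential' subblock into rgba[4]. */
void etc2_pt_decode_pixel(const uint8_t base[3], unsigned table_codeword,
                          int opaque, unsigned pixel_index, uint8_t rgba[4])
{
    /* Tables 65 and 66: index 0 selects the third table column,
     * 1 the fourth, 2 the second and 3 the first. */
    static const int column[4] = { 2, 3, 1, 0 };
    const int (*table)[4] =
        opaque ? etc2_modifiers_opaque : etc2_modifiers_punchthrough;
    int modifier = table[table_codeword & 7u][column[pixel_index & 3u]];

    if (!opaque && (pixel_index & 3u) == 2u) {
        /* MSB == 1 and LSB == 0: fully transparent black. */
        rgba[0] = rgba[1] = rgba[2] = rgba[3] = 0;
        return;
    }
    for (int i = 0; i < 3; ++i)
        rgba[i] = clamp255(base[i] + modifier);
    rgba[3] = 255;
}
```

For the base color (231, 8, 16), table codeword 2 and pixel index 1, the sketch returns (255, 37, 45, 255) whether the ‘opaque’-bit is 0 or 1, matching the example in the text.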

The ‘T’ and ‘H’ compression modes share some characteristics: both use two base colors stored using 4 bits per channel. These bits are not stored sequentially, but in the layout shown in Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (c) and Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (d). To clarify, in the ‘T’ mode, the two colors are constructed as follows:

\begin{align*} \mathit{base}\:\mathit{col}\:\mathit{1} & = \mathit{extend\_4to8bits}(\: (R1a \ll 2)\: | \: R1b, \: G1, \: B1) \\ \mathit{base}\:\mathit{col}\:\mathit{2} & =\mathit{extend\_4to8bits}(R2, G2, B2) \end{align*}

In the ‘H’ mode, the two colors are constructed as follows:

\begin{align*} \mathit{base}\:\mathit{col}\:\mathit{1} & = \mathit{extend\_4to8bits}(R1,\: (G1a \ll 1) \: | \: G1b,\: (B1a \ll 3)\: | \: B1b) \\ \mathit{base}\:\mathit{col}\:\mathit{2} & = \mathit{extend\_4to8bits}(R2, G2, B2) \end{align*}

The function extend_4to8bits() just replicates the four bits twice. This is equivalent to multiplying by $17$ . As an example, extend_4to8bits(1101b) equals 11011101b = $221$ .

Both the ‘T’ and ‘H’ modes have four ‘paint colors’ which are the colors that will be used in the decompressed block, but they are assigned in a different manner. In the ‘T’ mode, ‘paint color 0’ is simply the first base color, and ‘paint color 2’ is the second base color. To obtain the other ‘paint colors’, a ‘distance’ is first determined, which will be used to modify the luminance of one of the base colors. This is done by combining the values ‘da’ and ‘db’ shown in Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (c) by $(da\ll 1)|db$ , and then using this value as an index into the small look-up table shown in Table 59, “Distance table for ETC2 ‘T’ and ‘H’ modes.”. For example, if ‘da’ is 10 binary and ‘db’ is 1 binary, the index is 101 binary and the selected distance will be 32. ‘Paint color 1’ is then equal to the second base color with the ‘distance’ added to each channel, and ‘paint color 3’ is the second base color with the ‘distance’ subtracted. In summary, to determine the four ‘paint colors’ for a ‘T’ block:

\begin{align*} \mathit{paint}\:\mathit{color}\:\mathit{0} & = \mathit{base}\:\mathit{col}\:\mathit{1} \\ \mathit{paint}\:\mathit{color}\:\mathit{1} & = \mathit{base}\:\mathit{col}\:\mathit{2} + (d, d, d) \\ \mathit{paint}\:\mathit{color}\:\mathit{2} & = \mathit{base}\:\mathit{col}\:\mathit{2} \\ \mathit{paint}\:\mathit{color}\:\mathit{3} & = \mathit{base}\:\mathit{col}\:\mathit{2} - (d, d, d) \end{align*}

In both cases, the value of each channel is clamped to within [ $0$ , $255$ ].
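The ‘T’ mode paint colors can be sketched as below. The distance values are assumed to be those of Table 59, which lies outside this section; the text above confirms only that index 5 selects 32. The function names are hypothetical, and the alpha handling for ‘paint color 2’ is left out.

```c
#include <stdint.h>

/* Distance table, assumed to match Table 59 of this specification. */
static const int etc2_distance_table[8] = { 3, 6, 11, 16, 23, 32, 41, 64 };

static uint8_t clamp255(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Replicate the four bits twice, which equals multiplying by 17. */
static uint8_t extend_4to8bits(unsigned value4)
{
    return (uint8_t)((value4 & 15u) * 17u);
}

/* base1 and base2 are the already extended 8-bit base colors;
 * da is the 2-bit field and db the 1-bit field of the 'T' layout. */
void etc2_t_paint_colors(const uint8_t base1[3], const uint8_t base2[3],
                         unsigned da, unsigned db, uint8_t paint[4][3])
{
    int d = etc2_distance_table[((da << 1) | db) & 7u];

    for (int i = 0; i < 3; ++i) {
        paint[0][i] = base1[i];
        paint[1][i] = clamp255(base2[i] + d);
        paint[2][i] = base2[i];
        paint[3][i] = clamp255(base2[i] - d);
    }
}
```

With da = 10 binary and db = 1 binary the index is 5 and the distance is 32, as in the example above.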

Just as for the differential mode, the RGB channels are set to zero if alpha is zero, and the alpha component is calculated the same way:

if (opaque == 0 && MSB == 1 && LSB == 0) {
  red = 0;
  green = 0;
  blue = 0;
  alpha = 0;
} else {
  alpha = 255;
}

A ‘distance’ value is computed for the ‘H’ mode as well, but doing so is slightly more complex. In order to construct the three-bit index into the distance table shown in Table 59, “Distance table for ETC2 ‘T’ and ‘H’ modes.”, ‘da’ and ‘db’ shown in Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (d) are used as the most significant bit and middle bit, respectively, but the least significant bit is computed as (base col 1 value $\geq$ base col 2 value), the ‘value’ of a color for the comparison being equal to $(R\ll 16)+(G\ll 8)+B$ . Once the ‘distance’ d has been determined for an ‘H’ block, the four ‘paint colors’ will be:

\begin{align*} \mathit{paint}\:\mathit{color}\:\mathit{0} & = \mathit{base}\:\mathit{col}\:\mathit{1} + (d, d, d) \\ \mathit{paint}\:\mathit{color}\:\mathit{1} & = \mathit{base}\:\mathit{col}\:\mathit{1} - (d, d, d) \\ \mathit{paint}\:\mathit{color}\:\mathit{2} & = \mathit{base}\:\mathit{col}\:\mathit{2} + (d, d, d) \\ \mathit{paint}\:\mathit{color}\:\mathit{3} & = \mathit{base}\:\mathit{col}\:\mathit{2} - (d, d, d) \end{align*}
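A corresponding sketch for the ‘H’ mode is given below, including the construction of the three-bit distance index. As before, the distance values are assumed to be those of Table 59 and the function name is hypothetical. The comparison is performed on the extended 8-bit colors; since extend_4to8bits() is strictly increasing, this gives the same ordering as comparing the 4-bit values.

```c
#include <stdint.h>

/* Distance table, assumed to match Table 59 of this specification. */
static const int etc2_distance_table[8] = { 3, 6, 11, 16, 23, 32, 41, 64 };

static uint8_t clamp255(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* 'Value' of a color for the comparison: (R << 16) + (G << 8) + B. */
static uint32_t color_value(const uint8_t c[3])
{
    return ((uint32_t)c[0] << 16) + ((uint32_t)c[1] << 8) + c[2];
}

/* da and db are the single-bit fields of the 'H' layout. */
void etc2_h_paint_colors(const uint8_t base1[3], const uint8_t base2[3],
                         unsigned da, unsigned db, uint8_t paint[4][3])
{
    unsigned lsb = color_value(base1) >= color_value(base2) ? 1u : 0u;
    int d = etc2_distance_table[((da & 1u) << 2) | ((db & 1u) << 1) | lsb];

    for (int i = 0; i < 3; ++i) {
        paint[0][i] = clamp255(base1[i] + d);
        paint[1][i] = clamp255(base1[i] - d);
        paint[2][i] = clamp255(base2[i] + d);
        paint[3][i] = clamp255(base2[i] - d);
    }
}
```

Swapping the two base colors can change the least significant index bit and therefore the distance, which is how the ‘H’ mode gains an extra bit of information from the ordering of the base colors.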

Yet again, RGB is zeroed if alpha is 0 and the alpha component is determined the same way:

if (opaque == 0 && MSB == 1 && LSB == 0) {
  red = 0;
  green = 0;
  blue = 0;
  alpha = 0;
} else {
  alpha = 255;
}

Hence paint color 2 will have R=G=B=alpha=0 if opaque == 0.

Again, all color components are clamped to within [ $0$ , $255$ ]. Finally, in both the ‘T’ and ‘H’ modes, every pixel is assigned one of the four ‘paint colors’ in the same way the four modifier values are distributed in ‘individual’ or ‘differential’ blocks. For example, to choose a paint color for pixel d, an index is constructed using bit 19 as most significant bit and bit 3 as least significant bit. Then, if a pixel has index 2, for example, it will be assigned paint color 2.

The final mode possible in a block compressed as RGB ETC2 with punchthrough alpha is the ‘planar’ mode. In this mode, the ‘opaque’-bit must be 1 (a valid encoder should not produce an ‘opaque’-bit equal to 0 in the planar mode); should the ‘opaque’-bit nevertheless be 0, the decoder should treat it as if it were 1. In the ‘planar’ mode, three base colors are supplied and used to form a color plane used to determine the color of the individual pixels in the block.

All three base colors are stored in RGB 676 format, and stored in the manner shown in Table 62, “Texel Data format for punchthrough alpha ETC2 compressed texture formats” part (f). The three colors are there labelled ‘O’, ‘H’ and ‘V’, so that the three components of color ‘V’ are RV, GV and BV, for example. Some color channels are split into non-consecutive bit-ranges, for example BO is reconstructed using BO1 as the most significant bit, BO2 as the two following bits, and BO3 as the three least significant bits.

Once the bits for the base colors have been extracted, they must be extended to 8 bits per channel in a manner analogous to the method used for the base colors in other modes. For example, the 6-bit blue and red channels are extended by replicating the two most significant of the six bits to the two least significant of the final 8 bits.

With three base colors in RGB888 format, the color of each pixel can then be determined as:

\begin{align*} R(x,y) & = {x\times (RH-RO) \over 4.0} + {y\times(RV-RO) \over 4.0} + RO \\ G(x,y) & = {x\times (GH-GO) \over 4.0} + {y\times(GV-GO) \over 4.0} + GO \\ B(x,y) & = {x\times (BH-BO) \over 4.0} + {y\times(BV-BO) \over 4.0} + BO \\ A(x,y) & = 255 \end{align*}

where $x$ and $y$ are values from $0$ to $3$ corresponding to the pixel coordinates within the block, $x$ being in the $u$ direction and $y$ in the $v$ direction. For example, the pixel $g$ in Table 53, “Pixel layout for an ETC2 compressed block” would have $x=1$ and $y=2$.

These values are then rounded to the nearest integer (to the larger integer if there is a tie) and then clamped to a value between $0$ and $255$ . Note that this is equivalent to

\begin{align*} R(x,y) & = clamp255((x\times (RH-RO) + y\times (RV-RO) + 4\times RO + 2) \gg 2) \\ G(x,y) & = clamp255((x\times (GH-GO) + y\times (GV-GO) + 4\times GO + 2) \gg 2) \\ B(x,y) & = clamp255((x\times (BH-BO) + y\times (BV-BO) + 4\times BO + 2) \gg 2) \\ A(x,y) & = 255 \end{align*}

where $clamp255$ clamps the value to a number in the range [ $0$ , $255$ ].
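The integer form of the planar equations can be sketched as follows. The function names are hypothetical, and the 7-bit extension for green is an assumption made by analogy with the 6-bit case described above. Negative sums are clamped before the shift so that the code does not depend on how the compiler shifts negative values.

```c
#include <stdint.h>

/* Extend a 6-bit value to 8 bits by replicating its two most significant
 * bits into the two least significant bits of the result. */
uint8_t extend_6to8bits(unsigned value6)
{
    value6 &= 63u;
    return (uint8_t)((value6 << 2) | (value6 >> 4));
}

/* Assumed analogous extension for the 7-bit green channel. */
uint8_t extend_7to8bits(unsigned value7)
{
    value7 &= 127u;
    return (uint8_t)((value7 << 1) | (value7 >> 6));
}

/* One channel of the planar equation for pixel (x, y), with x, y in 0..3.
 * o, h and v are the 8-bit channel values of the base colors O, H and V. */
uint8_t etc2_planar_channel(int o, int h, int v, int x, int y)
{
    int sum = x * (h - o) + y * (v - o) + 4 * o + 2;

    if (sum < 0)
        return 0;   /* any negative sum clamps to 0 after the shift */
    sum >>= 2;
    return (uint8_t)(sum > 255 ? 255 : sum);
}
```

The term + 2 before the shift implements rounding to the nearest integer with ties rounded up, so a value of 127.5 becomes 128.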

Note that the alpha component is always $255$ in the planar mode.

This specification gives the output for each compression mode in 8-bit integer colors between $0$ and $255$ , and these values all need to be divided by $255$ for the final floating point representation.

14.10. Format RGB ETC2 with punchthrough alpha and sRGB encoding

Decompression of floating point sRGB values in RGB ETC2 with sRGB encoding and punchthrough alpha follows that of floating point RGB values of RGB ETC2 with punchthrough alpha. The result is sRGB values between $0.0$ and $1.0$ . The further conversion from an sRGB encoded component, $cs$ , to a linear component, $cl$ , is according to the formula in Section 18.3, “sRGB transfer functions”. Assume $cs$ is the sRGB component in the range [ $0$ , $1$ ]. Note that the alpha component is not gamma corrected, and hence does not use the above formula.
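For orientation only, the conversion referred to above is assumed here to be the standard sRGB electro-optical transfer function; Section 18.3 remains the authoritative definition. Under that assumption, the linear component $cl$ is obtained from the encoded component $cs$ as:

```latex
\begin{align*}
cl & =
\begin{cases}
  \dfrac{cs}{12.92}, & cs \leq 0.04045 \\[2ex]
  \left( \dfrac{cs + 0.055}{1.055} \right)^{2.4}, & cs > 0.04045
\end{cases}
\end{align*}
```

The conversion is applied to the red, green and blue components only; alpha is passed through unchanged.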

15. ASTC Compressed Texture Image Formats

This description is derived from the Khronos OES_texture_compression_astc OpenGL extension.

15.1. What is ASTC?

ASTC stands for Adaptive Scalable Texture Compression. The ASTC formats form a family of related compressed texture image formats. They are all derived from a common set of definitions.

ASTC textures may be either 2D or 3D.

ASTC textures may be encoded using either high or low dynamic range. Low dynamic range images may optionally be specified using the sRGB transfer function for the RGB channels.

Two sub-profiles (“LDR Profile” and “HDR Profile”) may be implemented, which support only 2D images at low or high dynamic range respectively.

ASTC textures may be encoded as 1, 2, 3 or 4 components, but they are all decoded into RGBA. ASTC has a variable block size.

15.2. Design Goals

The design goals for the format are as follows:

  • Random access. This is a must for any texture compression format.
  • Bit exact decode. This is a must for conformance testing and reproducibility.
  • Suitable for mobile use. The format should be suitable for both desktop and mobile GPU environments. It should be low bandwidth and low in area.
  • Flexible choice of bit rate. Current formats only offer a few bit rates, leaving content developers with only coarse control over the size/quality tradeoff.
  • Scalable and long-lived. The format should support existing R, RG, RGB and RGBA image types, and also have high “headroom”, allowing continuing use for several years and the ability to innovate in encoders. Part of this is the choice to include HDR and 3D.
  • Feature orthogonality. The choices for the various features of the format are all orthogonal to each other. This has three effects: first, it allows a large, flexible configuration space; second, it makes that space easier to understand; and third, it makes verification easier.
  • Best in class at given bit rate. It should beat or match the current best in class for peak signal-to-noise ratio (PSNR) at all bit rates.
  • Fast decode. Texel throughput for a cached texture should be one texel decode per clock cycle per decoder. Parallel decoding of several texels from the same block should be possible at incremental cost.
  • Low bandwidth. The encoding scheme should ensure that memory access is kept to a minimum, cache reuse is high and memory bandwidth for the format is low.
  • Low area. It must occupy comparable die size to competing formats.

15.3. Basic Concepts

ASTC is a block-based lossy compression format. The compressed image is divided into a number of blocks of uniform size, which makes it possible to quickly determine which block a given texel resides in.

Each block has a fixed memory footprint of 128 bits, but these bits can represent varying numbers of texels (the block “footprint”).

[Note]

The term “block footprint” in ASTC refers to the same concept as “compressed texel block dimensions” elsewhere in the Data Format Specification.

Block footprint sizes are not confined to powers-of-two, and are also not confined to be square. They may be 2D, in which case the block dimensions range from 4 to 12 texels, or 3D, in which case the block dimensions range from 3 to 6 texels.

Decoding one texel requires only the data from a single block. This simplifies cache design, reduces bandwidth and improves encoder throughput.

15.4. Block Encoding

To understand how the blocks are stored and decoded, it is useful to start with a simple example, and then introduce additional features.

The simplest block encoding starts by defining two color “endpoints”. The endpoints define two colors, and a number of additional colors are generated by interpolating between them. We can define these colors using 1, 2, 3, or 4 components (usually corresponding to R, RG, RGB and RGBA textures), and using low or high dynamic range.

We then store a color interpolant weight for each texel in the image, which specifies how to calculate the color to use. From this, a weighted average of the two endpoint colors is used to generate the intermediate color, which is the returned color for this texel.

There are several different ways of specifying the endpoint colors, and the weights, but once they have been defined, calculation of the texel colors proceeds identically for all of them. Each block is free to choose whichever encoding scheme best represents its color endpoints, within the constraint that all the data fits within the 128 bit block.

For blocks which have a large number of texels (e.g. a $12\times 12$ block), there is not enough space to explicitly store a weight for every texel. In this case, a sparser grid with fewer weights is stored, and interpolation is used to determine the effective weight to be used for each texel position. This allows very low bit rates to be used with acceptable quality. This can also be used to more efficiently encode blocks with low detail, or with strong vertical or horizontal features.

For blocks which have a mixture of disparate colors, a single line in the color space is not a good fit to the colors of the pixels in the original image. It is therefore possible to partition the texels into multiple sets, the pixels within each set having similar colors. For each of these “partitions”, we specify separate endpoint pairs, and choose which pair of endpoints to use for a particular texel by looking up the partition index from a partitioning pattern table. In ASTC, this partition table is actually implemented as a function.

The endpoint encoding for each partition is independent.

For blocks which have uncorrelated channels — for example an image with a transparency mask, or an image used as a normal map — it may be necessary to specify two weights for each texel. Interpolation between the components of the endpoint colors can then proceed independently for each “plane” of the image. The assignment of channels to planes is selectable.

Since each of the above options is independent, it is possible to specify any combination of channels, endpoint color encoding, weight encoding, interpolation, multiple partitions and single or dual planes.

Since these values are specified per block, it is important that they are represented with the minimum possible number of bits. As a result, these values are packed together in ways which can be difficult to read, but which are nevertheless highly amenable to hardware decode.

All of the values used as weights and color endpoint values can be specified with a variable number of bits. The encoding scheme used allows a fine-grained tradeoff between weight bits and color endpoint bits using “integer sequence encoding”. This can pack adjacent values together, allowing us to use fractional numbers of bits per value.

Finally, a block may be just a single color. This is a so-called “void extent block” and has a special coding which also allows it to identify nearby regions of single color. This may be used to short-circuit fetching of what would be identical blocks, and further reduce memory bandwidth.

15.5. LDR and HDR Modes

The decoding process for LDR content can be simplified if it is known in advance that sRGB output is required. This selection is therefore included as part of the global configuration.

The two modes differ in various ways, as shown in Table 67, “ASTC differences between LDR and HDR modes”.

Table 67. ASTC differences between LDR and HDR modes

Operation                          LDR Mode                                                         HDR Mode
Returned Value                     Vector of FP16, or vector of 8-bit unsigned normalized values    Vector of FP16 values
sRGB compatible                    Yes                                                              No
LDR endpoint decoding precision    16 bits, or 8 bits for sRGB                                      16 bits
HDR endpoint mode results          Error color                                                      As decoded
Error results                      Error color                                                      Vector of NaNs (0xFFFF)


The error color is opaque fully-saturated magenta $(R,G,B,A) = (0xFF,0x00,0xFF,0xFF)$ . This has been chosen as it is much more noticeable than black or white, and occurs far less often in valid images.

For linear RGB decode, the error color may be either opaque fully-saturated magenta $(R,G,B,A) = (1.0,0.0,1.0,1.0)$ or a vector of four $NaN$ s $(R,G,B,A) = (NaN,NaN,NaN,NaN)$ . In the latter case, the recommended $NaN$ value returned is $0xFFFF$ .

The error color is returned as an informative response to invalid conditions, including invalid block encodings or use of reserved endpoint modes.

Future, forward-compatible extensions to ASTC may define valid interpretations of these conditions, which will decode to some other color. Therefore, encoders and applications must not rely on invalid encodings as a way of generating the error color.

15.6. Configuration Summary

The global configuration data for the format are as follows:

  • Block dimension (2D or 3D)
  • Block footprint size
  • sRGB output enabled or not

The data specified per block are as follows:

  • Texel weight grid size
  • Texel weight range
  • Texel weight values
  • Number of partitions
  • Partition pattern index
  • Color endpoint modes (includes LDR or HDR selection)
  • Color endpoint data
  • Number of planes
  • Plane-to-channel assignment

15.7. Decode Procedure

To decode one texel:

(Optimization: If within known void-extent, immediately return single color)

Find block containing texel
Read block mode
If void-extent block, store void extent and immediately return single color

For each plane in image
  If block mode requires infill
    Find and decode stored weights adjacent to texel, unquantize and interpolate
  Else
    Find and decode weight for texel, and unquantize

Read number of partitions
If number of partitions > 1
  Read partition table pattern index
  Look up partition number from pattern

Read color endpoint mode and endpoint data for selected partition
Unquantize color endpoints
Interpolate color endpoints using weight (or weights in dual-plane mode)
Return interpolated color

15.8. Block Determination and Bit Rates

The block footprint is a global setting for any given texture, and is therefore not encoded in the individual blocks.

For 2D textures, the block footprint’s width and height are selectable from a number of predefined sizes, namely 4, 5, 6, 8, 10 and 12 pixels.

For square and nearly-square blocks, this gives the bit rates in Table 68, “ASTC 2D footprint and bit rates”.

Table 68. ASTC 2D footprint and bit rates

Footprint          Bit Rate   Increment
Width   Height
4       4          8.00       125%
5       4          6.40       125%
5       5          5.12       120%
6       5          4.27       120%
6       6          3.56       111%
8       5          3.20       120%
8       6          2.67       104%
10      5          2.56       120%
10      6          2.13       107%
8       8          2.00       125%
10      8          1.60       125%
10      10         1.28       120%
12      10         1.07       120%
12      12         0.89


The “Increment” column indicates the ratio of bit rate against the next lower available rate. A consistent value in this column indicates an even spread of bit rates.

For 3D textures, the block footprint’s width, height and depth are selectable from a number of predefined sizes, namely 3, 4, 5, and 6 pixels.

For cubic and near-cubic blocks, this gives the bit rates in Table 69, “ASTC 3D footprint and bit rates”.

Table 69. ASTC 3D footprint and bit rates

Block Footprint             Bit Rate   Increment
Width   Height   Depth
3       3        3          4.74       133%
4       3        3          3.56       133%
4       4        3          2.67       133%
4       4        4          2.00       125%
5       4        4          1.60       125%
5       5        4          1.28       125%
5       5        5          1.02       120%
6       5        5          0.85       120%
6       6        5          0.71       120%
6       6        6          0.59


The full profile supports only those block footprints listed in Table 68, “ASTC 2D footprint and bit rates” and Table 69, “ASTC 3D footprint and bit rates”. Other block sizes are not supported.
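The bit rates in both tables follow directly from the fixed 128-bit block size. A minimal sketch, with a hypothetical function name, is:

```c
/* Bits per texel for a block footprint of width x height x depth texels.
 * Pass depth = 1 for a 2D footprint. Every block occupies 128 bits. */
double astc_bit_rate(unsigned width, unsigned height, unsigned depth)
{
    return 128.0 / (double)(width * height * depth);
}
```

For example, a $6\times 6$ footprint gives 128 / 36, which is approximately 3.56 bits per texel, and a $3\times 3\times 3$ footprint gives 128 / 27, approximately 4.74. The “Increment” column is the ratio of one row's bit rate to that of the following row.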

For images which are not an integer multiple of the block size, additional texels are added to the edges with maximum X and Y (and Z for 3D textures). These texels may be any color, as they will not be accessed.

Although these are not all powers of two, it is possible to calculate block addresses and pixel addresses within the block, for legal image sizes, without undue complexity.

Given an image which is $W \times H \times D$ pixels in size, with block size $w \times h \times d$ , the size of the image in blocks is:

\begin{align*} B_w & = \left\lceil { W \over w } \right\rceil \\ B_h & = \left\lceil { H \over h } \right\rceil \\ B_d & = \left\lceil { D \over d } \right\rceil \end{align*}

For a 3D image built from 2D slices, each 2D slice is a single texel thick, so that for an image which is $W \times H \times D$ pixels in size, with block size $w \times h$ , the size of the image in blocks is:

\begin{align*} B_w & = \left\lceil { W \over w } \right\rceil \\ B_h & = \left\lceil { H \over h } \right\rceil \\ B_d & = D \end{align*}
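The ceiling divisions above can be computed in integer arithmetic. The following sketch uses a hypothetical function name and assumes a non-zero block dimension:

```c
/* Number of blocks needed to cover image_size texels along one axis,
 * i.e. the ceiling of image_size / block_size. block_size must be > 0. */
unsigned astc_blocks_along(unsigned image_size, unsigned block_size)
{
    return (image_size + block_size - 1) / block_size;
}
```

For a 3D image built from 2D slices, the same function applies to the width and height, while the block count in depth is simply D.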

15.9. Block Layout

Each block in the image is stored as a single 128-bit block in memory. These blocks are laid out in raster order, starting with the block at (0,0,0), then ordered sequentially by X, Y and finally Z (if present). They are aligned to 128-bit boundaries in memory.

The bits in the block are labeled in little-endian order — the byte at the lowest address contains bits 0..7. Bit 0 is the least significant bit in the byte.
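As an illustration of the layout rules above, the following sketch computes the byte offset of a block in a tightly packed image and reads a single numbered bit from a block. Both function names are hypothetical, and the image is assumed to be stored without padding between rows or slices.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of block (bx, by, bz) in raster order: X varies fastest,
 * then Y, then Z. Each block occupies 16 bytes (128 bits).
 * blocks_w and blocks_h are the image width and height in blocks. */
size_t astc_block_offset(size_t bx, size_t by, size_t bz,
                         size_t blocks_w, size_t blocks_h)
{
    return 16 * (bx + blocks_w * (by + blocks_h * bz));
}

/* Value of bit n (0..127) of a block. The byte at the lowest address holds
 * bits 0..7, and bit 0 is the least significant bit of that byte. */
unsigned astc_block_bit(const uint8_t block[16], unsigned n)
{
    return (block[n / 8] >> (n % 8)) & 1u;
}
```

Reading bits through `astc_block_bit` keeps the code independent of host endianness, at the cost of speed; a production decoder would more likely extract whole fields at once.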

Each block has the same basic layout, shown in Table 70, “ASTC block layout”.

Table 70. ASTC block layout

Bits        Contents
127..112    Texel weight data (variable width); fill direction $\rightarrow$
111..96     Texel weight data
95..80      Texel weight data
79..64      Texel weight data
63..48      More config data
47..32      $\leftarrow$ Fill direction; color endpoint data
31..16      Extra configuration data
15..13      Extra
12..11      Part
10..0       Block mode


Since the size of the “texel weight data” field is variable, the positions shown for the “more config data” field and “color endpoint data” field are only representative and not fixed.

The “Block mode” field specifies how the Texel Weight Data is encoded.

The “Part” field specifies the number of partitions, minus one. If dual plane mode is enabled, the number of partitions must be 3 or fewer. If 4 partitions are specified in dual plane mode, the error value is returned for all texels in the block.
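The two fixed fields at the bottom of the block can be extracted as sketched below. The sketch assumes that “Block mode” occupies bits 10..0 and “Part” occupies bits 12..11 of the block; these positions and the function names are assumptions of this example and should be checked against Table 70.

```c
#include <stdint.h>

/* "Block mode" field, assumed to occupy bits 10..0 of the block. */
unsigned astc_block_mode(const uint8_t block[16])
{
    unsigned low16 = (unsigned)block[0] | ((unsigned)block[1] << 8);

    return low16 & 0x7FFu;
}

/* Number of partitions: the "Part" field (assumed bits 12..11) plus one. */
unsigned astc_partition_count(const uint8_t block[16])
{
    unsigned low16 = (unsigned)block[0] | ((unsigned)block[1] << 8);

    return ((low16 >> 11) & 3u) + 1u;
}
```

Detecting the error case of 4 partitions combined with dual plane mode additionally requires decoding the block mode, which is not shown here.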

The size and layout of the extra configuration data depends on the number of partitions, and the number of planes in the image, as shown in Table 71, “ASTC single-partition block layout” (only the bottom 32 bits are shown).