Description

The following table lists the supported functions for reading and writing vector types through a pointer to memory. The generic type gentype stands for the built-in data types char, uchar, short, ushort, int, uint, long, ulong, float, or double. The generic type name gentypen represents n-element vectors of gentype elements, and the type name halfn represents n-element vectors of half elements [37]. The suffix n also appears in the function names (e.g. vloadn, vstoren), where n = 2, 3, 4, 8, or 16.

[37] The halfn type is only defined by the cl_khr_fp16 extension described in section 9.4 of the OpenCL 2.0 Extension Specification.

Table 1. Vector Data Load and Store Functions [38]

Function

Description

gentypen vloadn(size_t offset, const gentype *p)
gentypen vloadn(size_t offset, const constant gentype *p)

Return sizeof(gentypen) bytes of data, where the first (n * sizeof(gentype)) bytes are read from the address computed as (p + (offset * n)). The computed address must be 8-bit aligned if gentype is char or uchar; 16-bit aligned if gentype is short or ushort; 32-bit aligned if gentype is int, uint, or float; and 64-bit aligned if gentype is long, ulong, or double.

void vstoren(gentypen data, size_t offset, gentype *p)

Write n * sizeof(gentype) bytes given by data to the address computed as (p + (offset * n)). The computed address must be 8-bit aligned if gentype is char or uchar; 16-bit aligned if gentype is short or ushort; 32-bit aligned if gentype is int, uint, or float; and 64-bit aligned if gentype is long, ulong, or double.

float vload_half(size_t offset, const half *p)
float vload_half(size_t offset, const constant half *p)

Read sizeof(half) bytes of data from the address computed as (p + offset). The data read is interpreted as a half value. The half value is converted to a float value and the float value is returned. The computed read address must be 16-bit aligned.

floatn vload_halfn(size_t offset, const half *p)
floatn vload_halfn(size_t offset, const constant half *p)

Read (n * sizeof(half)) bytes of data from the address computed as (p + (offset * n)). The data read is interpreted as a halfn value. The halfn value read is converted to a floatn value and the floatn value is returned. The computed read address must be 16-bit aligned.

void vstore_half(float data, size_t offset, half *p)
void vstore_half_rte(float data, size_t offset, half *p)
void vstore_half_rtz(float data, size_t offset, half *p)
void vstore_half_rtp(float data, size_t offset, half *p)
void vstore_half_rtn(float data, size_t offset, half *p)

The float value given by data is first converted to a half value using the appropriate rounding mode. The half value is then written to the address computed as (p + offset). The computed address must be 16-bit aligned.

vstore_half uses the default rounding mode. The default rounding mode is round to nearest even.

void vstore_halfn(floatn data, size_t offset, half *p)
void vstore_halfn_rte(floatn data, size_t offset, half *p)
void vstore_halfn_rtz(floatn data, size_t offset, half *p)
void vstore_halfn_rtp(floatn data, size_t offset, half *p)
void vstore_halfn_rtn(floatn data, size_t offset, half *p)

The floatn value given by data is converted to a halfn value using the appropriate rounding mode. n * sizeof(half) bytes from the halfn value are then written to the address computed as (p + (offset * n)). The computed address must be 16-bit aligned.

vstore_halfn uses the default rounding mode. The default rounding mode is round to nearest even.

void vstore_half(double data, size_t offset, half *p)
void vstore_half_rte(double data, size_t offset, half *p)
void vstore_half_rtz(double data, size_t offset, half *p)
void vstore_half_rtp(double data, size_t offset, half *p)
void vstore_half_rtn(double data, size_t offset, half *p)

The double value given by data is first converted to a half value using the appropriate rounding mode. The half value is then written to the address computed as (p + offset). The computed address must be 16-bit aligned.

vstore_half uses the default rounding mode. The default rounding mode is round to nearest even.

void vstore_halfn(doublen data, size_t offset, half *p)
void vstore_halfn_rte(doublen data, size_t offset, half *p)
void vstore_halfn_rtz(doublen data, size_t offset, half *p)
void vstore_halfn_rtp(doublen data, size_t offset, half *p)
void vstore_halfn_rtn(doublen data, size_t offset, half *p)

The doublen value given by data is converted to a halfn value using the appropriate rounding mode. n * sizeof(half) bytes from the halfn value are then written to the address computed as (p + (offset * n)). The computed address must be 16-bit aligned.

vstore_halfn uses the default rounding mode. The default rounding mode is round to nearest even.

floatn vloada_halfn(size_t offset, const half *p)
floatn vloada_halfn(size_t offset, const constant half *p)

For n = 2, 4, 8 and 16, read sizeof(halfn) bytes of data from the address computed as (p + (offset * n)). The data read is interpreted as a halfn value. The halfn value read is converted to a floatn value and the floatn value is returned. The computed address must be aligned to sizeof(halfn) bytes.

For n = 3, vloada_half3 reads a half3 from the address computed as (p + (offset * 4)) and returns a float3. The computed address must be aligned to sizeof(half) * 4 bytes.

void vstorea_halfn(floatn data, size_t offset, half *p)
void vstorea_halfn_rte(floatn data, size_t offset, half *p)
void vstorea_halfn_rtz(floatn data, size_t offset, half *p)
void vstorea_halfn_rtp(floatn data, size_t offset, half *p)
void vstorea_halfn_rtn(floatn data, size_t offset, half *p)

The floatn value given by data is converted to a halfn value using the appropriate rounding mode.

For n = 2, 4, 8 and 16, the halfn value is written to the address computed as (p + (offset * n)). The computed address must be aligned to sizeof(halfn) bytes.

For n = 3, the half3 value is written to the address computed as (p + (offset * 4)). The computed address must be aligned to sizeof(half) * 4 bytes.

vstorea_halfn uses the default rounding mode. The default rounding mode is round to nearest even.

void vstorea_halfn(doublen data, size_t offset, half *p)
void vstorea_halfn_rte(doublen data, size_t offset, half *p)
void vstorea_halfn_rtz(doublen data, size_t offset, half *p)
void vstorea_halfn_rtp(doublen data, size_t offset, half *p)
void vstorea_halfn_rtn(doublen data, size_t offset, half *p)

The doublen value given by data is converted to a halfn value using the appropriate rounding mode.

For n = 2, 4, 8 or 16, the halfn value is written to the address computed as (p + (offset * n)). The computed address must be aligned to sizeof(halfn) bytes.

For n = 3, the half3 value is written to the address computed as (p + (offset * 4)). The computed address must be aligned to sizeof(half) * 4 bytes.

vstorea_halfn uses the default rounding mode. The default rounding mode is round to nearest even.
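The rounding-mode suffixes are easiest to compare on a value that lies exactly halfway between two half values. The following plain C sketch (host-side C, not OpenCL C) models only the float-to-half conversion step, working on raw bit patterns. The names float_to_half_bits and enum half_round are hypothetical and do not come from the specification. The sketch assumes IEEE 754 binary32 floats and handles only finite inputs with 2^-14 <= |x| < 2^16; zeros, subnormal results, infinities, and NaNs are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names mirroring the _rte, _rtz, _rtp and _rtn suffixes. */
enum half_round { HALF_RTE, HALF_RTZ, HALF_RTP, HALF_RTN };

/*
 * Model of the float-to-half conversion performed before the store.
 * Assumes IEEE 754 binary32 floats. Only finite inputs with
 * 2^-14 <= |x| < 2^16 are handled; everything else trips the assert.
 */
static uint16_t float_to_half_bits(float x, enum half_round mode)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);      /* reinterpret the float as bits */

    uint32_t negative = bits >> 31;
    uint32_t mag = bits & 0x7FFFFFFFu;   /* exponent and mantissa */
    uint32_t exp = mag >> 23;            /* biased float exponent */
    assert(exp >= 113 && exp <= 142);    /* 2^-14 <= |x| < 2^16 */

    uint32_t kept = mag >> 13;           /* exponent + top 10 mantissa bits */
    uint32_t rest = mag & 0x1FFFu;       /* the 13 discarded mantissa bits */

    int round_up = 0;                    /* 1 means: increase the magnitude */
    switch (mode) {
    case HALF_RTE:                       /* nearest, ties to even */
        round_up = rest > 0x1000u || (rest == 0x1000u && (kept & 1u));
        break;
    case HALF_RTZ:                       /* toward zero: always truncate */
        round_up = 0;
        break;
    case HALF_RTP:                       /* toward +infinity */
        round_up = rest != 0 && !negative;
        break;
    case HALF_RTN:                       /* toward -infinity */
        round_up = rest != 0 && negative;
        break;
    }

    /* Rebias the exponent from 127 to 15. A mantissa carry moves into the
       exponent, so rounding past 65504 yields the infinity pattern 0x7C00. */
    uint32_t half_mag = kept + (uint32_t)round_up - (112u << 10);
    return (uint16_t)((negative << 15) | half_mag);
}
```

For the input 1.00048828125f (1 + 2^-11, halfway between the half values 1.0 and 1 + 2^-10), the _rte, _rtz and _rtn models all return 0x3C00 (1.0), while the _rtp model returns 0x3C01. For the negated input, the _rtn model is the one that moves away from zero and returns 0xBC01.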

[38] vload3 and vload_half3 read (x,y,z) components from address (p + (offset * 3)) into a 3-component vector. vstore3 and vstore_half3 write (x,y,z) components from a 3-component vector to address (p + (offset * 3)). In addition, vloada_half3 reads (x,y,z) components from address (p + (offset * 4)) into a 3-component vector and vstorea_half3 writes (x,y,z) components from a 3-component vector to address (p + (offset * 4)). Whether vloada_half3 and vstorea_half3 read/write padding data between the third vector element and the next alignment boundary is implementation defined. vloada_ and vstorea_ variants are provided to access data that is aligned to the size of the vector, and are intended to enable performance on hardware that can take advantage of the increased alignment.
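The packed and the aligned 3-component variants differ only in their element stride. The plain C sketch below is a host-side model, not OpenCL C, and the names packed_index, aligned_index, float3_model and load3_packed are hypothetical. It computes the element index at which each variant starts and models a vload3-style read from a float array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: first element touched by vloadn, vstoren,
   vload_halfn or vstore_halfn, i.e. the index in (p + (offset * n)). */
static size_t packed_index(size_t offset, size_t n)
{
    assert(n == 2 || n == 3 || n == 4 || n == 8 || n == 16);
    return offset * n;
}

/* Hypothetical helper: first element touched by vloada_halfn or
   vstorea_halfn. Three-component vectors use a stride of 4 elements. */
static size_t aligned_index(size_t offset, size_t n)
{
    assert(n == 2 || n == 3 || n == 4 || n == 8 || n == 16);
    return offset * (n == 3 ? 4 : n);
}

/* Stand-in for a 3-component vector; not the OpenCL float3 type. */
struct float3_model { float x, y, z; };

/* Model of a vload3-style read: copies x, y, z from p + (offset * 3). */
static struct float3_model load3_packed(size_t offset, const float *p)
{
    const float *src = p + packed_index(offset, 3);
    struct float3_model v = { src[0], src[1], src[2] };
    return v;
}
```

With offset = 2, the packed 3-component variants start at element 6, whereas vloada_half3 and vstorea_half3 start at element 8, the same index a 4-component access would use.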

The results of the vector data load and store functions are undefined if the address being read from or written to is not correctly aligned as described in the Vector Data Load and Store Functions table (https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_C.html#table-vector-loadstore). For the store functions in that table, the pointer argument p can point to global, local, or private memory. For the load functions, p can point to global, local, constant, or private memory.
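Because misaligned accesses are undefined, it can help to write down the alignment that the aligned half variants expect. The plain C sketch below is a host-side model, not OpenCL C, and all names in it are hypothetical. It assumes a half occupies 2 bytes and checks a byte address against the requirement stated for vloada_halfn and vstorea_halfn: sizeof(halfn) bytes, with n = 3 treated as 4 elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HALF_SIZE 2u  /* sizeof(half) in bytes; an assumption of this model */

/* Required alignment in bytes for vloada_halfn and vstorea_halfn. */
static size_t aligned_half_alignment(size_t n)
{
    assert(n == 2 || n == 3 || n == 4 || n == 8 || n == 16);
    return HALF_SIZE * (n == 3 ? 4 : n);
}

/* Returns 1 if address is a multiple of alignment (a power of two). */
static int is_aligned(uintptr_t address, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (address & (alignment - 1)) == 0;
}

/* Byte address that a vloada_halfn-style access would use for a given
   base address: the element stride in bytes equals the alignment. */
static uintptr_t aligned_half_address(uintptr_t base, size_t offset, size_t n)
{
    return base + offset * aligned_half_alignment(n);
}
```

In this model the stride in bytes equals the required alignment, so if the base address is correctly aligned, every address produced by aligned_half_address is as well. The vloadn, vstoren, vload_half and vstore_half families only require alignment to the element size, which is_aligned can check in the same way.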

Variants of the vector data load and store functions that take pointer arguments pointing to the generic address space are also supported.


Document Notes

For more information, see the OpenCL C Specification

This page is extracted from the OpenCL C Specification. Fixes and changes should be made to the Specification, not directly.

Copyright (c) 2014-2020 Khronos Group. This work is licensed under a Creative Commons Attribution 4.0 International License.