Description
The following table describes the built-in integer functions that take scalar or vector arguments. The vector versions of the integer functions operate component-wise. The description is per-component.
We use the generic type name gentype to indicate that the function can take char, char{2|3|4|8|16}, uchar, uchar{2|3|4|8|16}, short, short{2|3|4|8|16}, ushort, ushort{2|3|4|8|16}, int, int{2|3|4|8|16}, uint, uint{2|3|4|8|16}, long, long{2|3|4|8|16}, ulong, or ulong{2|3|4|8|16} as the type for the arguments.
We use the generic type name ugentype to refer to unsigned versions of gentype. For example, if gentype is char4, ugentype is uchar4.
We also use the generic type name sgentype to indicate that the function can take a scalar data type, i.e. char, uchar, short, ushort, int, uint, long, or ulong, as the type for the arguments.
For built-in integer functions that take gentype and sgentype arguments, the gentype argument must be a vector or scalar version of the sgentype argument. For example, if sgentype is uchar, gentype must be uchar or uchar{2|3|4|8|16}. For vector versions, sgentype is implicitly widened to gentype as described for arithmetic operators.
For any specific use of a function, the actual type has to be the same for all arguments and the return type unless otherwise specified.
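The widening rule can be illustrated with a small host-side sketch in plain C. This is an emulation, not OpenCL C: `int4_t`, `clamp_int` and `clamp_int4_scalar` are hypothetical names introduced here, and the example assumes the form `clamp(gentype x, sgentype minval, sgentype maxval)` that the OpenCL C Specification declares alongside the all-gentype form listed in the table below. The scalar bounds are replicated to every component and the operation is applied per component.

```c
#include <assert.h>

/* Hypothetical stand-in for OpenCL's int4; plain C has no built-in vector types. */
typedef struct { int s[4]; } int4_t;

/* Scalar clamp following the table's definition: min(max(x, minval), maxval). */
static int clamp_int(int x, int minval, int maxval) {
    /* Results are undefined if minval > maxval, so this sketch rejects that case. */
    assert(minval <= maxval);
    int lo = (x < minval) ? minval : x;   /* max(x, minval) */
    return (maxval < lo) ? maxval : lo;   /* min(lo, maxval) */
}

/* Emulates a call with gentype = int4 and sgentype = int:
 * the scalar bounds are widened (replicated) to all four components,
 * then the clamp is evaluated component by component. */
static int4_t clamp_int4_scalar(int4_t x, int minval, int maxval) {
    int4_t r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = clamp_int(x.s[i], minval, maxval);
    return r;
}
```

In an actual kernel the equivalent call would simply be `clamp(v, 0, 255)` with `v` of type `int4`; the loop above only spells out what "component-wise" and "implicitly widened" mean.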
| Function | Description |
| --- | --- |
| ugentype abs(gentype x) | Returns \|x\|. |
| ugentype abs_diff(gentype x, gentype y) | Returns \|x - y\| without modulo overflow. |
| gentype add_sat(gentype x, gentype y) | Returns x + y and saturates the result. |
| gentype hadd(gentype x, gentype y) | Returns (x + y) >> 1. The intermediate sum does not modulo overflow. |
| gentype rhadd(gentype x, gentype y) [32] | Returns (x + y + 1) >> 1. The intermediate sum does not modulo overflow. |
| gentype clamp(gentype x, gentype minval, gentype maxval) | Returns min(max(x, minval), maxval). Results are undefined if minval > maxval. |
| gentype clz(gentype x) | Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, returns the size in bits of the type of x or component type of x, if x is a vector. |
| gentype ctz(gentype x) | Returns the count of trailing 0-bits in x. If x is 0, returns the size in bits of the type of x or component type of x, if x is a vector. |
| gentype mad_hi(gentype a, gentype b, gentype c) | Returns mul_hi(a, b) + c. |
| gentype mad_sat(gentype a, gentype b, gentype c) | Returns a * b + c and saturates the result. |
| gentype max(gentype x, gentype y) | Returns y if x < y, otherwise it returns x. |
| gentype min(gentype x, gentype y) | Returns y if y < x, otherwise it returns x. |
| gentype mul_hi(gentype x, gentype y) | Computes x * y and returns the high half of the product of x and y. |
| gentype rotate(gentype v, gentype i) | For each element in v, the bits are shifted left by the number of bits given by the corresponding element in i (subject to the usual shift modulo rules). Bits shifted off the left side of the element are shifted back in from the right. |
| gentype sub_sat(gentype x, gentype y) | Returns x - y and saturates the result. |
| short upsample(char hi, uchar lo) | result[i] = ((short)hi[i] << 8) \| lo[i] |
| int upsample(short hi, ushort lo) | result[i] = ((int)hi[i] << 16) \| lo[i] |
| long upsample(int hi, uint lo) | result[i] = ((long)hi[i] << 32) \| lo[i] |
| gentype popcount(gentype x) | Returns the number of non-zero bits in x. |
[32] Frequently vector operations need n + 1 bits temporarily to calculate a result. The rhadd instruction gives you an extra bit without needing to upsample and downsample. This can be a profound performance win.
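To make a few of the per-component definitions above concrete, the following is a minimal host-side sketch in plain C for single fixed-width scalars. It is an emulation under stated assumptions, not the OpenCL implementation: every function name (`add_sat_i8`, `hadd_u32`, `rhadd_u32`, `mul_hi_u32`, `clz_u32`, `rotate_u32`, `upsample_i8`, `integer_helpers_self_check`) is hypothetical, and the extra bit that hadd and rhadd need is obtained here by widening to 64 bits, which is exactly the step the built-ins let a kernel avoid.

```c
#include <assert.h>
#include <stdint.h>

/* add_sat for char: compute in a wider type, then saturate to [-128, 127]. */
static int8_t add_sat_i8(int8_t x, int8_t y) {
    int sum = (int)x + (int)y;
    if (sum > INT8_MAX) return INT8_MAX;
    if (sum < INT8_MIN) return INT8_MIN;
    return (int8_t)sum;
}

/* hadd for uint: (x + y) >> 1, with a 64-bit intermediate so the sum cannot wrap. */
static uint32_t hadd_u32(uint32_t x, uint32_t y) {
    return (uint32_t)(((uint64_t)x + (uint64_t)y) >> 1);
}

/* rhadd for uint: (x + y + 1) >> 1, i.e. the average rounded up. */
static uint32_t rhadd_u32(uint32_t x, uint32_t y) {
    return (uint32_t)(((uint64_t)x + (uint64_t)y + 1u) >> 1);
}

/* mul_hi for uint: high 32 bits of the full 64-bit product. */
static uint32_t mul_hi_u32(uint32_t x, uint32_t y) {
    return (uint32_t)(((uint64_t)x * (uint64_t)y) >> 32);
}

/* clz for uint: count leading 0-bits from the most significant bit; 32 when x is 0. */
static uint32_t clz_u32(uint32_t x) {
    uint32_t n = 0;
    for (uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
        ++n;
    return n;
}

/* rotate for uint: left rotate, with the shift count taken modulo the bit width. */
static uint32_t rotate_u32(uint32_t v, uint32_t i) {
    uint32_t n = i & 31u;
    return (uint32_t)((v << n) | (v >> ((32u - n) & 31u)));
}

/* upsample(char hi, uchar lo): hi * 256 + lo equals ((short)hi << 8) | lo
 * in two's complement and avoids left-shifting a negative value in C. */
static int16_t upsample_i8(int8_t hi, uint8_t lo) {
    return (int16_t)((int)hi * 256 + (int)lo);
}

/* A few spot checks that double as usage examples. */
static void integer_helpers_self_check(void) {
    assert(add_sat_i8(100, 100) == 127);
    assert(hadd_u32(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFFu);
    assert(rhadd_u32(0xFFFFFFFEu, 0xFFFFFFFFu) == 0xFFFFFFFFu);
    assert(clz_u32(0u) == 32u);
    assert(rotate_u32(0x80000001u, 1u) == 0x00000003u);
    assert(upsample_i8(1, 2) == 258);
}
```

The hadd and rhadd checks use operands whose 32-bit sum would wrap, which is the case the "does not modulo overflow" wording in the table is about.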
The following table describes fast integer functions that can be used for optimizing performance of kernels.
We use the generic type name gentype to indicate that the function can take int, int2, int3, int4, int8, int16, uint, uint2, uint3, uint4, uint8 or uint16 as the type for the arguments.
| Function | Description |
| --- | --- |
| gentype mad24(gentype x, gentype y, gentype z) | Multiply two 24-bit integer values x and y and add the 32-bit integer result to the 32-bit integer z. Refer to the definition of mul24 to see how the 24-bit integer multiplication is performed. |
| gentype mul24(gentype x, gentype y) | Multiply two 24-bit integer values x and y. x and y are 32-bit integers but only the low 24 bits are used to perform the multiplication. mul24 should only be used when values in x and y are in the range [-2^23, 2^23 - 1] if x and y are signed integers and in the range [0, 2^24 - 1] if x and y are unsigned integers. If x and y are not in this range, the multiplication result is implementation-defined. |
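The range requirement on mul24 and mad24 can be sketched on the host in plain C. This is an illustration only: `mul24_u32`, `mad24_u32`, `fits_mul24_signed` and `U24_MAX` are hypothetical names, only the unsigned scalar case is emulated, and the sketch assumes the 32-bit result holds the low 32 bits of the product, which the table does not state explicitly. Out-of-range inputs are rejected with an assertion because their result is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

#define U24_MAX 0xFFFFFFu  /* 2^24 - 1, the largest valid unsigned operand */

/* Returns 1 if a signed operand lies in [-2^23, 2^23 - 1], else 0. */
static int fits_mul24_signed(int32_t x) {
    return x >= -8388608 && x <= 8388607;
}

/* Emulates mul24 for uint operands that satisfy the documented range.
 * Assumption: the result is the low 32 bits of the full product. */
static uint32_t mul24_u32(uint32_t x, uint32_t y) {
    assert(x <= U24_MAX && y <= U24_MAX);
    return (uint32_t)((uint64_t)x * (uint64_t)y);
}

/* Emulates mad24: the 32-bit mul24 result plus the 32-bit value z. */
static uint32_t mad24_u32(uint32_t x, uint32_t y, uint32_t z) {
    return (uint32_t)(mul24_u32(x, y) + z);
}
```

A typical use is index arithmetic such as `mad24_u32(row, width, col)`, where the operands are known to be small enough to fit in 24 bits.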
Document Notes
For more information, see the OpenCL C Specification.
This page is extracted from the OpenCL C Specification. Fixes and changes should be made to the Specification, not directly.
Copyright
Copyright (c) 2014-2020 Khronos Group. This work is licensed under a Creative Commons Attribution 4.0 International License.