## Description

The following table describes the built-in integer functions that take scalar or vector arguments. The vector versions of the integer functions operate component-wise. The description is per-component.

We use the generic type name gentype to indicate that the function can take char, char{2|3|4|8|16}, uchar, uchar{2|3|4|8|16}, short, short{2|3|4|8|16}, ushort, ushort{2|3|4|8|16}, int, int{2|3|4|8|16}, uint, uint{2|3|4|8|16}, long, long{2|3|4|8|16}, ulong, or ulong{2|3|4|8|16} as the type for the arguments. We use the generic type name ugentype to refer to the unsigned version of gentype. For example, if gentype is char4, ugentype is uchar4.

We also use the generic type name sgentype to indicate that the function can take a scalar data type, i.e. char, uchar, short, ushort, int, uint, long, or ulong, as the type for the arguments. For built-in integer functions that take both gentype and sgentype arguments, the gentype argument must be a vector or scalar version of the sgentype argument. For example, if sgentype is uchar, gentype must be uchar or uchar{2|3|4|8|16}. For vector versions, sgentype is implicitly widened to gentype as described for arithmetic operators.

For any specific use of a function, the actual type has to be the same for all arguments and the return type unless otherwise specified.

| Function | Description |
|---|---|
| `ugentype abs(gentype x)` | Returns `\|x\|`. |
| `ugentype abs_diff(gentype x, gentype y)` | Returns `\|x - y\|` without modulo overflow. |
| `gentype add_sat(gentype x, gentype y)` | Returns `x + y` and saturates the result. |
| `gentype hadd(gentype x, gentype y)` | Returns `(x + y) >> 1`. The intermediate sum does not modulo overflow. |
| `gentype rhadd(gentype x, gentype y)` [32] | Returns `(x + y + 1) >> 1`. The intermediate sum does not modulo overflow. |
| `gentype clamp(gentype x, gentype minval, gentype maxval)`<br>`gentype clamp(gentype x, sgentype minval, sgentype maxval)` | Returns `min(max(x, minval), maxval)`. Results are undefined if `minval > maxval`. |
| `gentype clz(gentype x)` | Returns the number of leading 0-bits in `x`, starting at the most significant bit position. If `x` is 0, returns the size in bits of the type of `x` or, if `x` is a vector, of its component type. |
| `gentype ctz(gentype x)` | Returns the count of trailing 0-bits in `x`. If `x` is 0, returns the size in bits of the type of `x` or, if `x` is a vector, of its component type. |
| `gentype mad_hi(gentype a, gentype b, gentype c)` | Returns `mul_hi(a, b) + c`. |
| `gentype mad_sat(gentype a, gentype b, gentype c)` | Returns `a * b + c` and saturates the result. |
| `gentype max(gentype x, gentype y)`<br>`gentype max(gentype x, sgentype y)` | Returns `y` if `x < y`, otherwise it returns `x`. |
| `gentype min(gentype x, gentype y)`<br>`gentype min(gentype x, sgentype y)` | Returns `y` if `y < x`, otherwise it returns `x`. |
| `gentype mul_hi(gentype x, gentype y)` | Computes `x * y` and returns the high half of the product of `x` and `y`. |
| `gentype rotate(gentype v, gentype i)` | For each element in `v`, the bits are shifted left by the number of bits given by the corresponding element in `i` (subject to the usual shift modulo rules). Bits shifted off the left side of the element are shifted back in from the right. |
| `gentype sub_sat(gentype x, gentype y)` | Returns `x - y` and saturates the result. |
| `short upsample(char hi, uchar lo)`<br>`ushort upsample(uchar hi, uchar lo)`<br>`shortn upsample(charn hi, ucharn lo)`<br>`ushortn upsample(ucharn hi, ucharn lo)` | `result[i] = ((short)hi[i] << 8) \| lo[i]`<br>`result[i] = ((ushort)hi[i] << 8) \| lo[i]` |
| `int upsample(short hi, ushort lo)`<br>`uint upsample(ushort hi, ushort lo)`<br>`intn upsample(shortn hi, ushortn lo)`<br>`uintn upsample(ushortn hi, ushortn lo)` | `result[i] = ((int)hi[i] << 16) \| lo[i]`<br>`result[i] = ((uint)hi[i] << 16) \| lo[i]` |
| `long upsample(int hi, uint lo)`<br>`ulong upsample(uint hi, uint lo)`<br>`longn upsample(intn hi, uintn lo)`<br>`ulongn upsample(uintn hi, uintn lo)` | `result[i] = ((long)hi[i] << 32) \| lo[i]`<br>`result[i] = ((ulong)hi[i] << 32) \| lo[i]` |
| `gentype popcount(gentype x)` | Returns the number of non-zero bits in `x`. |
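To make the per-component descriptions concrete, the following is a minimal sketch in plain C (not OpenCL C) that emulates a few of these functions for one fixed scalar type each. The function names (`add_sat_i8`, `abs_diff_i8`, `clamp_i32`, `mul_hi_u32`, `rotate_u32`, `upsample_u8`) are hypothetical and chosen only for illustration; a real OpenCL implementation would typically map the built-ins to hardware instructions instead.

```c
#include <assert.h>
#include <stdint.h>

/* add_sat for char: compute in a wider type, then clamp to the char range. */
int8_t add_sat_i8(int8_t x, int8_t y) {
    int sum = (int)x + (int)y;            /* cannot overflow in int */
    if (sum > INT8_MAX) return INT8_MAX;
    if (sum < INT8_MIN) return INT8_MIN;
    return (int8_t)sum;
}

/* abs_diff for char: the result type is unsigned (ugentype),
   so |x - y| always fits, even for -128 and 127. */
uint8_t abs_diff_i8(int8_t x, int8_t y) {
    int d = (int)x - (int)y;
    return (uint8_t)(d < 0 ? -d : d);
}

/* clamp for int: min(max(x, minval), maxval).
   OpenCL leaves minval > maxval undefined; this sketch rejects it. */
int32_t clamp_i32(int32_t x, int32_t minval, int32_t maxval) {
    assert(minval <= maxval);
    int32_t lo = x < minval ? minval : x; /* max(x, minval) */
    return lo > maxval ? maxval : lo;     /* min(..., maxval) */
}

/* mul_hi for uint: high 32 bits of the full 64-bit product. */
uint32_t mul_hi_u32(uint32_t x, uint32_t y) {
    return (uint32_t)(((uint64_t)x * (uint64_t)y) >> 32);
}

/* rotate for uint: the shift count is taken modulo the bit width,
   and bits leaving on the left re-enter on the right. */
uint32_t rotate_u32(uint32_t v, uint32_t i) {
    uint32_t s = i & 31u;
    return s == 0 ? v : (v << s) | (v >> (32u - s));
}

/* upsample(uchar hi, uchar lo): result = ((ushort)hi << 8) | lo. */
uint16_t upsample_u8(uint8_t hi, uint8_t lo) {
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

For instance, under this sketch `add_sat_i8(100, 100)` saturates to 127 rather than wrapping, and `upsample_u8(0x12, 0x34)` yields `0x1234`.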

[32] Frequently vector operations need n + 1 bits temporarily to calculate a result. The rhadd instruction gives you an extra bit without needing to upsample and downsample. This can be a profound performance win.
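The "extra bit" point can be illustrated with a small plain C sketch for the uchar case. It assumes the well-known identities `x + y = 2*(x & y) + (x ^ y)` and `x + y = 2*(x | y) - (x ^ y)`, which let the halved sum be computed without ever forming the 9-bit intermediate. The function names are hypothetical, and this is only one way such a result could be produced, not a statement about how any OpenCL implementation does it.

```c
#include <assert.h>
#include <stdint.h>

/* Naive average carried out in 8 bits: the sum wraps modulo 256. */
uint8_t naive_avg_u8(uint8_t x, uint8_t y) {
    uint8_t sum = (uint8_t)(x + y);       /* the 9th bit is lost here */
    return (uint8_t)(sum >> 1);
}

/* hadd without widening: floor((x + y) / 2). */
uint8_t hadd_u8(uint8_t x, uint8_t y) {
    return (uint8_t)((x & y) + ((x ^ y) >> 1));
}

/* rhadd without widening: (x + y + 1) >> 1, i.e. the rounded-up average. */
uint8_t rhadd_u8(uint8_t x, uint8_t y) {
    return (uint8_t)((x | y) - ((x ^ y) >> 1));
}

/* Exhaustively compare both tricks against a widened reference. */
void check_all_u8(void) {
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned y = 0; y < 256; y++) {
            assert(hadd_u8((uint8_t)x, (uint8_t)y) == (uint8_t)((x + y) >> 1));
            assert(rhadd_u8((uint8_t)x, (uint8_t)y) == (uint8_t)((x + y + 1) >> 1));
        }
    }
}
```

With inputs 200 and 100, the naive 8-bit version returns 22 because the sum 300 wraps to 44, while `hadd_u8` returns the expected 150.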

The following table describes fast integer functions that can be used for optimizing performance of kernels. We use the generic type name gentype to indicate that the function can take int, int2, int3, int4, int8, int16, uint, uint2, uint3, uint4, uint8 or uint16 as the type for the arguments.

| Function | Description |
|---|---|
| `gentype mad24(gentype x, gentype y, gentype z)` | Multiply two 24-bit integer values `x` and `y` and add the 32-bit integer result to the 32-bit integer `z`. Refer to the definition of `mul24` to see how the 24-bit integer multiplication is performed. |
| `gentype mul24(gentype x, gentype y)` | Multiply two 24-bit integer values `x` and `y`. `x` and `y` are 32-bit integers, but only the low 24 bits are used to perform the multiplication. `mul24` should only be used when the values in `x` and `y` are in the range [-2<sup>23</sup>, 2<sup>23</sup> - 1] if `x` and `y` are signed integers, and in the range [0, 2<sup>24</sup> - 1] if `x` and `y` are unsigned integers. If `x` and `y` are not in this range, the multiplication result is implementation-defined. |
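The range requirements above can be sketched in plain C as follows. This is an illustrative emulation for the uint case only, with hypothetical function names. It assumes that the 32-bit result keeps the low 32 bits of the product, and it simply rejects out-of-range inputs, whereas OpenCL leaves their result implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Signed operands must lie in [-2^23, 2^23 - 1]. */
int in_mul24_range_i32(int32_t v) {
    return v >= -8388608 && v <= 8388607;
}

/* Unsigned operands must lie in [0, 2^24 - 1]. */
int in_mul24_range_u32(uint32_t v) {
    return v <= 0xFFFFFFu;
}

/* mul24 for uint. Assumption: the result is the low 32 bits of the
   (up to 48-bit) product. Out-of-range inputs are rejected here. */
uint32_t mul24_u32(uint32_t x, uint32_t y) {
    assert(in_mul24_range_u32(x) && in_mul24_range_u32(y));
    return (uint32_t)((uint64_t)x * (uint64_t)y);
}

/* mad24 for uint: mul24(x, y) + z, wrapping modulo 2^32. */
uint32_t mad24_u32(uint32_t x, uint32_t y, uint32_t z) {
    return mul24_u32(x, y) + z;
}
```

A caller that cannot guarantee the range would, under these assumptions, check the operands with the range helpers first and fall back to an ordinary 32-bit multiply otherwise.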