The OpenCL C programming language implements the following built-in
functions that operate on a work-group level.
These built-in functions must be encountered by all work-items in a
work-group executing the kernel.
We use the generic type name gentype to indicate the built-in data types
half [59], int, uint, long, ulong, float or double [60] as the type for the
arguments.
[59] Only if the cl_khr_fp16 extension is supported and has been enabled.
[60] Only if double precision is supported.
Function | Description |
int work_group_all(int predicate) | Evaluates predicate for all work-items in the work-group and returns a non-zero value if predicate evaluates to non-zero for all work-items in the work-group. |
int work_group_any(int predicate) | Evaluates predicate for all work-items in the work-group and returns a non-zero value if predicate evaluates to non-zero for any work-item in the work-group. |
gentype work_group_broadcast(gentype a, size_t local_id) | Broadcasts the value of a for the work-item identified by local_id to all work-items in the work-group. local_id must be the same value for all work-items in the work-group. |
gentype work_group_reduce_<op>(gentype x) | Returns the result of the reduction operation specified by <op> for all values of x specified by work-items in the work-group. |
gentype work_group_scan_exclusive_<op>(gentype x) | Performs an exclusive scan operation specified by <op> over all values of x specified by work-items in the work-group. The scan result is returned to each work-item. The scan order is defined by increasing 1D linear global ID within the work-group. |
gentype work_group_scan_inclusive_<op>(gentype x) | Performs an inclusive scan operation specified by <op> over all values of x specified by work-items in the work-group. The scan result is returned to each work-item. The scan order is defined by increasing 1D linear global ID within the work-group. |
The <op> in work_group_reduce_<op>, work_group_scan_exclusive_<op> and work_group_scan_inclusive_<op> defines the operator and can be add, min or max.
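The semantics in the table can be illustrated with a sequential, host-side model in plain C. This is only a sketch under stated assumptions: the `model_` function names are hypothetical, element i of each array stands for the argument supplied by the i-th work-item in the work-group, and only the int variants are shown. It does not describe how any OpenCL implementation computes these results.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side model only (hypothetical names). Element i of each array
   stands for the argument passed by the i-th work-item; n is the
   work-group size. */

/* Models work_group_all: non-zero only if every predicate is non-zero. */
int model_work_group_all(const int *pred, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; ++i)
        if (pred[i] == 0)
            return 0;
    return 1;
}

/* Models work_group_any: non-zero if at least one predicate is non-zero. */
int model_work_group_any(const int *pred, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; ++i)
        if (pred[i] != 0)
            return 1;
    return 0;
}

/* Models work_group_broadcast: every work-item receives the value
   supplied by the work-item identified by local_id. */
int model_work_group_broadcast(const int *vals, size_t n, size_t local_id)
{
    assert(local_id < n);
    return vals[local_id];
}

/* Models work_group_reduce_add. */
int model_work_group_reduce_add(const int *vals, size_t n)
{
    assert(n > 0);
    int acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += vals[i];
    return acc;
}

/* Models work_group_reduce_min. */
int model_work_group_reduce_min(const int *vals, size_t n)
{
    assert(n > 0);
    int acc = vals[0];
    for (size_t i = 1; i < n; ++i)
        if (vals[i] < acc)
            acc = vals[i];
    return acc;
}

/* Models work_group_reduce_max. */
int model_work_group_reduce_max(const int *vals, size_t n)
{
    assert(n > 0);
    int acc = vals[0];
    for (size_t i = 1; i < n; ++i)
        if (vals[i] > acc)
            acc = vals[i];
    return acc;
}
```

In this model every work-item would observe the same return value from the all, any, broadcast and reduce functions, which is why each function returns a single int rather than one value per work-item.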
The inclusive scan operation takes a binary operator op with an identity I
and n (where n is the size of the work-group) elements [a_{0}, a_{1}, …
a_{n-1}] and returns [a_{0}, (a_{0} op a_{1}), … (a_{0} op a_{1} op …
op a_{n-1})].
If op = add, the identity I is 0.
If op = min, the identity I is INT_MAX, UINT_MAX, LONG_MAX, ULONG_MAX for the
int, uint, long, ulong types, respectively, and is +INF for floating-point
types.
Similarly, if op = max, the identity I is INT_MIN, 0, LONG_MIN, 0 for the
int, uint, long, ulong types, respectively, and is -INF for floating-point
types.
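The identity values listed above can be checked on the host with a few lines of plain C: combining the identity I with any value x under the operator yields x again. This is a sketch under assumptions: the helper and `check_` names are hypothetical, host C `int`, `unsigned int` and `float` stand in for the OpenCL C types, and the `INFINITY` macro from `<math.h>` stands in for +INF.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>   /* INFINITY macro only; no libm functions are called */

/* Hypothetical helpers standing in for the min and max operators. */
static int          min_i(int a, int b)                   { return a < b ? a : b; }
static int          max_i(int a, int b)                   { return a > b ? a : b; }
static unsigned int min_u(unsigned int a, unsigned int b) { return a < b ? a : b; }
static unsigned int max_u(unsigned int a, unsigned int b) { return a > b ? a : b; }
static float        min_f(float a, float b)               { return a < b ? a : b; }
static float        max_f(float a, float b)               { return a > b ? a : b; }

/* Each function returns non-zero if (I op x) == x for add, min and max. */
int check_int_identities(int x)
{
    return (0 + x == x)
        && (min_i(INT_MAX, x) == x)
        && (max_i(INT_MIN, x) == x);
}

int check_uint_identities(unsigned int x)
{
    return (0u + x == x)
        && (min_u(UINT_MAX, x) == x)
        && (max_u(0u, x) == x);
}

int check_float_identities(float x)
{
    assert(x == x);  /* NaN never compares equal to itself, so it is excluded */
    return (0.0f + x == x)
        && (min_f(INFINITY, x) == x)
        && (max_f(-INFINITY, x) == x);
}
```

For example, `check_int_identities(42)` returns non-zero, because 0 + 42, min(INT_MAX, 42) and max(INT_MIN, 42) all evaluate to 42.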
Consider the following example:
void foo(int *p)
{
    ...
    /* Each work-item contributes the element selected by its local ID. */
    int prefix_sum_val = work_group_scan_inclusive_add(
                             p[get_local_id(0)]);
}
For the example above, assume that the work-group size is 8 and that p points to the elements [3 1 7 0 4 1 6 3]. Work-item 0 calls work_group_scan_inclusive_add with 3 and receives 3. Work-item 1 calls work_group_scan_inclusive_add with 1 and receives 4. The full set of values returned by work_group_scan_inclusive_add for work-items 0 … 7 is [3 4 11 11 15 16 22 25].
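The inclusive scan values quoted for [3 1 7 0 4 1 6 3] can be reproduced with a sequential host-side model in plain C. This is a minimal sketch, assuming a hypothetical function name `model_scan_inclusive_add` and assuming that array index i corresponds to the i-th work-item in scan order. A real implementation is free to compute the same results in parallel.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side model only (hypothetical name). out[i] is the value that the
   i-th work-item in scan order would receive from
   work_group_scan_inclusive_add when it passes in[i]. */
void model_scan_inclusive_add(const int *in, int *out, size_t n)
{
    assert(in != NULL && out != NULL);
    int running = 0;                 /* identity I for op = add */
    for (size_t i = 0; i < n; ++i) {
        running += in[i];            /* include the current element first */
        out[i] = running;            /* a_0 op a_1 op ... op a_i */
    }
}
```

Calling the model with the eight input elements from the example fills `out` with 3, 4, 11, 11, 15, 16, 22, 25.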
The exclusive scan operation takes a binary associative operator op with an identity I and n (where n is the size of the work-group) elements [a_{0}, a_{1}, … a_{n-1}] and returns [I, a_{0}, (a_{0} op a_{1}), … (a_{0} op a_{1} op … op a_{n-2})]. For the example above, the exclusive scan add operation on the ordered set [3 1 7 0 4 1 6 3] would return [0 3 4 11 11 15 16 22].
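The exclusive scan definition differs from the inclusive one only in when the running value is written out: each work-item receives the combination of all earlier elements, and the first receives the identity I. The following plain C sketch models this sequentially on the host. The function names are hypothetical, array index i is assumed to correspond to the i-th work-item in scan order, and the min variant is included only to show a non-zero identity.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Host-side model only (hypothetical names). */

/* Models work_group_scan_exclusive_add for int. */
void model_scan_exclusive_add(const int *in, int *out, size_t n)
{
    assert(in != NULL && out != NULL);
    int running = 0;                 /* identity I for op = add */
    for (size_t i = 0; i < n; ++i) {
        out[i] = running;            /* excludes the current element */
        running += in[i];
    }
}

/* Models work_group_scan_exclusive_min for int. */
void model_scan_exclusive_min(const int *in, int *out, size_t n)
{
    assert(in != NULL && out != NULL);
    int running = INT_MAX;           /* identity I for op = min on int */
    for (size_t i = 0; i < n; ++i) {
        out[i] = running;            /* excludes the current element */
        if (in[i] < running)
            running = in[i];
    }
}
```

For the input [3 1 7 0 4 1 6 3], the add model produces [0 3 4 11 11 15 16 22], matching the text, and the min model produces [INT_MAX 3 1 1 0 0 0 0].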
The order of floating-point operations is not guaranteed for the
work_group_reduce_<op>, work_group_scan_inclusive_<op> and
work_group_scan_exclusive_<op> built-in functions that operate on |
