## Name Strings

cl_intel_required_subgroup_size

## Contact

Ben Ashbaugh, Intel (ben 'dot' ashbaugh 'at' intel 'dot' com)

## Contributors

Ben Ashbaugh, Intel

## Status

Final Draft

## Version

Built On: 2019-10-23
Revision: 3

## Dependencies

Support for OpenCL 2.1, cl_khr_subgroups, or cl_intel_subgroups is required. This extension is written against revision 23 of the OpenCL 2.1 API specification, against revision 30 of the OpenCL 2.0 OpenCL C specification, against version 31 of the OpenCL 2.0 Extensions specification, and against version 3 of the cl_intel_subgroups specification.

## Overview

The goal of this extension is to allow programmers to optionally specify the required subgroup size for a kernel function. This information is important for the correctness of many subgroup algorithms, and in some cases may be used by the compiler to generate more optimal code.

## New API Functions

None.

## New API Enums

Accepted as the param_name parameter of clGetDeviceInfo:

CL_DEVICE_SUB_GROUP_SIZES_INTEL                 0x4108

Accepted as the param_name parameter of clGetKernelWorkGroupInfo:

CL_KERNEL_SPILL_MEM_SIZE_INTEL                  0x4109

Accepted as the param_name parameter of clGetKernelSubGroupInfo and/or clGetKernelSubGroupInfoKHR:

CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL          0x410A

## New OpenCL C Optional Attribute Qualifiers

Optional __kernel qualifier:

__attribute__((intel_reqd_sub_group_size(<int>)))

## Modifications to the OpenCL API Specification

### Additions to Table 4.3 - "OpenCL Device Queries"

| cl_device_info | Return Type | Description |
| --- | --- | --- |
| CL_DEVICE_SUB_GROUP_SIZES_INTEL | size_t[] | Returns the set of subgroup sizes supported by the device. |
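As an illustration of how the result of this query might be used, the following sketch checks whether a desired subgroup size is in the supported set before a kernel requiring that size is built. The helper name `subgroup_size_supported` is hypothetical and not part of this extension. The array is assumed to have been filled by the usual two-call clGetDeviceInfo pattern, which is shown only in the comment.

```c
#include <stddef.h>

/*
 * Returns 1 if `wanted` appears in `sizes`, 0 otherwise.
 *
 * `sizes` and `count` are assumed to hold the result of a
 * CL_DEVICE_SUB_GROUP_SIZES_INTEL query, obtained roughly as:
 *
 *   size_t bytes = 0;
 *   clGetDeviceInfo(dev, CL_DEVICE_SUB_GROUP_SIZES_INTEL, 0, NULL, &bytes);
 *   size_t *sizes = malloc(bytes);
 *   clGetDeviceInfo(dev, CL_DEVICE_SUB_GROUP_SIZES_INTEL, bytes, sizes, NULL);
 *   size_t count = bytes / sizeof(size_t);
 *
 * (hypothetical helper, not defined by the extension)
 */
int subgroup_size_supported(const size_t *sizes, size_t count, size_t wanted)
{
    for (size_t i = 0; i < count; ++i) {
        if (sizes[i] == wanted)
            return 1;
    }
    return 0;
}
```

Even when a size is reported as supported, compilation of a particular kernel with that required subgroup size may still fail, so an application would typically also check the program build status.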

### Additions to Table 5.21 - "clGetKernelWorkGroupInfo parameter queries":

| cl_kernel_work_group_info | Return Type | Info. returned in param_value |
| --- | --- | --- |
| CL_KERNEL_SPILL_MEM_SIZE_INTEL | cl_ulong | Returns the amount of spill memory used by a kernel. The meaning of this value varies from implementation to implementation; however, a return value of 0 always indicates that the compiler was able to compile the kernel to fit into the device’s register file without spilling registers to memory. |

### Additions to "clGetKernelSubGroupInfo parameter queries":

This is Table 5.22 - "clGetKernelSubGroupInfo parameter queries" in the OpenCL 2.1 API spec, in Section 9.17.2.1 for clGetKernelSubGroupInfoKHR in the OpenCL 2.0 Extensions spec, and in the section describing the changes to Section 5.9.3 for clGetKernelSubGroupInfoKHR in the cl_intel_subgroups spec:

| cl_kernel_sub_group_info | Input Type | Return Type | Info. returned in param_value |
| --- | --- | --- | --- |
| CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL | ignored | size_t | Returns the subgroup size specified by the __attribute__((intel_reqd_sub_group_size(<int>))) qualifier. Refer to section 6.7.2. If the subgroup size is not specified using the above attribute qualifier then 0 is returned. |

## Modifications to the OpenCL C Specification

### Additions to Section 6.7.2 - "Optional Attribute Qualifiers"

The optional __attribute__((intel_reqd_sub_group_size(<int>))) can be used to indicate that the kernel must be compiled and executed with the specified subgroup size. When this attribute is present, get_max_sub_group_size() is guaranteed to return the specified integer value. This is important for the correctness of many subgroup algorithms, and in some cases may be used by the compiler to generate more optimal code.
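A minimal sketch of how the attribute might appear in kernel source is given below, written as a host-side C helper that formats the OpenCL C text for a chosen subgroup size. The helper `make_kernel_source` and the kernel `record_sg_size` are hypothetical names used only for illustration. Depending on which subgroup extension provides get_max_sub_group_size(), the kernel source may also need an extension pragma, which is omitted here.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Writes OpenCL C source for a hypothetical kernel `record_sg_size`
 * into `buf`, requiring `sg_size` as its subgroup size.
 * Returns the snprintf result: the length of the full source text,
 * which is >= buf_len if the buffer was too small.
 */
int make_kernel_source(char *buf, size_t buf_len, unsigned sg_size)
{
    return snprintf(buf, buf_len,
        "__kernel __attribute__((intel_reqd_sub_group_size(%u)))\n"
        "void record_sg_size(__global uint *out)\n"
        "{\n"
        "    /* With the attribute present, this equals the required size. */\n"
        "    out[get_global_id(0)] = get_max_sub_group_size();\n"
        "}\n",
        sg_size);
}
```

The resulting string could then be passed to clCreateProgramWithSource and built as usual. If the device does not support the requested size for this kernel, the build is expected to fail.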

Note that there is no guarantee for the value of get_sub_group_size() even when this attribute is present, particularly when the work-group size is not evenly divisible by the required subgroup size.

Note as well that some devices may support a limited number of subgroup sizes, and that some devices may not support all language constructs with all subgroup sizes. This means that some kernels may fail compilation with one required subgroup size and succeed with another required subgroup size, even if both subgroup sizes are supported by the device.

Finally, note that requiring one subgroup size (particularly, a larger subgroup size) may require more spill memory than another subgroup size, and may negatively impact application performance.
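To make the earlier note about get_sub_group_size() concrete, the following sketch models one plausible partitioning of a work-group into subgroups: every subgroup is full except possibly the last one. This partitioning is an assumption made for illustration, since this extension only states that the value is not guaranteed. The function names are hypothetical.

```c
#include <stddef.h>

/*
 * Number of subgroups in a work-group of `wg_size` work-items,
 * ASSUMING full subgroups of `sg_size` with one partial subgroup
 * at the end when wg_size is not a multiple of sg_size.
 * Requires sg_size > 0.
 */
size_t num_sub_groups(size_t wg_size, size_t sg_size)
{
    return (wg_size + sg_size - 1) / sg_size; /* ceiling division */
}

/*
 * Size of subgroup `sg_id` under the same assumption.
 * Requires sg_id < num_sub_groups(wg_size, sg_size).
 */
size_t sub_group_size_at(size_t wg_size, size_t sg_size, size_t sg_id)
{
    size_t remaining = wg_size - sg_id * sg_size;
    return remaining < sg_size ? remaining : sg_size;
}
```

Under this model, a work-group of 40 work-items with a required subgroup size of 16 has three subgroups of sizes 16, 16, and 8, so an algorithm that assumes 16 active work-items in every subgroup would be incorrect for the last one.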

## Issues

None.

## Revision History

| Rev | Date | Author | Changes |
| --- | --- | --- | --- |
| 1 | 2016-07-14 | Ben Ashbaugh | First public revision. |
| 2 | 2018-11-15 | Ben Ashbaugh | Conversion to asciidoc. |
| 3 | 2019-09-17 | Ben Ashbaugh | Minor formatting fixes for asciidoctor. |