Currently there are only the software-abstracted functions such as get_num_groups and get_local_id. How about functions for querying the hardware those are mapped to, something like get_compute_unit_id or get_processing_element_id (plus the corresponding size functions)? For example, if one were to use atomic operations (I know, they're slow) to perform some kind of reduction, then knowing which compute unit a work-group was mapped to, and which processing element a work-item was mapped to, could permit doing local atomic operations rather than global atomic operations.
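
To make the reduction scenario concrete, here is a minimal sketch of what the existing built-ins already allow: accumulate with atomics in local memory, then issue one global atomic per work-group. It assumes OpenCL 1.1 or later (where 32-bit atomic_add on __local and __global int is core), a 1-D NDRange, and a result buffer zeroed by the host. The kernel name reduce_sum and its argument names are hypothetical. Note that it is keyed on the work-group via get_local_id, not on any hardware compute unit ID, so it only illustrates the local-versus-global atomic distinction, not the proposed query functions.

```c
// Sketch only: two-level sum reduction using existing OpenCL C built-ins.
// reduce_sum, input, result and n are hypothetical names for illustration.
__kernel void reduce_sum(__global const int *input,
                         __global int *result,  // assumed zeroed by the host before launch
                         const uint n)
{
    // One accumulator per work-group, held in local memory.
    __local int group_sum;

    const size_t gid = get_global_id(0);
    const size_t lid = get_local_id(0);

    // A single work-item initialises the accumulator.
    if (lid == 0)
        group_sum = 0;
    barrier(CLK_LOCAL_MEM_FENCE);

    // Local atomic: contention is limited to this work-group.
    if (gid < n)
        atomic_add(&group_sum, input[gid]);
    barrier(CLK_LOCAL_MEM_FENCE);

    // One global atomic per work-group instead of one per work-item.
    if (lid == 0)
        atomic_add(result, group_sum);
}
```

With this layout the number of global atomic operations equals the number of work-groups, whatever compute unit each group happens to run on.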