There is currently no way to send data directly between work-groups, and using async_work_group_copy for this is far from optimal, since the data has to go through global memory. My suggestion is to add something like this:

event_t async_direct_work_group_copy (
    wgtypen dst_work_group,
    __local gentype *dst,
    const __global gentype *src,
    size_t num_gentypes,
    event_t event);


dst_work_group - the number of the destination work-group in the ND-range; its type is wgtypen, with n = 1, 2, ..., MAX_DIM
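
For comparison, here is a rough sketch of the workaround that is possible today with the existing async_work_group_copy: each work-group writes its local data to a slot in global memory, and a second kernel launch lets another work-group read that slot back into its own local memory. Two launches are used because OpenCL has no barrier across work-groups inside a kernel. The kernel names, the TILE size, the staging buffer and the "read from the next group" pattern are all made up for illustration, and a 1-D ND-range is assumed.

```c
// OpenCL C sketch. Kernel names, TILE and the `staging` buffer are hypothetical.
#define TILE 64  // elements exchanged per work-group (assumed)

// Pass 1: every work-group publishes its local tile to its own slot in global memory.
__kernel void publish_tiles(__global const float *in, __global float *staging)
{
    __local float tile[TILE];
    const size_t group = get_group_id(0);

    // Load this group's tile into local memory (stands in for real work).
    event_t e = async_work_group_copy(tile, in + group * TILE, TILE, 0);
    wait_group_events(1, &e);

    // Local -> global: the only route by which another group can see the data.
    e = async_work_group_copy(staging + group * TILE, tile, TILE, 0);
    wait_group_events(1, &e);
}

// Pass 2: enqueued after pass 1 has finished, so every slot is complete.
__kernel void consume_tiles(__global const float *staging, __global float *out)
{
    __local float tile[TILE];
    const size_t group = get_group_id(0);
    // Read the slot written by the next group, wrapping around (assumed pattern).
    const size_t src_group = (group + 1) % get_num_groups(0);

    // Global -> local: fetch the neighbouring group's tile.
    event_t e = async_work_group_copy(tile, staging + src_group * TILE, TILE, 0);
    wait_group_events(1, &e);

    // Write the received tile back out so the host can inspect it.
    e = async_work_group_copy(out + group * TILE, tile, TILE, 0);
    wait_group_events(1, &e);
}
```

Every exchanged element makes a full round trip through global memory plus an extra kernel launch, which is the overhead a direct work-group to work-group copy would be meant to avoid.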