This may seem an odd request, but I do think it would open up new possibilities for OpenCL if a goto statement (uniform across a warp/wavefront) were available. Let me try to explain.
1) It would only be available when taken by a whole warp/wavefront, or by all work-items in a work-group, so it wouldn't break the flow-divergence rule.
2) Because OpenCL doesn't have a call stack, it would let programmers implement (complex) recursive algorithms by using a manually implemented call stack and repositioning the instruction pointer when a recursive function call ends.
That's not bad practice; OpenCL is quite low-level anyway.
3) It could make the assembly code (especially AMD IL) easier to read and maybe more reliable. I'm sorry to say it to AMD, but AMD IL isn't as good as NVIDIA PTX: on very big and tricky kernels the AMD code fails HARD and the NVIDIA PTX doesn't. I think the reasons are overly aggressive optimizations and the lack of a jump statement in the assembly code (whereas PTX has the bra.uni instruction).
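
To make point 2 more concrete, here is a rough sketch of what I mean by a manual call stack, written as plain C so it can be tried on the host. The names (Frame, fib_manual, MAX_DEPTH) and the choice of Fibonacci are just mine for illustration. Since there is no goto to rely on, each frame stores a "resume" index and a switch inside a loop plays the role of the instruction pointer; with a uniform goto, that dispatch could presumably become a direct jump to the resume label.

```c
/* Hypothetical example: recursion emulated with a hand-rolled call stack. */
#define MAX_DEPTH 64

/* One frame of the manual call stack. */
typedef struct {
    int n;        /* argument of this "call" */
    int partial;  /* result of the first recursive call, fib(n-1) */
    int resume;   /* where to continue when a callee returns: 0, 1 or 2 */
} Frame;

/* Computes fib(n) without any recursive function call.
   Returns -1 if n would overflow the fixed-size stack. */
static int fib_manual(int n)
{
    Frame stack[MAX_DEPTH];
    int sp = 0;   /* index of the current frame */
    int ret = 0;  /* "return register" shared by all frames */

    if (n >= MAX_DEPTH)
        return -1;

    stack[0].n = n;
    stack[0].partial = 0;
    stack[0].resume = 0;

    while (sp >= 0) {
        Frame *f = &stack[sp];
        switch (f->resume) {
        case 0:  /* function entry */
            if (f->n < 2) {          /* base case: "return n" */
                ret = f->n;
                sp--;
                break;
            }
            f->resume = 1;           /* come back at case 1 */
            sp++;                    /* "call" fib(n-1) */
            stack[sp].n = f->n - 1;
            stack[sp].partial = 0;
            stack[sp].resume = 0;
            break;
        case 1:  /* back from fib(n-1) */
            f->partial = ret;
            f->resume = 2;           /* come back at case 2 */
            sp++;                    /* "call" fib(n-2) */
            stack[sp].n = f->n - 2;
            stack[sp].partial = 0;
            stack[sp].resume = 0;
            break;
        default: /* back from fib(n-2): "return partial + ret" */
            ret = f->partial + ret;
            sp--;
            break;
        }
    }
    return ret;
}
```

Calling fib_manual(10) gives 55. In a real kernel the stack would sit in private or local memory with a fixed depth, and the switch-in-a-loop is exactly the part a goto restricted to the whole warp/wavefront could replace.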