1) Kernel variable access:
In a complex application we can have several levels of function calls, with the kernel calling helper functions that in turn call other helpers.
The problem is that we often need to access the kernel parameters from those functions, such as the global buffers or simply the other values passed from the host. When you have a lot of parameters (say 30, for example) you can:
a) pass all the parameters to all the functions. In this case the program is really a pain to maintain!
b) create a structure with all the kernel parameters and pass it (1 parameter) to all the functions.
The problem is that on each kernel call this structure can use a lot of memory, and on some devices (like NVIDIA) it will greatly reduce performance, simply because it adds register pressure and so you get more and more global memory accesses!
In CUDA, the kernel parameters are seen as a kind of 'static' field! That is useful!
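To make option b) concrete, here is a minimal sketch written in plain C so that it compiles on the host. All the names (KernelParams, transform, process_item, kernel_body) are made up for illustration. In real OpenCL C the entry point would be a __kernel function, the pointers would be __global, and gid would come from get_global_id(0).

```c
#include <stddef.h>

/* Hypothetical bundle of the kernel arguments (only 4 here, not 30).
   In OpenCL C the two pointers would be declared __global. */
typedef struct {
    const float *input;
    float       *output;
    float        scale;
    float        offset;
} KernelParams;

/* Innermost helper: gets one pointer instead of every argument. */
static float transform(const KernelParams *p, size_t i)
{
    return p->input[i] * p->scale + p->offset;
}

/* Middle level: just forwards the same pointer one level down. */
static void process_item(const KernelParams *p, size_t i)
{
    p->output[i] = transform(p, i);
}

/* Stand-in for the kernel entry point. It packs its arguments into
   the structure once, then passes a single parameter down the chain. */
void kernel_body(const float *input, float *output,
                 float scale, float offset, size_t gid)
{
    KernelParams p = { input, output, scale, offset };
    process_item(&p, gid);
}
```

Calling kernel_body for gid 0 to 3 on the input {1, 2, 3, 4} with scale 2 and offset 1 fills the output with {3, 5, 7, 9}. The structure is rebuilt for every work-item, and that is exactly the per-call cost described above.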
2) Support for C++ classes and pointers in CL kernels.
We need more support for complex applications.