I'd like to develop a graphics program in which end users can write their own arbitrary 'script' functions inside my main program. The catch is that their 'script' has to run on the GPU for speed. Ideally, the script would be compiled or JIT-compiled rather than interpreted.
For this kind of thing, I imagine I would need to redistribute the OpenCL compiler along with my own software.
To complicate things slightly, I'm currently using CUDA for the main code base. However, since OpenCL generates PTX, as CUDA does, I was thinking that I could redistribute the OpenCL compiler with my software solely to generate PTX files whenever the user wants a function to run on the GPU. From what I understand, I can then use the CUDA Driver API to load these PTX files (treating them somewhat like DLLs) and run them transparently alongside the existing CUDA code (I think this requires CUDA 2.1 or higher).
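To make the first step of that pipeline concrete, here is a rough sketch of how I imagine turning a user's one-line script into kernel source before handing it to the compiler. Everything in it is an assumption for illustration: the helper name buildKernelSource, the fixed (in, out, n) parameter list, and the variable x that the user's expression is written against are my own inventions, and the template is OpenCL C only because that is the compiler I am considering. Compiling the string to PTX and loading it through the Driver API are not shown.

```cpp
#include <string>

// Hypothetical helper (not part of CUDA or OpenCL): wraps a user-supplied
// expression in a fixed OpenCL C kernel template and returns the source text.
// Assumptions for this sketch: the script is a single float expression written
// in terms of `x`, and every kernel takes the same (in, out, n) parameters.
std::string buildKernelSource(const std::string& kernelName,
                              const std::string& userExpr)
{
    std::string src;
    src += "__kernel void " + kernelName +
           "(__global const float* in, __global float* out, int n)\n";
    src += "{\n";
    // One work-item per element; the bounds check guards the last work-group.
    src += "    int i = get_global_id(0);\n";
    src += "    if (i < n) {\n";
    src += "        float x = in[i];\n";
    // The user's script is spliced in verbatim; no validation is done here.
    src += "        out[i] = " + userExpr + ";\n";
    src += "    }\n";
    src += "}\n";
    return src;
}
```

For example, buildKernelSource("user_fn", "x * 2.0f") would return the source of a kernel named user_fn whose body assigns out[i] = x * 2.0f. That string is what I would pass to the compiler at runtime.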
I would appreciate any insight into whether what I am asking is possible. Are there other ways to allow arbitrary user code at runtime? Examples might include the new GPU.NET or, at a stretch, writing my own PTX and/or CUBIN compiler (shudder).
A few mini-questions also:
1: To make my task easier, can OpenCL directly generate CUBIN files?
2: Can I redistribute the OpenCL compiler, and if so, what is the latency of compiling from source to an object file (assume a tiny function of about 4 lines of source)?
3: If I were to switch to OpenCL entirely, could I use it to create DLL files and have the main code execute an arbitrary function from within the DLL (I think a function pointer would be needed)?