I have two algorithms. The first processes an image and computes a set of points of interest (such as corners). The second takes an image and a set of interest points as input and, for every point, returns an array of floats.
The two algorithms are independent, in the sense that I do not want to hard-wire them together: I want to be able to swap the interest point detector and the second algorithm easily, at run time if needed.
I see two possibilities.
The first is to define two different kernels, call them A (first algorithm) and B (second algorithm), and do the following:
1. prepare kernel A
2. call A
3. get results from A
4. prepare kernel B
5. call B (passing it the results of A)
6. get results from B
The only question about this approach concerns step 3: I do not want to transfer memory from the device to the host, so I would like to skip enqueueing a copy command and leave the results of A on the device, since B will run there too. Is this OK? (I am sure the answer is a big YES, but I am really just starting with OpenCL...)
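To make the data flow concrete, here is a minimal sketch of what I have in mind. Everything in it is a placeholder of mine: the toy kernels `detect_A` and `describe_B`, the per-pixel flag buffer `is_point` (a real detector would probably produce a compacted list of points), and the shim macros that let the OpenCL C kernels compile as plain C so the flow can be tried on the host. On a real device the two loops would be `clEnqueueNDRangeKernel` calls, and `is_point` would be a `cl_mem` buffer passed to both kernels with `clSetKernelArg`, which, as far as I understand, is all that is needed when both kernels go through the same in-order command queue.

```c
#include <assert.h>
#include <stddef.h>

/* Host-only shims (my assumption) so the OpenCL C kernels below also compile
   as plain C. Drop them when building the kernels for a real device. */
#define __kernel
#define __global
static size_t current_gid;  /* index of the emulated work-item */
static size_t get_global_id(unsigned dim) { (void)dim; return current_gid; }

/* Kernel A (toy detector): flags interior pixels that are strict local maxima. */
__kernel void detect_A(__global const float *image, __global int *is_point, int n)
{
    int i = (int)get_global_id(0);
    if (i >= n) return;
    is_point[i] = i > 0 && i < n - 1 &&
                  image[i] > image[i - 1] && image[i] > image[i + 1];
}

/* Kernel B (toy descriptor): two floats per flagged pixel, zeros elsewhere. */
__kernel void describe_B(__global const float *image, __global const int *is_point,
                         __global float *desc, int n)
{
    int i = (int)get_global_id(0);
    if (i >= n) return;
    desc[2 * i]     = is_point[i] ? image[i] - image[i - 1] : 0.0f;
    desc[2 * i + 1] = is_point[i] ? image[i] - image[i + 1] : 0.0f;
}

/* Host emulation of steps 1 to 6; is_point plays the role of the device buffer. */
void run_A_then_B(const float *image, int *is_point, float *desc, int n)
{
    assert(image && is_point && desc && n >= 0);
    /* Steps 1-2: on a device, clSetKernelArg then clEnqueueNDRangeKernel for A. */
    for (current_gid = 0; current_gid < (size_t)n; ++current_gid)
        detect_A(image, is_point, n);
    /* Step 3: nothing is read back; A's output stays where A wrote it. */
    /* Steps 4-5: B gets the very same buffer as one of its arguments. */
    for (current_gid = 0; current_gid < (size_t)n; ++current_gid)
        describe_B(image, is_point, desc, n);
    /* Step 6: on a device, one clEnqueueReadBuffer would fetch desc here. */
}
```

Called on the six-pixel image `{0, 3, 1, 1, 5, 2}`, this flags pixels 1 and 4 and gives them the descriptors (3, 2) and (4, 3).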
The second approach would be different: I would have a kernel C that calls A and, for every point found by A, calls B. I can't see how this could be made to work.
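The nearest thing I know how to write is the variant where A and B are ordinary functions rather than kernels, sketched below. It rests on a strong assumption of mine: that detection is a per-pixel test, so each work-item can decide about its own pixel without waiting for the others. The names `is_interest_point`, `describe_point` and `fused_C` are hypothetical, and the shim macros exist only so the kernel compiles as plain C for a host-side try-out.

```c
#include <assert.h>
#include <stddef.h>

/* Host-only shims (my assumption) so the kernel also compiles as plain C. */
#define __kernel
#define __global
static size_t current_gid;  /* index of the emulated work-item */
static size_t get_global_id(unsigned dim) { (void)dim; return current_gid; }

/* "A" as an ordinary function: is pixel i a strict local maximum? */
int is_interest_point(__global const float *image, int i, int n)
{
    return i > 0 && i < n - 1 &&
           image[i] > image[i - 1] && image[i] > image[i + 1];
}

/* "B" as an ordinary function: writes two floats describing pixel i. */
void describe_point(__global const float *image, int i, __global float *out)
{
    out[0] = image[i] - image[i - 1];
    out[1] = image[i] - image[i + 1];
}

/* Kernel C: each work-item tests its own pixel and describes it if it passes. */
__kernel void fused_C(__global const float *image, __global float *desc, int n)
{
    int i = (int)get_global_id(0);
    if (i >= n) return;
    desc[2 * i] = 0.0f;
    desc[2 * i + 1] = 0.0f;
    if (is_interest_point(image, i, n))
        describe_point(image, i, &desc[2 * i]);
}

/* Host emulation of a one-dimensional launch with n work-items. */
void run_C(const float *image, float *desc, int n)
{
    assert(image && desc && n >= 0);
    for (current_gid = 0; current_gid < (size_t)n; ++current_gid)
        fused_C(image, desc, n);
}
```

This fixes A and B at compile time, which is exactly what I wanted to avoid, and it would not cover a detector that needs the whole image processed before any point is known.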
Any ideas? I am almost sure that the first approach is the correct one, but I would like to know whether I can call other kernels (as opposed to local functions) from a given kernel, and how to do it.