A simple question about the code below. It is not real code and does not compile; it is just a simplified example.
We basically have 3 images allocated on the device. The function f(input, output, p) applies a kernel that fills the output with values read from the input, given a parameter p. For instance, f could be a Gaussian smoothing where p would be the variance of the Gaussian kernel.

Code:

    cl_mem image0;
    cl_mem image1;
    cl_mem image2;
    cl_mem buffer;
    ...

    // Step1
    f(image0, image1, 1);
    f(image0, image2, 2);
    g(image1, image2, buffer);

    // Step2
    f(image0, image1, 3);
    f(image0, image2, 4);
    g(image1, image2, buffer);

    // Step3
    f(image0, image1, 5);
    f(image0, image2, 6);
    g(image1, image2, buffer);
The function g takes two images as input and a buffer as output. In g, the kernel analyses the two inputs and writes something to the output buffer. For instance, g could detect the local maxima in both inputs.
Because we apply the "algorithm" 3 times here (3 steps) and because we re-use the same memory at each step (image1 and image2 are overwritten, and buffer grows at each step), I was thinking that maybe I should call clFinish() between steps. I'm afraid that if I don't, step 2 may start before step 1 has finished, so that f in step 2 overwrites image1 or image2 while g in step 1 is still reading them, which would make g in step 1 behave incorrectly.
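To make the idea concrete, here is a minimal sketch of where the blocking call would sit. It is plain C with hypothetical stand-ins so that it compiles without an OpenCL SDK: f and g only record a character in a trace instead of enqueuing kernels, the integer handles replace the cl_mem objects, and sync_queue() is an assumed placeholder for clFinish(queue) in real host code.

```c
#include <assert.h>

/* Hypothetical stand-ins: nothing here talks to a device.
   Each call appends one character to a trace so the order is visible. */
static char trace[64];
static int  trace_len = 0;

static void record(char c)
{
    assert(trace_len < 63);          /* keep room for the terminator */
    trace[trace_len++] = c;
    trace[trace_len] = '\0';
}

/* In real host code these two would enqueue a kernel on the command queue. */
static void f(int input, int output, int p)
{
    (void)input; (void)output; (void)p;
    record('f');
}

static void g(int in_a, int in_b, int out)
{
    (void)in_a; (void)in_b; (void)out;
    record('g');
}

/* Placeholder for clFinish(queue), which blocks until every command
   previously enqueued on that queue has completed. */
static void sync_queue(void)
{
    record('|');
}

/* The three steps from the example, with a blocking call after each step. */
static void run_steps(void)
{
    enum { IMAGE0, IMAGE1, IMAGE2, BUFFER };

    for (int step = 0; step < 3; ++step) {
        f(IMAGE0, IMAGE1, 2 * step + 1);   /* p = 1, 3, 5 */
        f(IMAGE0, IMAGE2, 2 * step + 2);   /* p = 2, 4, 6 */
        g(IMAGE1, IMAGE2, BUFFER);
        sync_queue();                      /* would be clFinish(queue) */
    }
}
```

With this layout, f from the next step cannot be submitted before g from the current step has completed, at the cost of stalling the host once per step.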
What do you think?