Looping kernels produce non-constant timings
Hi OpenCL community,
I would appreciate it if any of you could help me with the following issue. I have a program in which I run the same kernels over and over inside a "for" loop. The pseudo-code of my program is the following:
Initialize OpenCL (devices, queue, kernels, create buffers, set arguments, etc)
rewrite buffers (blocking write, CL_TRUE enabled)
run kernel 1
run kernel n
read output buffer
C functions using the output
Here tic and toc are time measurement functions similar to Matlab's, which I use to profile the performance of my code. I am not using the OpenCL profiling functions because I am working with the Nexus 10 and they are not working properly there.
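For reference, here is a minimal sketch of how tic/toc helpers like these could be written in C on a POSIX system. The names tic and toc and their signatures are hypothetical (they mirror the description above, not any OpenCL or Android API), and the sketch assumes clock_gettime with CLOCK_MONOTONIC is available on the device.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Hypothetical helper: capture a start timestamp on the monotonic clock,
 * which is not affected by wall-clock adjustments. */
static struct timespec tic(void) {
    struct timespec start;
    clock_gettime(CLOCK_MONOTONIC, &start);
    return start;
}

/* Hypothetical helper: milliseconds elapsed since the timestamp from tic(). */
static double toc(struct timespec start) {
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    double ms = (double)(now.tv_sec - start.tv_sec) * 1e3 +
                (double)(now.tv_nsec - start.tv_nsec) / 1e6;
    assert(ms >= 0.0); /* a monotonic clock should never run backwards */
    return ms;
}
```

With helpers like these, a host-side measurement only covers the kernel's execution if the queue is drained (for example with clFinish) before toc is called; otherwise it may only measure the enqueue call.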
My question is the following:
When I plot the times for all the kernels, I observe that they are not relatively constant across iterations, as they should be: the time starts at some value, then randomly jumps to a higher value for some iterations, and then settles at a value between the minimum (the expected one) and the maximum. Does anyone have a hint as to what may be causing this?
I tried replacing clFinish with clFlush, using both, and using neither. Also, when I run only one iteration of the process with the same input that produces the maximum time, it works fine and gives the expected minimum time. Finally, if I add a 100 ms sleep at the end of the loop, the times are constant (at the minimum value) for all the kernels, as they should be.
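The comparison described above (back-to-back iterations versus iterations separated by a pause) can be sketched as a small harness that records the minimum, maximum and mean time per iteration. This is only an illustration under stated assumptions: run_iteration is a hypothetical CPU stand-in for "enqueue kernels 1..n and wait for them", and timing_stats, now_ms and time_loop are names invented for this sketch.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Summary of per-iteration timings, in milliseconds. */
typedef struct { double min_ms, max_ms, mean_ms; } timing_stats;

/* Current time on the monotonic clock, in milliseconds. */
static double now_ms(void) {
    struct timespec t;
    clock_gettime(CLOCK_MONOTONIC, &t);
    return (double)t.tv_sec * 1e3 + (double)t.tv_nsec / 1e6;
}

/* Hypothetical stand-in for one loop body; in the real program this would
 * enqueue the kernels and block until they have finished. */
static void run_iteration(void) {
    volatile double acc = 0.0;
    for (int i = 0; i < 100000; ++i) acc += i * 0.5;
}

/* Time `iterations` calls of `work`, sleeping `pause_ms` after each one
 * (0 means no pause). The pause is not included in the measured time. */
static timing_stats time_loop(void (*work)(void), size_t iterations, long pause_ms) {
    assert(iterations > 0);
    timing_stats s = {0.0, 0.0, 0.0};
    for (size_t i = 0; i < iterations; ++i) {
        double t0 = now_ms();
        work();
        double dt = now_ms() - t0;
        if (i == 0 || dt < s.min_ms) s.min_ms = dt;
        if (i == 0 || dt > s.max_ms) s.max_ms = dt;
        s.mean_ms += dt / (double)iterations;
        if (pause_ms > 0) {
            struct timespec p = { pause_ms / 1000, (pause_ms % 1000) * 1000000L };
            nanosleep(&p, NULL);
        }
    }
    return s;
}
```

Calling time_loop(run_iteration, 100, 0) and time_loop(run_iteration, 100, 100) and comparing max_ms against min_ms for each gives a number for the jumps instead of reading them off a plot.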
Thanks for your time and advice.