Every time setKernelArgs is called, it calls clReleaseMemObject and then clRetainMemObject. This really adds up: 10 setKernelArgs calls take at least 0.2-0.4 ms, which works out to roughly 20-40 µs per call.
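One possible workaround is to skip the call when the same buffer is already bound to that argument index, since OpenCL kernel arguments keep their values until they are changed. The sketch below only illustrates the idea: `CountingRuntime` and `CachedKernelArgs` are hypothetical stand-ins that count calls, not part of any real OpenCL binding, and the release/retain pair per call is modeled on the behavior described above.

```python
class CountingRuntime:
    """Hypothetical stand-in for the runtime; it only counts calls."""

    def __init__(self):
        self.releases = 0
        self.retains = 0

    def set_kernel_arg(self, index, mem):
        # Models the reported behavior: one release followed by one
        # retain on every argument set.
        self.releases += 1
        self.retains += 1


class CachedKernelArgs:
    """Remembers which buffer is bound to each index (assumed helper)."""

    def __init__(self, runtime):
        self.runtime = runtime
        self.bound = {}    # argument index -> buffer currently bound
        self.skipped = 0   # number of redundant sets avoided

    def set_arg(self, index, mem):
        # Skip the runtime call if this exact buffer is already bound.
        if index in self.bound and self.bound[index] is mem:
            self.skipped += 1
            return False
        self.runtime.set_kernel_arg(index, mem)
        self.bound[index] = mem
        return True


# Usage: three launches that reuse the same 10 buffers.
demo_runtime = CountingRuntime()
demo_args = CachedKernelArgs(demo_runtime)
demo_buffers = [object() for _ in range(10)]
for _ in range(3):
    for i, buf in enumerate(demo_buffers):
        demo_args.set_arg(i, buf)
print(demo_runtime.retains, demo_args.skipped)  # prints "10 20"
```

This only helps when the same buffers are reused between launches. If a buffer is released and recreated, or the kernel object is shared between threads, the cache would need to be invalidated.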