Hi,
I have a homework assignment to write an optimized kernel for 2D convolution using OpenCL. I wrote it and it works fine, but I want to apply an optimization called "register tiling", which means I have to use the...
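
For anyone reading along, here is a rough sketch of what "register tiling" usually means for a 2D convolution. It may not match what the assignment expects. It is plain C rather than an OpenCL kernel, so it can be compiled and checked on the host. The names `conv2d_naive`, `conv2d_tiled` and `TILE_W` are made up for illustration, and the code assumes a "valid" output (no padding) and no filter flip. Each pass of the `(y, x0)` loop stands in for one work-item that computes `TILE_W` neighbouring outputs, keeping the partial sums and the current filter value in private variables.

```c
#include <assert.h>

#define TILE_W 4  /* outputs per work-item along x; hypothetical value, tune it */

/* Baseline: one output per "work-item". Every filter value is re-read
 * from memory for every output pixel. */
void conv2d_naive(const float *in, int w, int h,
                  const float *filt, int k, float *out)
{
    assert(k >= 1 && w >= k && h >= k);
    int ow = w - k + 1, oh = h - k + 1;
    for (int y = 0; y < oh; ++y)
        for (int x = 0; x < ow; ++x) {
            float acc = 0.0f;
            for (int fy = 0; fy < k; ++fy)
                for (int fx = 0; fx < k; ++fx)
                    acc += in[(y + fy) * w + (x + fx)] * filt[fy * k + fx];
            out[y * ow + x] = acc;
        }
}

/* Register-tiled: each (y, x0) pass produces TILE_W adjacent outputs.
 * acc[] is small and fixed-size, so a compiler can keep it in registers. */
void conv2d_tiled(const float *in, int w, int h,
                  const float *filt, int k, float *out)
{
    assert(k >= 1 && w >= k && h >= k);
    int ow = w - k + 1, oh = h - k + 1;
    for (int y = 0; y < oh; ++y)
        for (int x0 = 0; x0 < ow; x0 += TILE_W) {
            float acc[TILE_W] = {0.0f};
            for (int fy = 0; fy < k; ++fy) {
                const float *row = in + (y + fy) * w + x0;
                for (int fx = 0; fx < k; ++fx) {
                    float f = filt[fy * k + fx]; /* read once, reused TILE_W times */
                    for (int t = 0; t < TILE_W; ++t)
                        if (x0 + t < ow)         /* guard the partial tile at the right edge */
                            acc[t] += row[fx + t] * f;
                }
            }
            for (int t = 0; t < TILE_W; ++t)
                if (x0 + t < ow)
                    out[y * ow + x0 + t] = acc[t];
        }
}
```

In an actual OpenCL kernel the two outer loops would presumably be replaced by `get_global_id(1)` for `y` and `get_global_id(0) * TILE_W` for `x0`, with the global size along x divided by `TILE_W`. Whether this helps depends on the device and on how many registers the tile uses, so it is worth timing a few values of `TILE_W`.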