Is it possible to have an OpenCL kernel that takes two float arrays of length N as input and writes the results to one output array of dimension NxN? Thanks!