OpenCL performances on NVIDIA GTX 260 and ATI Radeon HD



enzo30980
12-10-2012, 12:23 PM
Hi, I wrote an OpenCL kernel that computes the dot product of two double arrays. This is the code:
__kernel void evaluate_product(__global const double *pFirstArray, const int n,
                               __global const double *pSecondArray, __global double *pOutArray)
{
    int gid = get_global_id(0);
    int size = get_global_size(0);
    if (gid >= 0 && gid < size) {
        double output = 0.0;
        for (int k = 0; k < n; k++)
            output += pLocal[k] * pSecondArray[k];
        pOutArray[gid] = output;
    }
}

Why did this kernel take 30 ms on an NVIDIA GTX 260, while on an ATI Radeon HD 6900 it took less than 10 ms?
Any ideas? Or is there some optimization I could use in the kernel for the NVIDIA card?
Thanks

bmerry
12-13-2012, 03:13 AM
Are you sure that's your code? I don't see how that can compile given that pLocal is never defined. I also don't see how it can be computing a dot product, given that it outputs an array rather than a single value. I'd suggest you search on the internet for a tutorial about dot products in OpenCL, or just use the BLAS libraries that AMD provide (APPML).
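
To illustrate the point about a dot product ending in a single value rather than one output per work-item, here is a minimal sketch of the two-stage reduction commonly used for this. It is written in plain C so it can be run without a device, and it is not the original poster's kernel. The names GROUP_SIZE, group_partial_dot and dot_product are hypothetical; the loops over lid stand in for work-items, and the scratch array stands in for __local memory.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_SIZE 4  /* hypothetical work-group size; must be a power of two */

/* Partial dot product for one emulated work-group.
   Covers elements [start, start + GROUP_SIZE), clipped to n. */
static double group_partial_dot(const double *a, const double *b,
                                size_t start, size_t n)
{
    double scratch[GROUP_SIZE];  /* stands in for __local memory */

    /* Step 1: each "work-item" lid multiplies one pair of elements. */
    for (size_t lid = 0; lid < GROUP_SIZE; lid++) {
        size_t gid = start + lid;
        scratch[lid] = (gid < n) ? a[gid] * b[gid] : 0.0;
    }

    /* Step 2: tree reduction. In a kernel, a barrier would be needed
       after step 1 and after each stride. */
    for (size_t stride = GROUP_SIZE / 2; stride > 0; stride /= 2) {
        for (size_t lid = 0; lid < stride; lid++)
            scratch[lid] += scratch[lid + stride];
    }
    return scratch[0];
}

/* Full dot product: sum the per-group partial results.
   In an OpenCL program this final sum would typically be done on the host
   or in a second, small kernel launch. */
double dot_product(const double *a, const double *b, size_t n)
{
    /* The halving loop above only works for power-of-two group sizes. */
    assert(GROUP_SIZE > 0 && (GROUP_SIZE & (GROUP_SIZE - 1)) == 0);

    double total = 0.0;
    for (size_t start = 0; start < n; start += GROUP_SIZE)
        total += group_partial_dot(a, b, start, n);
    return total;
}
```

In this layout each element is read exactly once, whereas in the kernel as posted every work-item loops over all n elements and writes the same sum. Whether that accounts for the timing difference between the two cards is not something this sketch can show.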