Hello and happy new year to everyone,

I recently started using OpenCL to develop software. As my first program, I took a simple example from a book and modified it. It compiles without errors or warnings, but I do not understand the result.

The kernel should do the following: given an input array of, for example, 128 elements, it should compute the mean and standard deviation of each block of 64 elements and store the standard deviation in the output array at a position derived from the block's mean. So for an array of 128 elements, there should be two standard deviations in the output array.

Unfortunately, when I compile and execute the program there are four values in my output array and I do not understand why.

I set globalWorkSize = 128 and localWorkSize = 64, so the complete array of 128 elements is divided into two work-groups of 64 work-items each, right?
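For a one-dimensional range, the relationship between the two sizes and the IDs each work-item sees can be sketched in plain C. The helper names below are hypothetical, and the sketch assumes a zero global offset and a global size that is an exact multiple of the local size; under those assumptions the three helpers mirror what `get_num_groups(0)`, `get_group_id(0)` and `get_local_id(0)` report.

```c
#include <stddef.h>

/* Hypothetical helpers for a 1-D NDRange with zero global offset.
 * Assumes global_size is an exact multiple of local_size. */

/* Number of work-groups the range is split into. */
size_t num_work_groups(size_t global_size, size_t local_size)
{
    return global_size / local_size;
}

/* Work-group that the work-item with global ID gid belongs to. */
size_t group_id_of(size_t gid, size_t local_size)
{
    return gid / local_size;
}

/* Position of the work-item with global ID gid inside its work-group. */
size_t local_id_of(size_t gid, size_t local_size)
{
    return gid % local_size;
}
```

With a global size of 128 and a local size of 64 this gives two work-groups, but the global ID still runs from 0 to 127, so the kernel body is executed 128 times, once per work-item.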

Here is the kernel I use:

Code :
__kernel void hello_kernel(__global const float *src,
                           __global float *temp,
                           __global float *sigma)
{
    int gid = get_global_id(0);
    int size = 64, i = 0, iweight = 31;
    float mean[1] = {0.0}, stdDev[1] = {0.0};
    float sum = 0.0, sumPow = 0.0;
    float numerator = 0.0, denominator = 0.0;

    /*Compute array start position*/
    const uint start = gid * 64;

    /*Mean and standard deviation*/
    temp[gid] = src[gid];
    for( int i = 0; i < size; i++)
    {
        sum = sum + temp[start + i];
        sumPow = sumPow + temp[start + i] * temp[start + i];
    }

    numerator = (size*sumPow) - (pow(sum, 2.0));
    denominator = (64 * (64-1));
    mean[0] = sum/64;
    i = (int)(round(iweight * mean[0]));
    stdDev[0] = sqrt(numerator / denominator);

    if (stdDev[0] < sigma[i])
        sigma[i] = stdDev[0];
}

My system:

Win 7 32 bit Prof.

GeForce 9600 GT 512MB RAM

Display Driver Version: 280.26

Visual Studio 2010 Prof.

I hope that someone can help me with my problem and thank you very much!!

Wolfheart