Hi guys,

This problem has tortured me for months.

My kernel does a lot of floating-point calculations and should give the same answer on every run. However, about 2 out of 10 runs give slightly different results.
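For context, float addition is not associative, so a result can change if the order of operations changes. Here is a small host-side C sketch of that effect. It is not taken from my kernel, and the names sum_left and sum_right are made up for illustration.

```c
/* Same three operands, two different groupings. */
float sum_left(float a, float b, float c)
{
    return (a + b) + c;   /* a and b are combined first */
}

float sum_right(float a, float b, float c)
{
    return a + (b + c);   /* b and c are combined first */
}
```

With a = 1e20f, b = -1e20f and c = 1.0f, sum_left gives 1.0f, but sum_right gives 0.0f because 1.0f is lost when it is added to -1e20f.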

Currently I use an NVIDIA GeForce 9800 GTX, which is a very old card; we are waiting for a new one. The current card has compute capability 1.0 and only supports single-precision float.

I suspect the problem is caused by some floating-point optimizations, so I pass "-cl-opt-disable" when I build the program: clBuildProgram(program, 0, NULL, "-cl-opt-disable", NULL, NULL);

Now I get -36 (CL_INVALID_COMMAND_QUEUE) when I call clFinish(commandQueue) after the kernel launch.
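To make the error code easier to read, here is a minimal lookup sketch in plain C. The helper name cl_error_name is hypothetical (it is not part of the OpenCL API), and only a handful of codes are listed. The numeric values follow CL/cl.h, where -36 is named CL_INVALID_COMMAND_QUEUE.

```c
/* Hypothetical helper: maps a few OpenCL status codes to their names.
 * Values follow the definitions in CL/cl.h; the list is not complete. */
const char *cl_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -34: return "CL_INVALID_CONTEXT";
    case -36: return "CL_INVALID_COMMAND_QUEUE";
    case -54: return "CL_INVALID_WORK_GROUP_SIZE";
    default:  return "UNKNOWN_CL_ERROR";
    }
}
```

Printing cl_error_name(err) after each OpenCL call shows which call first reports a failure.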

After some tracing, I arrived at a very confusing situation. Below is an inline function in my *.cl file that is called from a __kernel function.

inline int gFindCell(float *part, __global int *conn, __global int *link, __global float *vert, float epsilon, int gues, float *coor) {
    float tetX[4], tetY[4], tetZ[4];

    int cnt = 0;
    while (true) {
        cnt++;
        if (cnt == 1) break;

        /* (critical place) .............. Much code here .................. */
    }

    return 0;
}

After tracing, I found that the kernel gets the -36 error when it calls this function, so I added cnt to find out in which loop iteration the error occurs.

With "if (cnt == 1) break;" in place, the critical place should never be executed and the function should always exit successfully. However, the kernel still gets the -36 error. The strange thing is that if I comment out the whole while loop, or use "cnt = 1" instead of "cnt++", the kernel finishes without error. It feels as if the variable "cnt" were being modified by multiple threads.
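The control flow I expect can be sketched in plain host-side C. This is a stripped-down stand-in, not the real function: find_cell_skeleton and critical_hits are hypothetical names, and the counter only replaces the real loop body.

```c
#include <stdbool.h>

/* Counts how many times the stand-in for the "critical place" runs. */
int critical_hits = 0;

int find_cell_skeleton(void)
{
    int cnt = 0;              /* local variable, one copy per call */
    while (true) {
        cnt++;
        if (cnt == 1) break;  /* taken on the very first iteration */

        critical_hits++;      /* stand-in for the real loop body */
    }
    return 0;
}
```

On the host this returns 0 and critical_hits stays at 0, which is the behaviour I expected from the kernel as well.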

However, when I build the program without -cl-opt-disable, the kernel finishes successfully even with "cnt++". It seems that compiling with -cl-opt-disable makes the code behave strangely.

Could anybody please help me with this problem? It is a nightmare.

Thanks!

Best regards,
Mingcheng Chen
November 2nd, 2012