Hello,

Reaction-diffusion systems are a bit like cellular automata, except that they work on floating point values and use a differential equation to compute the new value. I would have thought this was an ideal system for OpenCL to work on.
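For concreteness, the Gray-Scott model that the kernel below integrates can be written as the following pair of equations. This is a sketch read off from the kernel's update terms: u and v are the two concentrations, and D_u, D_v, F and k correspond to the kernel arguments of the same names.

```latex
\begin{aligned}
\frac{\partial u}{\partial t} &= D_u \nabla^2 u - u v^2 + F (1 - u) \\
\frac{\partial v}{\partial t} &= D_v \nabla^2 v + u v^2 - (F + k)\, v
\end{aligned}
```

The kernel approximates the Laplacian with a 5-point stencil (unit grid spacing is assumed, since there is no division by the cell size) and advances each cell by one forward Euler step of size delta_t.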

Here's an example of one of our OpenCL kernels:

Code :

__kernel void grayscott_compute(
    __global float *U, __global float *V,
    __global float *U2, __global float *V2,
    float k, float F, float D_u, float D_v, float delta_t)
{
    const int x = get_global_id(0);
    const int y = get_global_id(1);
    const int X = get_global_size(0);
    const int Y = get_global_size(1);
    const int i = x*Y + y;

    const float u = U[i];
    const float v = V[i];

    // compute the Laplacians of U and V (indices clamped at the edges)
    const int xm1 = max(x-1, 0);
    const int xp1 = min(x+1, X-1);
    const int ym1 = max(y-1, 0);
    const int yp1 = min(y+1, Y-1);
    const int iLeft  = xm1*Y + y;
    const int iRight = xp1*Y + y;
    const int iUp    = x*Y + ym1;
    const int iDown  = x*Y + yp1;

    // Standard 5-point stencil
    const float nabla_u = U[iLeft] + U[iRight] + U[iUp] + U[iDown] - 4*u;
    const float nabla_v = V[iLeft] + V[iRight] + V[iUp] + V[iDown] - 4*v;

    // compute the new rate of change (Gray-Scott)
    const float delta_u = D_u * nabla_u - u*v*v + F*(1.0f-u);
    const float delta_v = D_v * nabla_v + u*v*v - (F+k)*v;

    // apply the change (to the new buffer)
    U2[i] = u + delta_t * delta_u;
    V2[i] = v + delta_t * delta_v;
}
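For anyone who wants to check the kernel's output on the CPU, here is a minimal single-threaded C sketch of the same update. It is not one of the implementations on the comparison page: the function name grayscott_step_cpu and the clampi helper are made up for illustration, and it assumes the same x*Y+y memory layout and clamped edges as the kernel above.

```c
#include <assert.h>

/* Hypothetical helper: clamp v into [lo, hi]. */
static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical CPU reference: one forward Euler step of Gray-Scott on an
   X-by-Y grid stored as i = x*Y + y, reading U,V and writing U2,V2. */
void grayscott_step_cpu(const float *U, const float *V,
                        float *U2, float *V2,
                        int X, int Y,
                        float k, float F, float D_u, float D_v,
                        float delta_t)
{
    assert(X > 0 && Y > 0);

    for (int x = 0; x < X; x++) {
        for (int y = 0; y < Y; y++) {
            const int i = x * Y + y;
            const float u = U[i];
            const float v = V[i];

            /* neighbour indices, clamped at the edges like the kernel */
            const int iLeft  = clampi(x - 1, 0, X - 1) * Y + y;
            const int iRight = clampi(x + 1, 0, X - 1) * Y + y;
            const int iUp    = x * Y + clampi(y - 1, 0, Y - 1);
            const int iDown  = x * Y + clampi(y + 1, 0, Y - 1);

            /* 5-point stencil Laplacians */
            const float nabla_u = U[iLeft] + U[iRight] + U[iUp] + U[iDown] - 4.0f * u;
            const float nabla_v = V[iLeft] + V[iRight] + V[iUp] + V[iDown] - 4.0f * v;

            /* Gray-Scott rates of change */
            const float delta_u = D_u * nabla_u - u * v * v + F * (1.0f - u);
            const float delta_v = D_v * nabla_v + u * v * v - (F + k) * v;

            /* write to the second buffer, as the kernel does */
            U2[i] = u + delta_t * delta_u;
            V2[i] = v + delta_t * delta_v;
        }
    }
}
```

Running this and the kernel from the same initial buffers for one step and comparing U2 and V2 element by element (allowing for small floating point differences) is one way to confirm that the two agree before comparing speeds.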

Comparing our OpenCL implementations with CPU versions (e.g. using SSE), we're finding that OpenCL doesn't give us the performance we expected. Here's a page comparing the speeds, with links to the different implementations:

http://code.google.com/p/reaction-di...eedComparisons

You can see that we've been trying different things, including image2d_t and float4 (both help). We try to keep all the data on the card over many iterations. Managing local memory by hand made things slower (GrayScott_OpenCL_Local). Using a local work-group size of NDRange local(8,8) helps a lot compared to local(1,1).

Is there something obvious we can do to improve our OpenCL code? Were my expectations that OpenCL would work well for reaction-diffusion wrong?

Thanks,

Tim

http://www.sq3.org.uk - tim.hutton@gmail.com