Thread: Precision problem

  1. #1
    Junior Member
    Join Date
    Apr 2012
    Posts
    6

    Precision problem

    Hi there, the following code is giving me a hard time.

    Code :
        local_density = 0.0;
        for(kk = 0; kk < 9; kk++)
        {
          local_density += tmp_cells[pos].speeds[kk];
        }
     
        u_x = (tmp_cells[pos].speeds[1] + tmp_cells[pos].speeds[5] +
               tmp_cells[pos].speeds[8] - ( tmp_cells[pos].speeds[3] +
                tmp_cells[pos].speeds[6] + tmp_cells[pos].speeds[7]))
              / local_density;
        u_y = (tmp_cells[pos].speeds[2] + tmp_cells[pos].speeds[5] +
               tmp_cells[pos].speeds[6] - ( tmp_cells[pos].speeds[4] +
                tmp_cells[pos].speeds[7] + tmp_cells[pos].speeds[8]))
              / local_density;
        u_sq = u_x * u_x + u_y * u_y;
        u[1] =   u_x      ;
        u[2] =         u_y;
        u[3] = - u_x      ;
        u[4] =       - u_y;
        u[5] =   u_x + u_y;
        u[6] = - u_x + u_y;
        u[7] = - u_x - u_y;
        u[8] =   u_x - u_y;
        t1 = 2.0 * c_sq;
        d_equ[0] = w0 * local_density * (1.0 - u_sq / t1);
        t3 = w1 * local_density;
        t2 = t1 * c_sq;
        t1 = u_sq / t1;
        d_equ[1] = t3 * (1.0 + u[1] / c_sq + (u[1] * u[1]) / t2 - t1);
        d_equ[2] = t3 * (1.0 + u[2] / c_sq + (u[2] * u[2]) / t2 - t1);
        d_equ[3] = t3 * (1.0 + u[3] / c_sq + (u[3] * u[3]) / t2 - t1);
        d_equ[4] = t3 * (1.0 + u[4] / c_sq + (u[4] * u[4]) / t2 - t1);
        t3 = w2 * local_density;
        d_equ[5] = t3 * (1.0 + u[5] / c_sq + (u[5] * u[5]) / t2 - t1);
        d_equ[6] = t3 * (1.0 + u[6] / c_sq + (u[6] * u[6]) / t2 - t1);
        d_equ[7] = t3 * (1.0 + u[7] / c_sq + (u[7] * u[7]) / t2 - t1);
        d_equ[8] = t3 * (1.0 + u[8] / c_sq + (u[8] * u[8]) / t2 - t1);
     
        for(kk = 0; kk < 9; kk++)
        {
          cells[pos].speeds[kk] = (tmp_cells[pos].speeds[kk] + params->omega *
               (d_equ[kk] - tmp_cells[pos].speeds[kk]));
        }

    My problem is that when this code runs under OpenCL it produces different (incorrect) values compared to the serial execution, and after a few executions of the three kernels (this is one of them; each of them just updates the cells and tmp_cells values using some calculations) it ends in a segmentation fault. When I test the other two kernels, the results are correct (the same as the serial execution), so I suspect that only the third kernel (this one), and more specifically this piece of code, is causing the problem.

    I have to say that the code that runs in the serial execution is exactly the same. No difference at all. What am I doing wrong here? Am I missing something?
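
    One way to narrow this down, assuming both the serial results and the OpenCL results can be read back into host arrays of t_speed, is to look for the first value that differs. The sketch below is only an illustration: first_mismatch and rel_tol are hypothetical names, and NSPEEDS is assumed to be 9 to match the loops above.

```c
#include <stddef.h>

#define NSPEEDS 9 /* assumed from the kk < 9 loops */

typedef struct {
  double speeds[NSPEEDS];
} t_speed;

/* Hypothetical helper: absolute value without needing libm. */
static double abs_d(double x)
{
  return x < 0.0 ? -x : x;
}

/* Hypothetical helper: returns the flat index (pos * NSPEEDS + kk) of the
   first value that differs by more than rel_tol, or -1 if all agree. */
long first_mismatch(const t_speed *serial, const t_speed *device,
                    size_t n_cells, double rel_tol)
{
  size_t pos;
  int kk;

  for (pos = 0; pos < n_cells; pos++) {
    for (kk = 0; kk < NSPEEDS; kk++) {
      double a = serial[pos].speeds[kk];
      double b = device[pos].speeds[kk];
      double scale = abs_d(a) > abs_d(b) ? abs_d(a) : abs_d(b);

      /* Use an absolute floor of 1.0 so values near zero do not
         produce an unreasonably tight tolerance. */
      if (scale < 1.0) {
        scale = 1.0;
      }
      /* a != a is true only for NaN, which should count as a mismatch. */
      if (a != a || b != b || abs_d(a - b) > rel_tol * scale) {
        return (long)(pos * NSPEEDS + kk);
      }
    }
  }
  return -1;
}
```

    Dividing the returned index by NSPEEDS gives the cell, and the remainder gives the speed, which shows whether the first difference appears at a particular cell or at a particular speed index.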

    As for the definition of the struct, it is the following:

    Code :
    typedef struct {
      double speeds[NSPEEDS];
    } t_speed;

    One last thing I observed: when I update only one of the speeds in the last loop, more of the results are as expected than when I update all of them. I really can't understand why this happens...

  2. #2
    Junior Member
    Join Date
    Apr 2012
    Posts
    6

    Re: Precision problem

    I also want to add a couple more things that I observed. When the above code (together with the two other kernels) is executed inside the loop, it always gives a segmentation fault at the same point, even when I compile the OpenCL code without optimizations.

    Also, if I comment out the last loop
    Code :
        for(kk = 0; kk < 9; kk++)
        {
          cells[pos].speeds[kk] = (tmp_cells[pos].speeds[kk] + params->omega *
               (d_equ[kk] - tmp_cells[pos].speeds[kk]));
        }
    everything works fine (meaning no segmentation fault).
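
    Since the fault goes away when the writes to cells are removed, one thing that may be worth ruling out is a work item writing past the end of the buffer. This is only a guess: it assumes the global work size is rounded up to a multiple of the work-group size, so that some values of pos are not valid cells. The sketch below shows the guard in plain C; relax_cell, round_up and n_cells are hypothetical names, and in a kernel pos would come from get_global_id(0).

```c
#include <stddef.h>

#define NSPEEDS 9 /* assumed from the kk < 9 loops */

typedef struct {
  double speeds[NSPEEDS];
} t_speed;

/* Hypothetical helper: round n up to the next multiple of group. */
size_t round_up(size_t n, size_t group)
{
  return ((n + group - 1) / group) * group;
}

/* Relaxation step for one work item, written as plain C.
   Returns 1 if the cell was updated, 0 if pos was out of range. */
int relax_cell(t_speed *cells, const t_speed *tmp_cells,
               const double *d_equ, double omega,
               size_t pos, size_t n_cells)
{
  int kk;

  /* Guard: padded work items must not touch the buffers. */
  if (pos >= n_cells) {
    return 0;
  }
  for (kk = 0; kk < NSPEEDS; kk++) {
    cells[pos].speeds[kk] = tmp_cells[pos].speeds[kk] + omega *
         (d_equ[kk] - tmp_cells[pos].speeds[kk]);
  }
  return 1;
}
```

    With 10 cells and a work-group size of 4, round_up gives a global size of 12, so the last two work items would be skipped by the guard instead of writing outside the array.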

  3. #3
    Junior Member
    Join Date
    Apr 2012
    Posts
    6

    Re: Precision problem

    More news.

    When I target the CPU for the execution, everything works great. The results are as expected and I get no segmentation faults. So what is the difference, as far as the code is concerned, between running the OpenCL code on the CPU and on the GPU?
