Hello everyone,

I think the best way to approach this is to attach my code and explain what is going on.

http://www.2shared.com/file/qJPty_V1/hello.html
http://www.2shared.com/file/ShsAHPXx/defs.html
http://www.2shared.com/file/1b97qaQd/BCkernel_test.html (I can't seem to find an attach option on this forum).

What I want to do is parallelize Brandes' betweenness centrality algorithm.

For example, running ./hello_world -parallel -grid 3 4 with two work-groups (6 threads per work-group) constructs the following grid graph:
0 1 2 3
4 5 6 7
8 9 10 11

Initially, sigma is 1 for node 0 and 0 for all other nodes. The kernel computes sigma for all the other nodes in parallel, which means that the sigma of a node becomes the sum of its own sigma and the sigma of the root.

So the sigma of nodes 1 and 4 will be 1, the sigma of node 5 will be 2, etc.
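
For reference, here is a minimal sequential sketch of the sigma values one would expect for this grid. It is written in Python and is not the OpenCL kernel from the links above. It assumes each node is connected only to its horizontal and vertical neighbours and that node 0 is the single starting root; the function name grid_sigma is hypothetical and only used for illustration.

```python
from collections import deque


def grid_sigma(rows, cols, source=0):
    """Count shortest paths (sigma) from `source` to every node of a grid.

    Nodes are numbered row by row, so node id = row * cols + col.
    Assumption: edges exist only between horizontal/vertical neighbours.
    """
    n = rows * cols
    dist = [-1] * n      # -1 marks a node that has not been reached yet
    sigma = [0] * n
    dist[source] = 0
    sigma[source] = 1
    queue = deque([source])

    while queue:
        v = queue.popleft()
        r, c = divmod(v, cols)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                w = nr * cols + nc
                if dist[w] < 0:
                    # First time w is reached: it lies one level below v.
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    # v lies on a shortest path to w, so w inherits v's count.
                    sigma[w] += sigma[v]
    return sigma


sigma = grid_sigma(3, 4)
for row in range(3):
    print(sigma[4 * row:4 * row + 4])
# prints:
# [1, 1, 1, 1]
# [1, 2, 3, 4]
# [1, 3, 6, 10]
```

Comparing the kernel output against these values on the runs that misbehave may help narrow down which level of the traversal goes wrong.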

Now the problem is in the while loop "while(count_priv<nr_roots)". count_priv is initially 0, and roughly once every 4-5 runs of the program, count_priv does not get incremented for one of the work-groups. What happens is that the iteration runs with count_priv = 0, count_priv gets incremented, and yet when the while test (while(count_priv<nr_roots)) is evaluated again the value is still 0. Any ideas why that might be?

Thank you for your time. I hope someone has a solution for me!