Thread: Strange issue when dealing with Bitmasks

  1. #1
    Newbie
    Join Date
    Jan 2014
    Posts
    2

    EDIT:
    Solved


    I am in the process of porting a graph-based genetic algorithm, and I keep running into a strange problem. I generate my chromosomes on the CPU and offload them to the GPU. One step of my fitness function counts how many bits are set to 1 (a set bit indicates that a node is included). When I verify the results on the CPU, the numbers don't match. First I figured my local caching was the issue, so I switched to using global memory, to no avail. Next I figured the integer modulus and division might be the problem, so I re-implemented them using floating-point operations and casts. Still not working. It seems to me that the chromosome isn't being copied properly for gid > 0. Has anyone used bitmasks effectively on the GPU?

    Here, chrome_local is an array declared as:
    uchar chrome_local[CHROM_SIZE_BYTES];

    Code :
    		totalChromOff = gid*popSize*CHROM_SIZE_BYTES+i*CHROM_SIZE_BYTES;
     
    		//copy the chromosome
    		for (int n = 0; n < CHROM_SIZE_BYTES; n++) {
    			chrome_local[n] = InputChroms[totalChromOff+n];
    		}
     
    		sSize = 0;
     
    		//count every bit that is set (i.e. every included node)
    		for (unsigned int item = 0; item < numVerts; item++) {
    			if (!isBitZero(chrome_local, item)) 
    			{
    				sSize++;
    			}
    		}
    		OutputFitness[gid*popSize+i] = (float)sSize;
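
    For checking the kernel output on the CPU, a minimal reference sketch in plain C might look like the following. The body of isBitZero is not shown in this post, so the version here is a hypothetical stand-in that assumes bit 0 is the least significant bit of byte 0; countSetBits is likewise an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for isBitZero(): returns 1 when bit `item` of the
   byte array is 0. Assumes bit 0 is the least significant bit of byte 0;
   the real helper may use a different bit order. */
static int isBitZero(const uint8_t *chrom, unsigned int item)
{
    return ((chrom[item / 8] >> (item % 8)) & 1u) == 0;
}

/* Count the set bits among the first numVerts bits of one chromosome,
   mirroring the loop in the kernel above. */
static unsigned int countSetBits(const uint8_t *chrom, unsigned int numVerts)
{
    unsigned int sSize = 0;
    assert(chrom != NULL);
    for (unsigned int item = 0; item < numVerts; item++) {
        if (!isBitZero(chrom, item)) {
            sSize++;
        }
    }
    return sSize;
}
```

    Running countSetBits over each chromosome on the host and comparing against OutputFitness shows which (gid, i) pairs disagree.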

    As a note: this is my first attempt at using OpenCL, and I love the power. I just need to learn all the tricks of the trade.
    If you have any questions or need more information, please let me know!

    Note: It seems that the first work-item in the group calculates all of its sizes properly, but every other work-item is off. Might this have to do with memory access?

    Also, I've tried copying the chromosomes back to the host after they are written and recalculating there. Everything gets copied correctly. So the problem is either that the conditional is being executed incorrectly for gid > 0, for whatever reason, or that chrome_local doesn't hold the correct data for gid > 0. The address gets calculated properly as far as I know. I'll try eliminating the conditional by using a lookup table. If that doesn't fix it, chrome_local must not be getting copied correctly. Otherwise I guess I'm just crazy.
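
    To make the address calculation concrete, here is a small C sketch of the flat layout the kernel indexes into. It assumes chromosomes are stored back to back, popSize per gid and chromBytes each; chromOffset and chromInBounds are hypothetical helper names, not part of the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of chromosome i belonging to work-item gid, assuming a flat
   layout with popSize chromosomes per work-item and chromBytes per
   chromosome. Algebraically the same as
   gid*popSize*chromBytes + i*chromBytes. */
static size_t chromOffset(size_t gid, size_t i, size_t popSize, size_t chromBytes)
{
    return (gid * popSize + i) * chromBytes;
}

/* Returns 1 when the whole chromosome lies inside a buffer of
   bufferBytes bytes, 0 otherwise. */
static int chromInBounds(size_t gid, size_t i, size_t popSize,
                         size_t chromBytes, size_t bufferBytes)
{
    assert(chromBytes > 0);
    return chromOffset(gid, i, popSize, chromBytes) + chromBytes <= bufferBytes;
}
```

    A check like chromInBounds only holds if the popSize and chromBytes values used in the kernel are the same ones used to size and fill the buffer on the host.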

    Okay, now I've changed the code to use a lookup table, and it still doesn't work...

    Code :
    totalChromOff = (gid*POP_SIZE+i)*CHROM_SIZE_BYTES;	
    sSize = 0;
     
    //copy the chromosome
    for (int n = 0; n < CHROM_SIZE_BYTES; n++) {
    	chrome_loc[n] = InputChroms[totalChromOff+n];
    	sSize += LookupTable[ chrome_loc[n] ];
    }
     
    OutputFitness[gid*POP_SIZE+i] = (float)sSize; //testing
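
    The contents of LookupTable aren't shown here. Assuming it maps each byte value to the number of set bits in that byte, a minimal C sketch for building and using such a table could be the following; buildLookupTable and countWithTable are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill a 256-entry table so that table[b] is the number of set bits in
   the byte value b. Each entry reuses the already computed entry for
   b >> 1 and adds the lowest bit of b. */
static void buildLookupTable(uint8_t table[256])
{
    table[0] = 0;
    for (int b = 1; b < 256; b++) {
        table[b] = (uint8_t)((b & 1) + table[b >> 1]);
    }
}

/* Sum the per-byte bit counts over one chromosome of numBytes bytes. */
static unsigned int countWithTable(const uint8_t *chrom, size_t numBytes,
                                   const uint8_t table[256])
{
    unsigned int sSize = 0;
    assert(chrom != NULL);
    for (size_t n = 0; n < numBytes; n++) {
        sSize += table[chrom[n]];
    }
    return sSize;
}
```

    Note that a per-byte table counts every bit in every byte, so it only agrees with a loop over numVerts bits when any padding bits past numVerts are zero.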

    Here is my host code:
    Code :
    Buffer bufferMyChroms = Buffer(context, CL_MEM_READ_WRITE, CHROM_SIZE_BYTES*numTotalChromosomes * sizeof(cl_uchar));
    ...
    queue.enqueueWriteBuffer(bufferMyChroms, CL_TRUE, 0, sizeChrom*numTotalChromosomes, chromosomes);
    ...
    kernelGA.setArg(1, bufferMyChroms);
    Last edited by NelkQuyiter; 01-26-2014 at 06:23 PM. Reason: more tests