Thread: Kernel compilation problem....

  1. #1
    Junior Member
    Join Date
    Oct 2010
    Posts
    4

    Kernel compilation problem....

    I'm having a weird problem when trying to create a program from source code...

    This code works just fine in the program I've been working on:

    Code :
    const char * source = "__kernel void render(__global float *d_output, __global float *vertTemplate,\
    __global float *normTemplate, __global float *offTemplate,\
    __global unsigned int *cubeCoords,__global unsigned int *cubeCount)\
    {\
    unsigned int x = get_global_id(0);\
    unsigned int y = get_global_id(1);\
    float newCoord = (*(offTemplate+((x*3)+(y%3))))+(*(vertTemplate+y));\
    unsigned int i = ((*(cubeCoords))*x)+y;\
    *(d_output+i) = newCoord;\
    *(d_output+(i+(((*(cubeCount))*(*(cubeCoords)))))) = (*(normTemplate+y));\
    }\
    }";

    but this code (the same exact code but with array subscripts) doesn't work:

    Code :
    const char * source = "__kernel void render(__global float *d_output, __global float *vertTemplate,\
    __global float *normTemplate, __global float *offTemplate,\
    __global unsigned int *cubeCoords,__global unsigned int *cubeCount)\
    {\
    unsigned int x = get_global_id(0);\
    unsigned int y = get_global_id(1);\
    float newCoord = offTemplate[(x*3)+(y%3)]+vertTemplate[y];\
    unsigned int i = (cubeCoords*x)+y;\
    d_output[i] = newCoord;\
    d_output[i+(cubeCount*cubeCoords)] = normTemplate[y];\
    }\
    }";

    The same thing happens when I change other things too, such as replacing unsigned int with uint. I can't use functions like barrier() inside the kernel source either. Any ideas as to why I'm getting these errors?

    BTW, I'm a total beginner with OpenCL. I'm also using Visual C++ 2008 Express Edition.

  2. #2
    Junior Member
    Join Date
    Oct 2010
    Posts
    4

    Re: Kernel compilation problem....

    Minor editing error. I forgot to take out a curly brace that I already fixed before posting.

    }\
    }";

    is supposed to be

    }";

    in both of those kernels, but the same problem persists.

  3. #3
    Member
    Join Date
    Mar 2010
    Location
    Raleigh, NC
    Posts
    55

    Re: Kernel compilation problem....

    A few questions to get us going...

    Are you using AMD's or NVIDIA's OpenCL implementation?

    What error code is the compilation returning?

    Have you attempted to read the build log of the kernel?

  4. #4
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: Kernel compilation problem....

    What HolyGeneralK said plus the following. This code:

    Code :
    *(d_output+(i+(((*(cubeCount))*(*(cubeCoords)))))) = (*(normTemplate+y));\

    Doesn't do the same as:

    Code :
    d_output[i+(cubeCount*cubeCoords)] = normTemplate[y];\

    Do you realize that both cubeCount and cubeCoords are pointers?
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer.
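To make the point about pointers concrete, here is a minimal host-side C++ sketch (not the poster's kernel; the function names are made up for illustration). For a pointer `p`, `p[i]` and `*(p + i)` are the same thing, so a single value reached through a pointer still needs `*p` or `p[0]`. Writing `cubeCount*cubeCoords` tries to multiply two pointers, which does not compile.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the kernel's second index computation,
// written in the original pointer style.
std::size_t offset_pointer_style(const unsigned int* cubeCount,
                                 const unsigned int* cubeCoords,
                                 std::size_t i) {
    assert(cubeCount != nullptr && cubeCoords != nullptr);
    return i + (*cubeCount) * (*cubeCoords);
}

// The same computation in subscript style: the single values are
// read with [0], never by using the bare pointer names.
std::size_t offset_subscript_style(const unsigned int* cubeCount,
                                   const unsigned int* cubeCoords,
                                   std::size_t i) {
    assert(cubeCount != nullptr && cubeCoords != nullptr);
    return i + cubeCount[0] * cubeCoords[0];
}
```

Both functions return the same offset for the same inputs, which is what the subscript version of the kernel needs in order to match the pointer version.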

  5. #5
    Junior Member
    Join Date
    Oct 2010
    Posts
    4

    Re: Kernel compilation problem....

    Quote Originally Posted by HolyGeneralK
    Are you using AMD's or NVIDIA's OpenCL implementation?
    Not sure what you're asking here, but I'll give it my best shot:
    • I have an NVIDIA GeForce GT 240 graphics card
    • I am only including the cl.h header file along with glew.h, glut.h, and SDL.h
    • NOT using the "ocl" API's in the oclUtils.h header file which I've heard is NVIDIA related
    • I downloaded the current header files I'm using from http://www.khronos.org/registry/cl/ under the OpenCL 1.0 section.


    Quote Originally Posted by HolyGeneralK
    What error code is the compilation returning?
    CL_INVALID_PROGRAM_EXECUTABLE on the clCreateKernel() API

    Quote Originally Posted by HolyGeneralK
    Have you attempted to read the build log of the kernel?
    I tried re-creating the problem after reading this question, but as I was re-creating the problem, I stumbled on a syntax error in my kernel... I was using the address when I should have used the value the address pointed to.

    Example: When I was removing the pointer notation from all the arrays I also removed the * before one of my pointers. So *(cubeCoords) became cubeCoords.

    As for not being able to use barrier(), I was looking at an outdated PDF (or a PDF with a typo) that said to use barrier(GLOBAL_MEM_FENCE) instead of barrier(CLK_GLOBAL_MEM_FENCE). And uint works just fine now.

    I'm now using a 3D NDRange to calculate the vertex positions for all cube vertices on a 16 X 16 grid of cubes and it's finally starting to work the way I want it to. Only problem I'm having now is that only an 8X8 grid of cubes is showing up instead of a 16X16 grid. I'm sure I'll be able to figure it out though, I've gotten this far in less than a week.

  6. #6
    Junior Member
    Join Date
    Oct 2010
    Posts
    4

    Re: Kernel compilation problem....

    Figured it out and I now have a 16X16 grid of cubes. Just had to change two lines of code:

    From:
    unsigned int x = get_global_id(0);
    unsigned int y = get_global_id(1);

    To:
    int x = (int)get_global_id(0);
    int y = (int)get_global_id(1);


    Thanks guys for your help and your time
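The post does not say why switching from unsigned int to int fixed the grid. One plausible explanation, offered only as an assumption, is that some index arithmetic went below zero: unsigned subtraction wraps around to a very large value instead of going negative. The helper names below are hypothetical and do not come from the poster's kernel.

```cpp
// Hypothetical helper: centre a grid index around zero with unsigned math.
// When x < half the subtraction wraps around instead of going negative.
float centered_unsigned(unsigned int x, unsigned int half) {
    return static_cast<float>(x - half);
}

// The same computation with signed integers keeps the negative result.
float centered_signed(int x, int half) {
    return static_cast<float>(x - half);
}
```

With x = 3 and half = 8, the signed version gives -5, while the unsigned version gives a value in the billions, which would place a vertex far outside the visible area.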

  8. #8

    Re: Kernel compilation problem....

    Hi friends,

    I am a beginner in OpenCL.

    I am trying to sum a list of numbers, but I am getting an error:

    Error: clBuildProgram(-11)

    Please look at my code and help me solve it.

    Code :
    #include <iostream>
    #include <cstdlib>
    #include <fstream>
    #include <string>
    //#include <CL/cl.h>
    #include <stdio.h>
    #include <stdlib.h>

    #ifdef __APPLE__
    #include <OpenCL/opencl.h>
    #else
    #include <CL/cl.h>
    #endif

    #define MAT_SIZE 4096

    using namespace std;

    void err_check( int err, string err_code ) {
        if ( err != CL_SUCCESS ) {
            cout << "Error: " << err_code << "(" << err << ")" << endl;
            exit(-1);
        }
    }

    int main()
    {
        cl_platform_id platform_id = NULL;
        cl_device_id device_id = NULL;
        cl_context context = NULL;
        cl_command_queue command_queue = NULL;
        cl_program program = NULL;
        cl_kernel kernel = NULL;
        cl_uint ret_num_devices;
        cl_uint ret_num_platforms;
        cl_int err;
        float mat_a[ MAT_SIZE ];
        for ( cl_int i = 0; i < MAT_SIZE; i++ ) {
            mat_a[i] = i;
        }

        // Step 01: Get platform/device information
        err = clGetPlatformIDs( 1, &platform_id, &ret_num_platforms );
        err_check( err, "clGetPlatformIDs" );

        // Step 02: Get information about the device
        err = clGetDeviceIDs( platform_id, CL_DEVICE_TYPE_DEFAULT, 1, &device_id, &ret_num_devices );
        err_check( err, "clGetDeviceIDs" );

        // Step 03: Create OpenCL Context
        context = clCreateContext( NULL, 1, &device_id, NULL, NULL, &err );
        err_check( err, "clCreateContext" );

        // Step 04: Create Command Queue
        command_queue = clCreateCommandQueue( context, device_id, 0, &err );
        err_check( err, "clCreateCommandQueue" );

        // Step 05: Create memory objects and transfer the data to memory buffer
        cl_mem idata, odata;
        idata = clCreateBuffer( context, CL_MEM_READ_ONLY, MAT_SIZE * sizeof(float), NULL, &err );
        err = clEnqueueWriteBuffer( command_queue, idata, CL_TRUE, 0, MAT_SIZE * sizeof(float), mat_a, 0, NULL, NULL );
        err_check( err, "clEnqueueWriteBuffer" );

        odata = clCreateBuffer(context, CL_MEM_READ_WRITE, ( (MAT_SIZE/256)*sizeof(float) ), NULL, &err);

        // Step 06: Read kernel file
        ifstream file("par_sum_kernel.cl");
        string prog( istreambuf_iterator<char>( file ), ( istreambuf_iterator<char>() ) );
        const char *source_str = prog.c_str();

        // Step 07: Create Kernel program from the read in source
        program = clCreateProgramWithSource( context, 1, (const char **) &source_str, 0, &err );
        err_check( err, "clCreateProgramWithSource" );

        // Step 08: Build Kernel Program
        err = clBuildProgram( program, 1, &device_id, NULL, NULL, NULL );
        err_check( err, "clBuildProgram" );

        // Step 09: Create OpenCL Kernel
        kernel = clCreateKernel( program, "sum", &err );
        err_check( err, "clCreateKernel" );

        // Step 10: Set OpenCL kernel argument
        err = clSetKernelArg( kernel, 0, sizeof( cl_mem ), (void *) &idata );
        err_check( err, "clSetKernelArg" );

        err = clSetKernelArg( kernel, 1, sizeof( cl_mem ), (void *) &odata );
        err_check( err, "clSetKernelArg" );

        // Step 11: Execute OpenCL kernel in data parallel
        size_t GWsize[] = { MAT_SIZE, 1, 1 };
        size_t LWsize[] = { 256, 1, 1 };
        clEnqueueNDRangeKernel( command_queue, kernel, 1, NULL, GWsize, LWsize, 0, 0, 0 );
        err_check( err, "clEnqueueNDRangeKernel" );

        //-----------------
        err = clSetKernelArg( kernel, 0, sizeof( cl_mem ), (void *) &idata );
        err_check( err, "clSetKernelArg" );

        err = clSetKernelArg( kernel, 1, sizeof( cl_mem ), (void *) &odata );
        err_check( err, "clSetKernelArg" );
        LWsize[0] = (MAT_SIZE/256);
        GWsize[0] = 1;

        clEnqueueNDRangeKernel( command_queue, kernel, 1, NULL, GWsize, LWsize, 0, 0, 0 );
        err_check( err, "clEnqueueNDRangeKernel" );

        //--------------------
        // Step 12: Read (Transfer result) from the memory buffer
        float mat_b[LWsize[0]];
        err = clEnqueueReadBuffer( command_queue, odata, CL_TRUE, 0, MAT_SIZE * sizeof(float), mat_b, 0, NULL, NULL );

        // Step 13: Free objects
        err = clFlush( command_queue );
        err = clFinish( command_queue );
        err = clReleaseKernel( kernel );
        err = clReleaseProgram( program );
        err = clReleaseMemObject( idata );
        err = clReleaseMemObject( odata );

        err = clReleaseCommandQueue( command_queue );
        err = clReleaseContext( context );

        // Display result
        cout<<mat_b[0];

        return 0;
    }

    I think there is an error in the kernel file, but I am not able to find it. Could someone please help?

    Code :
    __kernel void sum( __global float *idata, __global float *odata )
    {
        int gid = get_global_id(0);
        int lid = get_local_id(0);
        int bid = get_group_id(0);

        __local float sdata[get_num_groups(0)];
        sdata[lid] = idata[gid];
        barrier(CLK_LOCAL_MEM_FENCE);

        for( int dist = get_local_size(0); dist>0; dist/=2 )
        {
            if(lid < dist){
                sdata[lid] += sdata[lid + dist];
                barrier(CLK_LOCAL_MEM_FENCE);
            }
        }
        if(lid == 0)
            odata[bid] += sdata[0];
    }
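Separately from the build error, the host code above creates odata with (MAT_SIZE/256) floats but later asks clEnqueueReadBuffer for MAT_SIZE * sizeof(float) bytes, which is more than the buffer holds. The second launch also uses a global size of 1 with a local size of 16, and in OpenCL 1.x the global size has to be a multiple of the local size. A small sketch of the size arithmetic follows; the struct and function names are hypothetical and not part of the OpenCL API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: how many work-groups a 1D launch produces
// and how many bytes one float partial sum per group needs.
struct ReductionLayout {
    std::size_t num_groups;
    std::size_t partial_bytes;
};

ReductionLayout reduction_layout(std::size_t global_size, std::size_t local_size) {
    // The global size must divide evenly into work-groups.
    assert(local_size > 0 && global_size % local_size == 0);
    const std::size_t groups = global_size / local_size;
    return ReductionLayout{ groups, groups * sizeof(float) };
}
```

For the posted sizes (4096 elements, work-groups of 256) this gives 16 groups, so the output buffer, the host array that receives it, and the read call would all use 16 * sizeof(float) bytes.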

  9. #9
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: Kernel compilation problem....

    The -11 error code value you get from clBuildProgram is called CL_BUILD_PROGRAM_FAILURE. That usually means that the source code you passed to clCreateProgramWithSource() is invalid.

    Can you show us the contents of par_sum_kernel.cl?
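For readers decoding the numbers in this thread, here is a small lookup sketch for a few OpenCL status codes. The numeric values are the ones defined in cl.h, but the function itself is a hypothetical helper and covers only a handful of codes. The compiler's actual messages for a -11 failure come from clGetProgramBuildInfo with CL_PROGRAM_BUILD_LOG, which this sketch does not call.

```cpp
#include <string>

// Hypothetical helper: map a few OpenCL status codes to their cl.h names.
// Only a small subset is listed; anything else falls through to the default.
std::string cl_error_name(int err) {
    switch (err) {
        case 0:   return "CL_SUCCESS";
        case -1:  return "CL_DEVICE_NOT_FOUND";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -30: return "CL_INVALID_VALUE";
        case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
        case -46: return "CL_INVALID_KERNEL_NAME";
        case -54: return "CL_INVALID_WORK_GROUP_SIZE";
        default:  return "unknown OpenCL error";
    }
}
```

A helper like this could be called from an error checker such as the err_check() function posted above, so the message shows the name next to the number.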

  10. #10

    Re: Kernel compilation problem....

    Hello David,

    Thank you very much.

    As you asked, here is par_sum_kernel.cl:

    Code :
    __kernel void sum( __global float *idata, __global float *odata )
    {
        int gid = get_global_id(0);
        int lid = get_local_id(0);
        int bid = get_group_id(0);

        __local float sdata[get_num_groups(0)];
        sdata[lid] = idata[gid];
        barrier(CLK_LOCAL_MEM_FENCE);

        for( int dist = get_local_size(0); dist>0; dist/=2 )
        {
            if(lid < dist){
                sdata[lid] += sdata[lid + dist];
                barrier(CLK_LOCAL_MEM_FENCE);
            }
        }
        if(lid == 0)
            odata[bid] += sdata[0];
    }
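A few things in this kernel stand out even without the build log. OpenCL C does not allow variable-length arrays, so a __local array sized by get_num_groups(0) is a likely cause of the -11 build failure. The loop also starts dist at the full local size, so sdata[lid + dist] reads past the data the group loaded. Finally, a barrier placed inside if(lid < dist) is not reached by every work-item in the group, which the specification requires. As a rough illustration of the intended tree reduction, here is a CPU-side emulation of a single work-group in C++. It is a sketch under the assumption of a power-of-two group size, not a replacement kernel, and the function name is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU emulation of one work-group's tree reduction over "local memory".
// sdata plays the role of the __local array, one element per work-item.
float reduce_group(std::vector<float> sdata) {
    // Assumption: the group size is a non-zero power of two.
    assert(!sdata.empty() && (sdata.size() & (sdata.size() - 1)) == 0);

    // Start at half the group size so lid + dist stays inside the array.
    for (std::size_t dist = sdata.size() / 2; dist > 0; dist /= 2) {
        // Each emulated work-item with lid < dist adds its partner's value.
        for (std::size_t lid = 0; lid < dist; ++lid) {
            sdata[lid] += sdata[lid + dist];
        }
        // In a real kernel the barrier would sit here, outside the if,
        // so that every work-item in the group reaches it.
    }
    return sdata[0];
}
```

In a kernel, the local array would be given a compile-time constant size, or be passed as a __local pointer argument whose size is set from the host with clSetKernelArg and a NULL argument value.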
