Thread: OpenCl and Linux

  1. #1
    Junior Member
    Join Date
    Jan 2011
    Posts
    8

    OpenCl and Linux

    Hi

    I am new to writing OpenCL programs, so I have a first question. I want to write a hello world program on Linux on a laptop that has only an Intel GMA chip, but I want to run it on a PC with an Nvidia card. How do I install the OpenCL framework on my laptop? I want to use the normal gcc compiler for development. Does somebody know a good tutorial that describes the installation?

    Thanks for your help.

    Best regards

    Harald

  2. #2

    Re: OpenCl and Linux

    Get the OpenCL SDK from AMD (look around at http://developer.amd.com/zones/OpenC...s/default.aspx) and make sure that your context/device handling is as portable as possible. There is plenty of documentation, and there are plenty of examples, on how to do this. It should then be as simple as running the application on any OpenCL-enabled machine.

  3. #3
    Junior Member
    Join Date
    Jan 2011
    Posts
    8

    Re: OpenCl and Linux

    Hi

    I am now using the AMD OpenCL implementation and have set the right environment variables, and I am able to compile the code with:

    gcc helloworld.c -o helloworld -lOpenCL

    But when I start the program I always get the error "Failed to create a device group!" (CL_INVALID_PLATFORM). I tried the program on a laptop with an Intel GMA and on a laptop with an NVIDIA card, but I got the error on both machines. The code is at the end of this post.

    Best regards
    Harald

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <math.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <CL/opencl.h>

    #define DATA_SIZE (1024)



    int main(int argc, char** argv)
    {
        int err; // error code returned from api calls

        float data[DATA_SIZE]; // original data set given to device
        float results[DATA_SIZE]; // results returned from device
        unsigned int correct; // number of correct results returned

        size_t global; // global domain size for our calculation
        size_t local; // local domain size for our calculation

        cl_platform_id platform_id;

        cl_device_id device_id; // compute device id
        cl_context context; // compute context
        cl_command_queue commands; // compute command queue
        cl_program program; // compute program
        cl_kernel kernel; // compute kernel

        cl_mem input; // device memory used for the input array
        cl_mem output; // device memory used for the output array

        // Fill our data set with random float values
        //
        int i = 0;
        unsigned int count = DATA_SIZE;
        for(i = 0; i < count; i++)
            data[i] = rand() / (float)RAND_MAX;

        // Connect to a compute device
        //
        int gpu = 1;

        clGetPlatformIDs(1, &platform_id, NULL);

        err = clGetDeviceIDs(platform_id, gpu ? CL_DEVICE_TYPE_GPU : CL_DEVICE_TYPE_CPU, 1, &device_id, NULL); //
        if (err != CL_SUCCESS)
        {
            if(err == CL_INVALID_PLATFORM)
                printf("CL_INVALID_PLATFORM\n");
            if(err == CL_INVALID_DEVICE_TYPE)
                printf("CL_INVALID_DEVICE_TYPE\n");
            if(err == CL_INVALID_VALUE)
                printf("CL_INVALID_VALUE\n");
            if(err == CL_DEVICE_NOT_FOUND)
                printf("CL_DEVICE_NOT_FOUND\n");
            printf("Error: Failed to create a device group!\n");
            return EXIT_FAILURE;
        }

    }

  4. #4
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: OpenCl and Linux

    The code you posted is looking for AMD GPUs that support OpenCL in your system and it is not finding any. Why don't you try searching for CPU devices instead? That way you can run and debug your application on your Intel-based computer and later run it on your NVidia-based computer.

    For starters I would simply change this line of code:

    Code :
    err = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_ALL, 1, &device_id, NULL);
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer. LinkedIn profile.

  5. #5

    Re: OpenCl and Linux

    In addition to what David said, clGetPlatformIDs can give you both an error code and the number of platforms available. You can do something like the following.

    Code :
    cl_uint numPlatforms = 0;
    cl_int status;
    status = clGetPlatformIDs(1, &platform_id, &numPlatforms);
    if(status != CL_SUCCESS)
    {
      // Error handling here.
    }
    if(numPlatforms == 0)
    {
      // Error handling here.
    }
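To make the status from clGetPlatformIDs or clGetDeviceIDs easier to read in a log, something along these lines might help. It is only a sketch: cl_error_name is a made-up helper, not part of the OpenCL API, and the fallback #defines are there only so the snippet compiles without the SDK installed (the values are the ones from CL/cl.h; with a real SDK you would include <CL/cl.h> and the guard is skipped).

```c
/* Fallback definitions so the sketch compiles without the OpenCL headers.
   Include <CL/cl.h> before this point and the guard below is skipped. */
#ifndef CL_SUCCESS
#define CL_SUCCESS               0
#define CL_DEVICE_NOT_FOUND     -1
#define CL_INVALID_VALUE        -30
#define CL_INVALID_DEVICE_TYPE  -31
#define CL_INVALID_PLATFORM     -32
#endif

/* Hypothetical helper: maps the error codes discussed in this thread to
   their names. Anything else falls through to a generic string. */
static const char *cl_error_name(int err)
{
    switch (err) {
    case CL_SUCCESS:             return "CL_SUCCESS";
    case CL_DEVICE_NOT_FOUND:    return "CL_DEVICE_NOT_FOUND";
    case CL_INVALID_VALUE:       return "CL_INVALID_VALUE";
    case CL_INVALID_DEVICE_TYPE: return "CL_INVALID_DEVICE_TYPE";
    case CL_INVALID_PLATFORM:    return "CL_INVALID_PLATFORM";
    default:                     return "unknown OpenCL error";
    }
}
```

In the error branches above you could then write something like printf("clGetPlatformIDs failed: %s\n", cl_error_name(status)); instead of a chain of if statements.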

  6. #6
    Junior Member
    Join Date
    Jan 2011
    Posts
    8

    Re: OpenCl and Linux

    I am now using AMD Stream on my laptop. Compilation works fine and I can start the program, but I get the following error and I don't know how to solve it:

    Id of the platform: 1

    OpenCL demo application started!
    clCreateContext
    clCreateCommandQueue
    clCreateProgramWithSource
    Error: Failed to build program executable!
    /tmp/OCLSIKDpf.cl(2): warning: explicit type is missing ("int" assumed)
    __kernel square(
    ^

    /tmp/OCLSIKDpf.cl(2): error: kernel must return void
    __kernel square(
    ^

    1 error detected in the compilation of "/tmp/OCLSIKDpf.cl".

    Best regards

    Harald

    Code :
    //
    // File:       hello.c
    //
    // Abstract:   A simple "Hello World" compute example showing basic usage of OpenCL which
    //             calculates the mathematical square (X[i] = pow(X[i],2)) for a buffer of
    //             floating point values.
    //             
     
    ////////////////////////////////////////////////////////////////////////////////
     
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <math.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <CL/opencl.h>
     
    ////////////////////////////////////////////////////////////////////////////////
     
    // Use a static data size for simplicity
    //
    #define DATA_SIZE (1024)
     
    ////////////////////////////////////////////////////////////////////////////////
     
    // Simple compute kernel which computes the square of an input array 
    //
    const char *KernelSource = "\n" \
    "__kernel square(                                                       \n" \
    "   __global float* input,                                              \n" \
    "   __global float* output,                                             \n" \
    "   const unsigned int count)                                           \n" \
    "{                                                                      \n" \
    "   int i = get_global_id(0);                                           \n" \
    "   if(i < count)                                                       \n" \
    "       output[i] = input[i] * input[i];                                \n" \
    "}                                                                      \n" \
    "\n";
     
    ////////////////////////////////////////////////////////////////////////////////
     
     
     
    int main(int argc, char** argv)
    {
        int err;                            // error code returned from api calls
     
        float data[DATA_SIZE];              // original data set given to device
        float results[DATA_SIZE];           // results returned from device
        unsigned int correct;               // number of correct results returned
     
        size_t global;                      // global domain size for our calculation
        size_t local;                       // local domain size for our calculation
     
        cl_platform_id platform_id;
        cl_uint num_id; 
     
        cl_device_id device_id;             // compute device id 
        cl_context context;                 // compute context
        cl_command_queue commands;          // compute command queue
        cl_program program;                 // compute program
        cl_kernel kernel;                   // compute kernel
     
        cl_mem input;                       // device memory used for the input array
        cl_mem output;                      // device memory used for the output array
     
        // Fill our data set with random float values
        //
        int i = 0;
        unsigned int count = DATA_SIZE;
        for(i = 0; i < count; i++)
            data[i] = rand() / (float)RAND_MAX;
     
        // Connect to a compute device
        //
        int gpu = 1;
     
        err = clGetPlatformIDs(1, &platform_id, &num_id);
        if(err != CL_SUCCESS)
        {
            printf("Failed to get the ID of the platform (%i)\n", num_id);
            return EXIT_FAILURE;
        }
        printf("Id of the platform: %i\n",num_id);
     
        err = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_ALL, 1, &device_id, NULL); // gpu ? CL_DEVICE_TYPE_GPU : CL_DEVICE_TYPE_CPU
        if (err != CL_SUCCESS)
        {
            if(err == CL_INVALID_PLATFORM)
                printf("CL_INVALID_PLATFORM\n");
            if(err == CL_INVALID_DEVICE_TYPE)
                printf("CL_INVALID_DEVICE_TYPE\n");
            if(err == CL_INVALID_VALUE)
                printf("CL_INVALID_VALUE\n");
            if(err == CL_DEVICE_NOT_FOUND)
                printf("CL_DEVICE_NOT_FOUND\n");
            printf("Error: Failed to create a device group!\n");
            return EXIT_FAILURE;
        }
        printf("\nOpenCL demo application started!\n"); 
     
        // Create a compute context
        //
        context = clCreateContext(0, 1, &device_id, NULL, NULL, &err);
        if (!context)
        {
            printf("Error: Failed to create a compute context!\n");
            return EXIT_FAILURE;
        }
     
        printf("clCreateContext\n");
     
        // Create a command commands
        //
        commands = clCreateCommandQueue(context, device_id, 0, &err);
        if (!commands)
        {
            printf("Error: Failed to create a command commands!\n");
            return EXIT_FAILURE;
        }
     
        printf("clCreateCommandQueue\n");
     
        // Create the compute program from the source buffer
        //
        program = clCreateProgramWithSource(context, 1, (const char **) & KernelSource, NULL, &err);
        if (!program)
        {
            printf("Error: Failed to create compute program!\n");
            return EXIT_FAILURE;
        }
     
        printf("clCreateProgramWithSource\n");
     
        // Build the program executable
        //
        err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
        if (err != CL_SUCCESS)
        {
            size_t len;
            char buffer[2048];
     
            printf("Error: Failed to build program executable!\n");
            clGetProgramBuildInfo(program, device_id, CL_PROGRAM_BUILD_LOG, sizeof(buffer), buffer, &len);
            printf("%s\n", buffer);
            exit(1);
        }
     
        printf("clBuildProgram\n");
     
        // Create the compute kernel in the program we wish to run
        //
        kernel = clCreateKernel(program, "square", &err);
        if (!kernel || err != CL_SUCCESS)
        {
            printf("Error: Failed to create compute kernel!\n");
            exit(1);
        }
     
        printf("clCreateKernel\n");
     
        // Create the input and output arrays in device memory for our calculation
        //
        input = clCreateBuffer(context,  CL_MEM_READ_ONLY,  sizeof(float) * count, NULL, NULL);
        output = clCreateBuffer(context, CL_MEM_WRITE_ONLY, sizeof(float) * count, NULL, NULL);
        if (!input || !output)
        {
            printf("Error: Failed to allocate device memory!\n");
            exit(1);
        }    
     
        printf("clCreateBuffer\n");
     
        // Write our data set into the input array in device memory 
        //
        err = clEnqueueWriteBuffer(commands, input, CL_TRUE, 0, sizeof(float) * count, data, 0, NULL, NULL);
        if (err != CL_SUCCESS)
        {
            printf("Error: Failed to write to source array!\n");
            exit(1);
        }
     
        printf("clEnqueueWriteBuffer\n");
     
        // Set the arguments to our compute kernel
        //
        err = 0;
        err  = clSetKernelArg(kernel, 0, sizeof(cl_mem), &input);
        err |= clSetKernelArg(kernel, 1, sizeof(cl_mem), &output);
        err |= clSetKernelArg(kernel, 2, sizeof(unsigned int), &count);
        if (err != CL_SUCCESS)
        {
            printf("Error: Failed to set kernel arguments! %d\n", err);
            exit(1);
        }
     
        // Get the maximum work group size for executing the kernel on the device
        //
        err = clGetKernelWorkGroupInfo(kernel, device_id, CL_KERNEL_WORK_GROUP_SIZE, sizeof(local), &local, NULL);
        if (err != CL_SUCCESS)
        {
            printf("Error: Failed to retrieve kernel work group info! %d\n", err);
            exit(1);
        }
     
        // Execute the kernel over the entire range of our 1d input data set
        // using the maximum number of work group items for this device
        //
        global = count;
        err = clEnqueueNDRangeKernel(commands, kernel, 1, NULL, &global, &local, 0, NULL, NULL);
        if (err)
        {
            printf("Error: Failed to execute kernel!\n");
            return EXIT_FAILURE;
        }
     
        // Wait for the command commands to get serviced before reading back results
        //
        clFinish(commands);
     
        // Read back the results from the device to verify the output
        //
        err = clEnqueueReadBuffer( commands, output, CL_TRUE, 0, sizeof(float) * count, results, 0, NULL, NULL );  
        if (err != CL_SUCCESS)
        {
            printf("Error: Failed to read output array! %d\n", err);
            exit(1);
        }
     
        // Validate our results
        //
        correct = 0;
        for(i = 0; i < count; i++)
        {
            if(results[i] == data[i] * data[i])
                correct++;
        }
     
        // Print a brief summary detailing the results
        //
        printf("Computed '%d/%d' correct values!\n", correct, count);
     
        // Shutdown and cleanup
        //
        clReleaseMemObject(input);
        clReleaseMemObject(output);
        clReleaseProgram(program);
        clReleaseKernel(kernel);
        clReleaseCommandQueue(commands);
        clReleaseContext(context);
     
        return 0;
    }
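A side note on the work sizes in the listing above: in OpenCL 1.x, clEnqueueNDRangeKernel fails with CL_INVALID_WORK_GROUP_SIZE when the global size is not a multiple of the local size. With DATA_SIZE at 1024 and a power-of-two work group size this usually works out, but for other sizes the global size can be padded. The following is a minimal sketch under that assumption; round_up_global is a hypothetical helper name, not an OpenCL function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the smallest multiple of local that is
   greater than or equal to count. */
static size_t round_up_global(size_t count, size_t local)
{
    assert(local > 0);  /* a zero work group size makes no sense here */
    return ((count + local - 1) / local) * local;
}
```

It would be used as global = round_up_global(count, local); before the clEnqueueNDRangeKernel call. The if(i < count) check in the kernel is what keeps the padded work-items from touching memory past the end of the buffers.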

  7. #7
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: OpenCl and Linux

    You forgot the return type of kernel square. As of today all kernel functions must return void, so the correct function declaration would be "__kernel void square(...)".

    This is what your program source should look like:

    Code :
    // Simple compute kernel which computes the square of an input array 
    //
    const char *KernelSource = "\n" \
    "__kernel void square(                                                       \n" \
    "   __global float* input,                                              \n" \
    "   __global float* output,                                             \n" \
    "   const unsigned int count)                                           \n" \
    "{                                                                      \n" \
    "   int i = get_global_id(0);                                           \n" \
    "   if(i < count)                                                       \n" \
    "       output[i] = input[i] * input[i];                                \n" \
    "}                                                                      \n" \
    "\n";
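For checking the results on the host, it can help to think of the kernel as the body of an ordinary loop. Below is a plain-C stand-in, as a sketch only: square_reference is a made-up name, and each loop iteration plays the role of one work-item, with i standing in for get_global_id(0).

```c
#include <stddef.h>

/* Plain-C equivalent of the square kernel. The loop condition does the
   job of the if(i < count) guard in the kernel. */
static void square_reference(const float *input, float *output, size_t count)
{
    for (size_t i = 0; i < count; i++)
        output[i] = input[i] * input[i];
}
```

Running this over the same data array gives values to compare against the results buffer read back from the device.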

  8. #8

    Re: OpenCl and Linux

    I've always glossed over the return value and set it to "void", but is there a way to check the return value of a kernel? Not sure how this would be very useful, curious nonetheless...

  9. #9
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: OpenCl and Linux

    I've always glossed over the return value and set it to "void", but is there a way to check the return value of a kernel?
    All kernel functions must return void. If they are declared with a different return type the compiler should fail compilation.
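Since a kernel cannot return anything, the usual way to get a status back to the host is to write it into a buffer, just like the results. Here is a rough sketch of that idea. The kernel name square_checked, the extra status argument, and the NaN check are all made up for illustration and are not part of the program posted earlier.

```c
/* Hypothetical kernel source: every work-item writes a status flag for its
   own element into an extra output buffer (1 if the input was NaN, else 0). */
static const char *CheckedKernelSource =
    "__kernel void square_checked(                        \n"
    "   __global const float* input,                      \n"
    "   __global float* output,                           \n"
    "   __global int* status,                             \n"
    "   const unsigned int count)                         \n"
    "{                                                    \n"
    "   size_t i = get_global_id(0);                      \n"
    "   if (i < count) {                                  \n"
    "       output[i] = input[i] * input[i];              \n"
    "       status[i] = isnan(input[i]) ? 1 : 0;          \n"
    "   }                                                 \n"
    "}                                                    \n";
```

On the host side this would need a third buffer from clCreateBuffer passed as argument 2 (with count moving to argument 3), and the flags would be read back with clEnqueueReadBuffer in the same way as the results.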
