Results 1 to 5 of 5

Thread: Kernel output not being saved in host RAM

  1. #1
    Junior Member
    Join Date
    Jan 2013
    Posts
    3

    Kernel output not being saved in host RAM

    Hi,

    I am trying out OpenCL on my nVidia GTX660. It looks great, but I am having an issue with CL_MEM_USE_HOST_PTR buffers. I use this flag on all my buffers : it works on input buffers but my kernel cannot write the output data in RAM.

    I tried to use clEnqueueReadBuffer() after kernel execution and it works so what's going on ? Does this mean that the buffer is in the GPU RAM ? Why ?
    Here are my relevant functions.

    Code :
    void OCLInterface::LoadKernel(string file, size_t size)
    {
            char* source;
            size_t fileSize;
     
            source = LoadSource(file.c_str(), &fileSize);
            program = clCreateProgramWithSource(context, 1, (const char **)&source, &fileSize, &err);
            free(source);
            CHK();
     
            err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
            CHK();
            kernel = clCreateKernel(program, file.c_str(), &err);
            CHK();
     
            currentSize = size;
            currentArgCount = 0;
    }
     
    void OCLInterface::AddParam(float* data, bool bInput)
    {
            cl_mem buffer = clCreateBuffer(
                    context,
                    (bInput ? CL_MEM_READ_ONLY:CL_MEM_WRITE_ONLY) | CL_MEM_USE_HOST_PTR,
                    VSIZE * sizeof(float),
                    data,
                    &err
            );
            CHK();
            err = clSetKernelArg(kernel, currentArgCount, sizeof(cl_mem), (void*)&buffer);
            CHK();
            buffers.push_back(buffer);
            currentArgCount++;
    }
     
    void OCLInterface::Exec(int size)
    {
            size_t localsize = size;
            size_t globalsize = ceil(VSIZE / (float)localsize) * localsize;
            err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &globalsize, &localsize, 0, NULL, NULL);
            CHK();
            err = clFinish(queue);
            CHK();
    }
     
    // My code
     
    float* A = (float*)malloc(VSIZE * sizeof(float));
    float* B = (float*)malloc(VSIZE * sizeof(float));
    float* C = (float*)malloc(VSIZE * sizeof(float));
    // snip : array init (a loop)
     
    openCL.LoadKernel("add", VSIZE);
    openCL.AddParam(A, true);
    openCL.AddParam(B, true);
    openCL.AddParam(C, false);
    openCL.Exec(128);
    At this point C[] is still at its initialization value (0 in my case but depending on the array init code).
    And now my kernel, which should do a simple vector addition.

    Code :
    __kernel void add(
    	__global const float* input1,
    	__global const float* input2,
    	__global float* output)
    {
        unsigned int index = get_global_id(0);
        output[index] = input1[index] + input2[index];
    }

    Thanks !

  2. #2
    Senior Member
    Join Date
    Oct 2012
    Posts
    112

    Re: Kernel output not being saved in host RAM

    You always have to use clEnqueueReadBuffer(), even with CL_MEM_USE_HOST_PTR buffers.

    As stated in the specification: "OpenCL implementations are allowed to cache the buffer contents pointed to by host_ptr in device memory. This cached copy can be used when kernels are executed on a device."

    There is no automatic synchronization between host memory and device memory with CL_MEM_USE_HOST_PTR buffers, so a clEnqueueReadBuffer() or clEnqueueMapBuffer() is necessary to flush the device cache to the host buffer.

  3. #3
    Junior Member
    Join Date
    Jan 2013
    Posts
    3

    Re: Kernel output not being saved in host RAM

    Thank you for this explanation !
    If I read your quote right, the host buffer is cached onto the GPU RAM, then (if used as an output) written but not transferred back.

    If the host buffer is modified, will its contents be cached again on the GPU or not ?

  4. #4
    Senior Member
    Join Date
    Oct 2012
    Posts
    112

    Re: Kernel output not being saved in host RAM

    If you write directly in the host memory of the buffer, you'll then have to use clEnqueueWriteBuffer() to update the buffer (or you can instead write to host memory inside a clEnqueueMapBuffer() ... clEnqueueUnmapMemObject() sequence).

  5. #5
    Junior Member
    Join Date
    Jan 2013
    Posts
    3

    Re: Kernel output not being saved in host RAM

    I went with memory pinning as you said, it works great. Thank you !

Similar Threads

  1. Running kernel on host vs device
    By jimmyz500 in forum OpenCL
    Replies: 7
    Last Post: 12-11-2012, 08:32 AM
  2. Replies: 7
    Last Post: 05-29-2011, 07:11 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •