Thread: clBuffers, synchronised memory and flags

  1. #1
    Junior Member
    Join Date
    Nov 2009
    Posts
    12

    clBuffers, synchronised memory and flags

    Hi there,

    I'm testing my application with NVIDIA's OpenCL Visual Profiler, and I've noticed that the memory buffer I allocate for the device using this code:

    //_bins
    _cmDevBins = cl::Buffer(_context,CL_MEM_WRITE_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(cl_int) * ndX*64*ndY*64*ndZ*64,outBI,&err);

    is being transferred twice: it is written to device memory and then read back from device memory on every kernel iteration.
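
    For a sense of how much data each redundant copy moves, the size expression in the snippet above can be evaluated on the host. The following is only a sketch: the helper name binsByteCount is hypothetical, it assumes ndX, ndY and ndZ are non-negative counts, and it uses std::int32_t as a stand-in for cl_int (which OpenCL defines as a 32-bit signed integer) so that it compiles without the OpenCL headers.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not from the original post): bytes needed for a buffer
// holding one 32-bit integer per cell of a (ndX*64) x (ndY*64) x (ndZ*64) grid.
// Throws std::overflow_error if the product does not fit in std::size_t.
std::size_t binsByteCount(std::size_t ndX, std::size_t ndY, std::size_t ndZ)
{
    const std::size_t factors[] = { ndX, 64, ndY, 64, ndZ, 64 };

    // Assumption: std::int32_t has the same size as cl_int (4 bytes).
    std::size_t bytes = sizeof(std::int32_t);

    for (std::size_t f : factors) {
        // Check before multiplying so the overflow is detected, not wrapped.
        if (f != 0 && bytes > std::numeric_limits<std::size_t>::max() / f) {
            throw std::overflow_error("bins buffer size overflows size_t");
        }
        bytes *= f;
    }
    return bytes;
}
```

    With ndX = ndY = ndZ = 1 this gives 1,048,576 bytes (1 MiB), so the buffer grows by 1 MiB for every unit of ndX*ndY*ndZ. If the profiler reading is right, that amount moves in each direction on every kernel launch.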

    Now, I noticed that the Apple guide (http://developer.apple.com/mac/library/ ... penCL.html) mentions that host and device memory will be kept synchronised when using CL_MEM_USE_HOST_PTR. I assume this is also the case for CL_MEM_COPY_HOST_PTR, and that this is what I'm seeing here: the buffer is being synchronised automatically, which is why it is copied in before and copied back after each kernel execution.

    However, for my application I don't want the two to be synchronised. I only want the memory to be written to the GPU; I never want it to be read back. What flags do I need to set when creating the buffer?

    Thanks in advance.

  2. #2
    Junior Member
    Join Date
    Nov 2009
    Posts
    12

    Re: clBuffers, synchronised memory and flags

    Sorry, I probably haven't made myself very clear, so here is a simplified version:

    So far I've only learnt to use OpenCL with Buffer objects to transfer data between the host and the GPU. However, as a buffer maps host memory to device memory, the memory is copied to and from the GPU before and after a kernel runs.

    I simply want to put some data into the global address space on the GPU without using a buffer object, as I want it to be a one-time transfer, but I'm unsure how to do this.

  3. #3

    Re: clBuffers, synchronised memory and flags

    Not sure if this answers your question, but if you simply allocate your buffer without CL_MEM_COPY_HOST_PTR and use clEnqueueWriteBuffer to move the data to the GPU, it should never be copied back out of the GPU.
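
    A minimal sketch of that approach, written against the plain OpenCL C API (clCreateBuffer, clEnqueueWriteBuffer, clReleaseMemObject) rather than the C++ wrapper used in the first post. The helper name createAndUploadOnce and its parameters are hypothetical, the context and command queue are assumed to exist already, and CL_MEM_READ_WRITE is an assumption about how the kernel uses the buffer. Note that the OpenCL access flags describe how kernels access the buffer, so CL_MEM_WRITE_ONLY means "kernels only write to it" and does not by itself prevent the runtime from copying data back. Whether this actually removes the extra transfers on a given driver would need to be checked in the profiler.

```cpp
#include <cstddef>

#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

// Hypothetical helper: allocate a device buffer with no host pointer attached,
// then fill it with a single explicit, blocking host-to-device write.
// Returns nullptr and sets *err on failure; the caller releases the buffer
// with clReleaseMemObject when it is no longer needed.
cl_mem createAndUploadOnce(cl_context context,
                           cl_command_queue queue,
                           const cl_int* hostData,
                           std::size_t count,
                           cl_int* err)
{
    const std::size_t bytes = count * sizeof(cl_int);

    // No CL_MEM_COPY_HOST_PTR / CL_MEM_USE_HOST_PTR, so host_ptr must be null.
    // CL_MEM_READ_WRITE is an assumption; pick the flag matching kernel access.
    cl_mem buffer = clCreateBuffer(context, CL_MEM_READ_WRITE, bytes, nullptr, err);
    if (*err != CL_SUCCESS) {
        return nullptr;
    }

    // Blocking write (CL_TRUE): returns once hostData has been copied,
    // so the host array can be reused or freed immediately afterwards.
    *err = clEnqueueWriteBuffer(queue, buffer, CL_TRUE, 0, bytes, hostData,
                                0, nullptr, nullptr);
    if (*err != CL_SUCCESS) {
        clReleaseMemObject(buffer);
        return nullptr;
    }
    return buffer;
}
```

    The buffer returned here can be passed to clSetKernelArg for as many kernel launches as needed. Nothing is read back unless the host explicitly calls clEnqueueReadBuffer or maps the buffer.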

  4. #4
    Senior Member
    Join Date
    Jul 2009
    Location
    Northern Europe
    Posts
    311

    Re: clBuffers, synchronised memory and flags

    I think this is a bug in the NVIDIA driver. It should only do the copy once. CL_MEM_COPY_HOST_PTR should just copy the data the host pointer refers to when the buffer is created, not keep referencing it or keep it in sync.

  5. #5
    Junior Member
    Join Date
    Nov 2009
    Posts
    12

    Re: clBuffers, synchronised memory and flags

    So setting CL_MEM_COPY_HOST_PTR results in the buffer being copied to and from the GPU on each run. I actually don't need to use clEnqueueWriteBuffer to copy the data over, as it seems to be copied over for me.

    Every single buffer object gets copied back to the host regardless of any memory settings.

    This is a screenshot of the profiler output. You can see that the first three memcpy operations are from Host to Device/Async (note that Async refers to texture memory, as these are image objects). After linerMetricAddtion runs, there are 4 memory copies from D/A to Host (the 4th is a parameter I don't copy over to the device at any point, but it is a global memory buffer).

    It would be incredibly annoying if this is a driver bug, considering NVIDIA have not released an OpenCL driver update yet. I may have to revert to CUDA.

