Thread: "clReleaseMemObject" too long

  1. #1
    Junior Member
    Join Date
    Mar 2010
    Posts
    2

    "clReleaseMemObject" too long

    Hi,

    I am currently implementing an algorithm in OpenCL on a 9800 GT. Profiling with the OpenCL Visual Profiler shows that "clReleaseMemObject" takes as much time as "clEnqueueReadBuffer" for the same memory object. I only want to free the GPU memory to reclaim space; I don't need to read the data back.
    Do you know why "clReleaseMemObject" takes so much time?
    Is it an Nvidia issue, or does the same happen on ATI GPUs?
    Do you know a faster way to deallocate the memory?

    Thanks a lot.

  2. #2
    Member
    Join Date
    Nov 2009
    Location
    Scotland
    Posts
    72

    Re: "clReleaseMemObject" too long

    When you say it takes as much time as "clEnqueueReadBuffer", do you mean a blocking or a non-blocking read?

  3. #3
    Senior Member
    Join Date
    Jul 2009
    Location
    Northern Europe
    Posts
    311

    Re: "clReleaseMemObject" too long

    clReleaseMemObject should be instant. (All it does is decrement the reference count of the memory object by one.) However, this does not guarantee that the memory is freed right away. That depends on whether other things are still using the memory object (executing kernels, copies, etc.) and on when the resource manager for the device gets around to freeing it. If clReleaseMemObject is not instant, then there is a performance bug in the driver you're using. Which vendor's driver is this?
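
    The behaviour described above can be sketched with a toy model. This is not the real OpenCL implementation or any driver's code; the type ToyMemObject and the toy_* functions are hypothetical names invented for illustration. The sketch only shows the idea that a release drops one reference, while the storage is reclaimed once no references remain and no queued command still uses the object.

```cpp
#include <cassert>

// Toy stand-in for a cl_mem object. Hypothetical; NOT how a driver stores it.
struct ToyMemObject {
    int  ref_count        = 1;     // creating the object gives the caller one reference
    int  pending_commands = 0;     // queued kernels/copies that still use the object
    bool storage_freed    = false; // true once the (pretend) device memory is reclaimed
};

// Reclaim storage only when nobody holds a reference and no command uses it.
inline void toy_try_free(ToyMemObject& m) {
    if (m.ref_count == 0 && m.pending_commands == 0) {
        m.storage_freed = true;
    }
}

// Models enqueueing a kernel or copy that touches the object.
inline void toy_enqueue_use(ToyMemObject& m) { ++m.pending_commands; }

// Models that command finishing on the device.
inline void toy_command_done(ToyMemObject& m) {
    assert(m.pending_commands > 0);
    --m.pending_commands;
    toy_try_free(m);
}

// Models a release: a cheap decrement that returns immediately.
inline void toy_release(ToyMemObject& m) {
    assert(m.ref_count > 0);
    --m.ref_count;
    toy_try_free(m);
}
```

    In this model, releasing an object while a command is still pending returns at once and leaves storage_freed false; the storage is only marked freed when toy_command_done runs afterwards.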

  4. #4
    Junior Member
    Join Date
    Mar 2010
    Posts
    2

    Re: "clReleaseMemObject" too long

    Sorry for the late reply.

    The issue disappeared with the new CUDA 3.0 driver from Nvidia, so it was evidently a driver problem.

    Thanks for your help!

  5. #5
    Junior Member
    Join Date
    Jul 2012
    Posts
    1

    Re: "clReleaseMemObject" too long

    I'm experiencing the same issue.

    I'm using the C++ bindings (cl.hpp and the cl::Buffer class defined in it), but that should make essentially no difference.

    (i) If I call clReleaseMemObject right after a series of kernel executions, it takes 0.17 seconds.

    (ii) If I call a blocking clEnqueueReadBuffer between the kernels and clReleaseMemObject, clEnqueueReadBuffer takes 0.17 seconds and clReleaseMemObject takes only negligible time (1e-5 seconds).

    So apparently clReleaseMemObject waits for the kernels to finish.
    Do you have any ideas?

    CUDA version: 4.2
    CL_DEVICE_NAME: GeForce 210
    CL_DEVICE_VERSION: OpenCL 1.0 CUDA
    CL_DRIVER_VERSION: 295.41
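
    One way to check where the 0.17 seconds go is to time the wait and the release separately. The sketch below is only a timing helper; seconds_taken is a hypothetical name, not part of OpenCL or cl.hpp. The commented usage assumes that a command queue named queue and a buffer named buffer already exist, and it relies on clFinish, which blocks until all commands previously queued on that queue have completed.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Hypothetical helper (not an OpenCL function): wall-clock seconds spent in f().
template <class F>
double seconds_taken(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(f)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Intended use, assuming `queue` (cl_command_queue) and `buffer` (cl_mem) exist:
//   double t_wait    = seconds_taken([&] { clFinish(queue); });            // drain the queue
//   double t_release = seconds_taken([&] { clReleaseMemObject(buffer); }); // then release
```

    If t_wait absorbs the delay and t_release becomes negligible, that would match observation (ii): the time belongs to the kernels still in flight rather than to the release itself.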
