
Thread: Time of clReleaseMemObject : strange behaviour

  1. #1
    Junior Member
    Join Date
    Jul 2009
    Posts
    11

    Time of clReleaseMemObject : strange behaviour

    Hi,
    I am seeing strange timing behaviour with the function clReleaseMemObject.
    First example
    Code :
    /* 50 * 500000 floats, zero-initialised on the host */
    float *pfTest = (float *)calloc(50 * 500000, sizeof(float));
    /* clCreateBuffer returns a cl_mem handle, not a pointer to one */
    cl_mem clTest = clCreateBuffer(Context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, sizeof(cl_float) * 500000 * 50, pfTest, NULL);
    clReleaseMemObject(clTest);
    In this case clReleaseMemObject takes 0.01 s.
    Second example
    Code :
    float *pfTest = (float *)calloc(50 * 500000, sizeof(float));
    cl_mem clTest = clCreateBuffer(Context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, sizeof(cl_float) * 500000 * 50, pfTest, NULL);
    cl_mem clTest2 = clCreateBuffer(Context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, sizeof(cl_float) * 500000 * 50, pfTest, NULL);
    /* Here I run a kernel with clTest and clTest2 as arguments (for example clTest[iID] = clTest2[iID]) */
    clReleaseMemObject(clTest);
    In the second case clReleaseMemObject takes 0.2 s.
    Why does running a kernel on clTest make clReleaseMemObject take so much longer?
    Thx
    J

  2. #2
    Senior Member
    Join Date
    Jul 2009
    Location
    Northern Europe
    Posts
    311

    Re: Time of clReleaseMemObject : strange behaviour

    Performance issues like this depend entirely on the implementation. You'll have to ask the vendor who provided your SDK.

    However, I can imagine several things that might cause this. If you just create a buffer the implementation doesn't have to move it to the device. If you execute the kernel it does. So if you execute a kernel and then release it, the implementation may have to clean up in two places (on the host and the device) whereas if you do not execute a kernel it might only have to clean up on the host. Of course this is completely dependent on the vendor's implementation.
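    One way to test guesses like these is to time each step on its own. Kernel launches are enqueued asynchronously, so if the kernel is still running when the release is timed, the measurement might include the kernel itself. Below is a minimal sketch of such a timing harness. The helper names `now_seconds` and `time_step` are hypothetical, and the two steps are stand-ins so the sketch runs without an OpenCL SDK; in the real program they would wrap clFinish(queue) and clReleaseMemObject(clTest).

```c
#include <assert.h>
#include <time.h>

/* Wall-clock time in seconds, using C11 timespec_get. */
static double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical helper: run one step and return how long it took. */
static double time_step(void (*step)(void *), void *arg)
{
    assert(step != NULL);
    double start = now_seconds();
    step(arg);
    return now_seconds() - start;
}

/* Stand-in steps (assumption: no OpenCL SDK at hand).
 * In the real program they would wrap:
 *   wait_for_kernel -> clFinish(queue);            wait for queued work
 *   release_buffer  -> clReleaseMemObject(clTest); release only
 * Here they just bump a counter so the call order can be checked. */
static void wait_for_kernel(void *arg) { *(int *)arg += 1; }
static void release_buffer(void *arg)  { *(int *)arg += 10; }
```

    Timing wait_for_kernel first and release_buffer second separates "time spent waiting for the kernel" from "time spent in the release". If the release is still slow after the wait, the cost is more likely in the implementation's cleanup than in the kernel.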

  3. #3
    Junior Member
    Join Date
    Jul 2009
    Posts
    11

    Re: Time of clReleaseMemObject : strange behaviour

    Hi,
    If I follow your idea that the memory is released both on the host and on the device:
    why does freeing the device memory after launching a kernel take 0.19 s (= 0.20 - 0.01)? That is huge.
    I use the OpenCL 1.0 release candidate from NVIDIA (GTX 285).
    Thx
    J

  4. #4
    Junior Member
    Join Date
    Jul 2009
    Posts
    20

    Re: Time of clReleaseMemObject : strange behaviour

    Quote Originally Posted by dbs2
    Performance issues like this depend entirely on the implementation. You'll have to ask the vendor who provided your SDK.

    However, I can imagine several things that might cause this. If you just create a buffer the implementation doesn't have to move it to the device.
    This is implementation defined, isn't it? I was considering the time spent on releasing the buffer objects. In cases where you are in a loop and have to call the kernel multiple times, isn't it better to create the buffer object once outside the loop, and then do a WriteBuffer to the same buffer every time inside the loop with new data? The only issue is if one wants to change the buffer size across iterations: since a buffer can't be resized, you would have to release the buffer in every iteration and recreate it with the new size.
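    The reuse idea can be sketched as follows. This is only an illustration under stated assumptions: `ReusableBuffer` and its functions are hypothetical names, and malloc/free stand in for clCreateBuffer/clReleaseMemObject so the sketch runs without an OpenCL SDK. It is also a variation on the above: the buffer is recreated only when it has to grow, which assumes the kernel receives the element count as a separate argument and can ignore unused space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper around one long-lived buffer. */
typedef struct {
    void  *mem;          /* stand-in for a cl_mem handle */
    size_t capacity;     /* bytes currently allocated */
    int    allocations;  /* how many times the buffer was (re)created */
} ReusableBuffer;

/* Make sure the buffer can hold `bytes`; recreate it only when too small.
 * Returns 0 on success, -1 if the allocation failed. */
static int reusable_buffer_reserve(ReusableBuffer *b, size_t bytes)
{
    assert(b != NULL);
    if (bytes <= b->capacity)
        return 0;                /* big enough: reuse, no release needed */
    free(b->mem);                /* real code: clReleaseMemObject(...) */
    b->mem = malloc(bytes);      /* real code: clCreateBuffer(...) */
    if (b->mem == NULL) {
        b->capacity = 0;
        return -1;
    }
    b->capacity = bytes;
    b->allocations++;
    return 0;
}

/* Release the buffer once, after the loop. */
static void reusable_buffer_destroy(ReusableBuffer *b)
{
    assert(b != NULL);
    free(b->mem);
    b->mem = NULL;
    b->capacity = 0;
}
```

    Inside the loop, each iteration would call reusable_buffer_reserve with the size it needs and then write the new data into the existing buffer (clEnqueueWriteBuffer in real code). The release cost is then paid only when the buffer grows and once at the end.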

  5. #5
    Junior Member
    Join Date
    Jul 2009
    Posts
    11

    Re: Time of clReleaseMemObject : strange behaviour

    What's the link with my question?
    I create the buffers, run 1 kernel, and do 1 release.
    The release takes 0.2 s (size: 50 * 500 000 * sizeof(float)) with one kernel, whereas it takes 0.01 s without the kernel!
    Run the test on your card and you will see this bug.
    Edit: maybe your answer was meant for viewtopic.php?f=28&t=1978 and you replied in the wrong thread...

  6. #6
    Junior Member
    Join Date
    Jul 2009
    Posts
    11

    Re: Time of clReleaseMemObject : strange behaviour

    Hi,
    Does nobody else find that clReleaseMemObject takes a lot of computation time?
    It's strange that I'm the only person with this problem.
    Example:
    my code on 1 core of an Intel Xeon CPU: 90 s
    with a GTX 285 and OpenCL: 2.5 s (but 1.2 s of those 2.5 s is spent in clReleaseMemObject, which is very strange)
    Help
    Thx
    J

  7. #7
    Senior Member
    Join Date
    Jul 2009
    Location
    Northern Europe
    Posts
    311

    Re: Time of clReleaseMemObject : strange behaviour

    Jonathan,

    I do find it strange that clReleaseMemObject() takes so long, but I doubt this is a function of OpenCL per se. This sounds very much like a performance bug with the specific OpenCL implementation you are using. I would suggest filing a report with the vendor who provided your OpenCL implementation to see if they can reproduce it and resolve it.

    You might also try to narrow it down by trying another device (for example try running on the CPU, if the vendor supports it) or another vendor's implementation and seeing if you encounter the same problem. This would help you narrow it down to the vendor, the device, or the machine. I'm sorry I can't be of more help.

  8. #8
    Junior Member
    Join Date
    Jul 2009
    Posts
    11

    Re: Time of clReleaseMemObject : strange behaviour

    Hi dbs2,
    Thanks for your help
    I use the NVIDIA OpenCL SDK 1.0.
    However, this SDK does not support CPU computation at the moment.
    I will try the AMD SDK on the CPU (as soon as it is available).
    I think this is a problem with the NVIDIA SDK.
    I just remembered that an OpenCL SDK is currently provided only by NVIDIA, so it's complicated to change my SDK...
    I have tried with another card and it's the same problem.
    Moreover, most people on this forum use the NVIDIA SDK, so most people should have the same problem with clReleaseMemObject.
    And in the NVIDIA forum for OpenCL, it's impossible to get an answer from a specialist in the implementation of the OpenCL SDK...
    Maybe you should divide this forum into 3 parts (AMD, NVIDIA and Mac OS X)...

  9. #9
    Senior Member
    Join Date
    Jul 2009
    Location
    Northern Europe
    Posts
    311

    Re: Time of clReleaseMemObject : strange behaviour

    Jonathan,
    I would suggest trying this on a Mac with Snow Leopard and a 285 if you can. That will give you a good data point for determining whether this is a problem with Nvidia's CL implementation or with your code. (I realize this may be difficult...)

  10. #10
    Junior Member
    Join Date
    Jul 2009
    Posts
    11

    Re: Time of clReleaseMemObject : strange behaviour

    Hi,
    In fact you will see the same problem with the vectorAdd example in the SDK.
    If you look at the profiler you will see many memcpyDtoH calls during the release of the memory.
    Moreover, with large vectors the time to release the command queue is huge!
    Kernel time: 0.18 s
    Release command queue: 0.6 s
    I think it's a bug in the NVIDIA driver.
    Thx
    J
