
Thread: OpenCL/OpenGL Problems: clEnqueue{Acquire|Release}GLObjects

  1. #1
    Junior Member
    Join Date
    Nov 2011
    Posts
    5

    OpenCL/OpenGL Problems: clEnqueue{Acquire|Release}GLObjects

    Hi,
    We are working on a rendering prototype that shares 2D and 3D textures with OpenCL. The volume texture is roughly 125 MiB in size. We ran into a problem with the clEnqueueAcquireGLObjects and clEnqueueReleaseGLObjects calls: each takes about 15 ms (about 30 ms combined!).

    This is unacceptable. We suspect that OpenCL internally duplicates the texture memory and copies the data to and from OpenGL. When we acquire only small 2D OpenGL resources, the calls take up very little frame time.

    Here is how we run the kernel:

    Code :
    std::vector<cl::Memory> acq;
     
    acq.push_back(*_output_cl_image);
    acq.push_back(*vdata->volume_image());
    acq.push_back(*vdata->color_alpha_image());
     
    int arg_count = 0;
    cl_error = _ray_cast_kernel->setArg(arg_count++, *_output_cl_image); assert(!cl_error_string(cl_error).empty());
    cl_error = _ray_cast_kernel->setArg(arg_count++, *vdata->volume_image()); assert(!cl_error_string(cl_error).empty());
    cl_error = _ray_cast_kernel->setArg(arg_count++, *vdata->color_alpha_image()); assert(!cl_error_string(cl_error).empty());
    cl_error = _ray_cast_kernel->setArg(arg_count++, *vdata->volume_uniform_buffer()); assert(!cl_error_string(cl_error).empty());
     
    cl_error = context->cl_command_queue()->enqueueAcquireGLObjects(&acq); assert(!cl_error_string(cl_error).empty());
    cl_error = context->cl_command_queue()->enqueueNDRangeKernel(*_ray_cast_kernel, ::cl::NullRange, global_range, local_range, 0, 0); 
    cl_error = context->cl_command_queue()->enqueueReleaseGLObjects(&acq); assert(!cl_error_string(cl_error).empty());

    Here is the test code we used to measure the acquire and release times:

    Code :
    _acquire_timer.start();
    cl_error = context->cl_command_queue()->enqueueAcquireGLObjects(&acq); assert(!cl_error_string(cl_error).empty());
    context->cl_command_queue()->finish();
    _acquire_timer.stop();
     
    //cl_error = context->cl_command_queue()->enqueueNDRangeKernel(*_ray_cast_kernel, ::cl::NullRange, global_range, local_range, 0, 0); assert(!cl_error_string(cl_error).empty());
     
    _release_timer.start();
    cl_error = context->cl_command_queue()->enqueueReleaseGLObjects(&acq); assert(!cl_error_string(cl_error).empty());
    context->cl_command_queue()->finish();
    _release_timer.stop();

    These are the two read-only images used by the kernel:

    Code :
        _volume_image.reset(new cl::Image3DGL(*device->cl_context(), CL_MEM_READ_ONLY,
                                                voldata->volume_raw()->object_target(), 0,
                                                voldata->volume_raw()->object_id(), &cl_error));
        _color_alpha_image.reset(new cl::Image2DGL(*device->cl_context(), CL_MEM_READ_ONLY,
                                                   voldata->color_alpha_map()->object_target(), 0,
                                                   voldata->color_alpha_map()->object_id(), &cl_error));

    This is the single write-only image used by the kernel:

    Code :
            _output_cl_image.reset(new cl::Image2DGL(*device->cl_context(), CL_MEM_WRITE_ONLY,
                                                     _output_texture->object_target(), 0,
                                                     _output_texture->object_id(), &cl_error));

    As said, we suspect that the OpenCL implementation copies the OpenGL resources into its own memory. Can anyone confirm whether this is really happening, and whether it can or will be fixed in future implementations? As it stands, this is sadly not usable for us...

    We are trying this on Nvidia GeForce 480/580 hardware using r285 drivers.

  2. #2
    Senior Member
    Join Date
    Sep 2002
    Location
    Santa Clara
    Posts
    105

    Re: OpenCL/OpenGL Problems: clEnqueue{Acquire|Release}GLObje

    I'm assuming you created the CL image from the GL texture using the clCreateFromGLTexture{2D|3D} API. clEnqueueAcquireGLObjects / clEnqueueReleaseGLObjects should not do a copy if both CL and GL are on the same GPU. Is there only one GPU in the system where you are encountering this issue? If so, I recommend you report the problem to the vendor of the implementation you are running on.

  3. #3
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: OpenCL/OpenGL Problems: clEnqueue{Acquire|Release}GLObje

    15 ms is way, way too long for 128 MB of data anyway, so it can't just be because of a redundant copy.

    Also try a clFinish() before timing the release: you're timing the kernel run time too.

  4. #4
    Junior Member
    Join Date
    Nov 2011
    Posts
    5

    Re: OpenCL/OpenGL Problems: clEnqueue{Acquire|Release}GLObje

    Quote Originally Posted by affie
    I'm assuming you created the CL image from the GL texture using the clCreateFromGLTexture{2D|3D} API. clEnqueueAcquireGLObjects / clEnqueueReleaseGLObjects should not do a copy if both CL and GL are on the same GPU. Is there only one GPU in the system where you are encountering this issue? If so, I recommend you report the problem to the vendor of the implementation you are running on.
    Yes, it is just a single GPU, which is why I am so shocked by this issue. I was under the impression that OpenCL just _shares_ the resources.

    Quote Originally Posted by notzed
    15 ms is way, way too long for 128 MB of data anyway, so it can't just be because of a redundant copy.

    Also try a clFinish() before timing the release: you're timing the kernel run time too.
    The kernel was commented out for the measurements. I also ran the test with the kernel included and an additional clFinish() after it; the results were the same.

  5. #5
    Junior Member
    Join Date
    Dec 2009
    Posts
    17

    Re: OpenCL/OpenGL Problems: clEnqueue{Acquire|Release}GLObje

    Quote Originally Posted by Chris Lux
    Here is the test code we used to measure the acquire and release times:

    Code :
    _acquire_timer.start();
    cl_error = context->cl_command_queue()->enqueueAcquireGLObjects(&acq); assert(!cl_error_string(cl_error).empty());
    context->cl_command_queue()->finish();
    _acquire_timer.stop();
     
    //cl_error = context->cl_command_queue()->enqueueNDRangeKernel(*_ray_cast_kernel, ::cl::NullRange, global_range, local_range, 0, 0); assert(!cl_error_string(cl_error).empty());
     
    _release_timer.start();
    cl_error = context->cl_command_queue()->enqueueReleaseGLObjects(&acq); assert(!cl_error_string(cl_error).empty());
    context->cl_command_queue()->finish();
    _release_timer.stop();
    You are timing the kernel execution and the context switch with the code you have provided. enqueueNDRangeKernel returns almost immediately, and the kernel runs later. If you want to time only the release, add a callback to the event returned by enqueueNDRangeKernel so you are notified when the kernel completes, and start the timer then.

    Edit: I see you mentioned that you have commented out the kernel call. You might try using events to wait for the acquire and release of the GL objects. clFinish() seems to do much more than simply wait for the events (in my experience, waiting on events instead can provide a relatively big performance advantage). Also, have you called glFinish() before you start timing, to make sure you are not waiting for OpenGL to finish up?

    Another thing you might try is timing after a warm-up run, to make sure the card has turned off all power-saving measures. Fermi cards, while power beasts, try to turn off transistors any chance they get.
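
A minimal sketch of the measurement described above, not the original poster's code. It assumes the OpenCL C++ wrapper: `Queue` is meant to be `cl::CommandQueue`, `Event` to be `cl::Event`, and `Mem` to be `cl::Memory`, but they are left as template parameters so the helper carries no OpenCL includes of its own. `InteropTimings`, `time_acquire_release`, and the `gl_finish` callable (for example a lambda that calls `glFinish()`) are hypothetical names introduced for illustration. The timings are host wall-clock times, and the last round is the one reported.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Host-side wall-clock time of one acquire and one release, in milliseconds.
struct InteropTimings {
    double acquire_ms;
    double release_ms;
};

// Event is expected to behave like cl::Event (default-constructible, wait()).
// Queue is expected to behave like cl::CommandQueue (finish(), flush(),
// enqueueAcquireGLObjects(), enqueueReleaseGLObjects()).
// Return codes are ignored here for brevity; real code should check them.
template <typename Event, typename Queue, typename Mem, typename GlFinish>
InteropTimings time_acquire_release(Queue& queue,
                                    const std::vector<Mem>& objects,
                                    GlFinish gl_finish,
                                    int warmup_rounds = 1)
{
    assert(!objects.empty() && warmup_rounds >= 0);
    using clock = std::chrono::steady_clock;
    using millis = std::chrono::duration<double, std::milli>;

    gl_finish();     // make sure OpenGL has no pending work on the objects
    queue.finish();  // make sure earlier OpenCL work is not part of the timing

    InteropTimings result{0.0, 0.0};
    // Rounds 0 .. warmup_rounds-1 are warm-up; only the final round is kept.
    for (int round = 0; round <= warmup_rounds; ++round) {
        Event acquired;
        Event released;

        const auto t0 = clock::now();
        queue.enqueueAcquireGLObjects(&objects, nullptr, &acquired);
        queue.flush();    // submit the command before blocking on its event
        acquired.wait();  // wait for this command only, instead of finish()
        const auto t1 = clock::now();

        queue.enqueueReleaseGLObjects(&objects, nullptr, &released);
        queue.flush();
        released.wait();
        const auto t2 = clock::now();

        result.acquire_ms = millis(t1 - t0).count();
        result.release_ms = millis(t2 - t1).count();
    }
    return result;
}
```

With the real wrapper, a call would look something like `time_acquire_release<cl::Event>(queue, acq, [] { glFinish(); })`. If the numbers stay around 15 ms per call after the warm-up round and with OpenGL idle, the cost is in the acquire/release itself and not in leftover work from either API.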

    Hope that helps.

  6. #6
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: OpenCL/OpenGL Problems: clEnqueue{Acquire|Release}GLObje

    Ahh right, sorry I thought that was a paste-o.

    Ask the vendor I guess.

    It is probably hitting some limit and swapping device memory around with the host. I would guess that 15ms pretty much suggests it's crossing the PCIe bus. If you have time (to waste!) you could perhaps see if there's a point at which this limit occurs and re-arrange the code to be within the limits, or buy different hardware.
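
A rough back-of-the-envelope check of that guess, as a sketch. The bandwidth figures used in the note below are nominal values I am assuming, not measurements from this thread: about 8 GB/s per direction for a PCIe 2.0 x16 link and about 177.4 GB/s for GTX 480 device memory. The function names are hypothetical.

```cpp
#include <cassert>

// Throughput in GB/s (10^9 bytes per second) implied by moving
// `mebibytes` of data in `milliseconds`.
double implied_gb_per_s(double mebibytes, double milliseconds)
{
    assert(mebibytes > 0.0 && milliseconds > 0.0);
    const double bytes = mebibytes * 1024.0 * 1024.0;
    return bytes / (milliseconds * 1e-3) / 1e9;
}

// Time in milliseconds needed to move `mebibytes` of data at `gb_per_s`.
double transfer_ms(double mebibytes, double gb_per_s)
{
    assert(mebibytes > 0.0 && gb_per_s > 0.0);
    const double bytes = mebibytes * 1024.0 * 1024.0;
    return bytes / (gb_per_s * 1e9) * 1e3;
}

// Time in milliseconds for an on-device copy, which has to read and
// write every byte once, at a device memory bandwidth of `gb_per_s`.
double device_copy_ms(double mebibytes, double gb_per_s)
{
    return 2.0 * transfer_ms(mebibytes, gb_per_s);
}
```

For the 125 MiB volume, `implied_gb_per_s(125.0, 15.0)` comes out at roughly 8.7 GB/s and `transfer_ms(125.0, 8.0)` at roughly 16.4 ms, so the observed 15 ms is in the range of one trip across the bus. `device_copy_ms(125.0, 177.4)` is only about 1.5 ms, which is why a redundant copy that stays on the card would not explain the numbers.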

  7. #7
    Junior Member
    Join Date
    Nov 2011
    Posts
    5

    Re: OpenCL/OpenGL Problems: clEnqueue{Acquire|Release}GLObje

    OK, some additions.

    We see the behavior described above under Windows 7; the exact same program running under Linux does not show the long acquire and release times.

    I contacted Nvidia and their answer was something like this: "Try CUDA and we will help you solve your performance problems"... No thank you, it was a conscious decision to use OpenCL over CUDA.

  8. #8
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: OpenCL/OpenGL Problems: clEnqueue{Acquire|Release}GLObje

    Quote Originally Posted by Chris Lux
    OK, some additions.

    We see the behavior described above under Windows 7; the exact same program running under Linux does not show the long acquire and release times.

    I contacted Nvidia and their answer was something like this: "Try CUDA and we will help you solve your performance problems"... No thank you, it was a conscious decision to use OpenCL over CUDA.
    Hah! How nice of them, although hardly surprising. We recently ditched them for AMD, but they have their own troubles and fairly different performance characteristics to deal with ...

    I had some trouble with Windows and OpenCL from Java, but it was due to Swing using Direct2D in places (I presume) and requiring very slow context switches.
