Thread: Wondering when I should use clFlush or clFinish.

  1. #1

    Wondering when I should use clFlush or clFinish.

    My kernels take about 5 seconds to run when I call clFinish() after enqueuing each of them. When I removed all the clFinish() calls, they took only 2.2 seconds and the results were exactly the same. I only use a single command queue. In this case, do I have to call clFinish or clFlush at all?

    The spec doesn't seem to explain in detail how a command queue works. According to it, although clEnqueueReadBuffer performs an implicit flush, there is no guarantee that the queue will be complete after clFlush returns. That suggests to me that clFinish() has to be called anyway, to ensure all the tasks in the queue are finished before calling clEnqueueReadBuffer to transfer the data back to the CPU.

    So could anyone tell me why I still got correct results after all the clFinish() calls were removed? Is it just an accident, or is this the right way to use OpenCL?

    Thanks in advance.

  2. #2
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: Wondering when I should use clFlush or clFinish.

    Quote:
    The spec doesn't seem to explain how a command queue works in detail. According to it, although clEnqueueReadBuffer performs an implicit flush, there is no guarantee that the queue will be complete after clFlush returns.
    Quick rule of thumb: for most applications it's not necessary to call clFlush() or clFinish() at all. Doing a final blocking call to clEnqueueReadBuffer() or clEnqueueMapBuffer() to read back your data is enough.

    Only blocking calls to clEnqueueReadBuffer() (and similar) perform an implicit flush. They also guarantee that the command will be complete before the call returns to the application. See these snippets from the spec:

    “Any blocking commands queued in a command-queue and clReleaseCommandQueue perform an implicit flush of the command-queue. These blocking commands are clEnqueueReadBuffer, clEnqueueReadBufferRect, clEnqueueReadImage, with blocking_read set to CL_TRUE; clEnqueueWriteBuffer, clEnqueueWriteBufferRect, clEnqueueWriteImage with blocking_write set to CL_TRUE; clEnqueueMapBuffer, clEnqueueMapImage with blocking_map set to CL_TRUE; or clWaitForEvents.” [5.13]. In other words, only blocking reads implicitly flush the queue. Non-blocking reads do not flush the queue.

    “If blocking_read is CL_TRUE i.e. the read command is blocking, clEnqueueReadBuffer does not return until the buffer data has been read and copied into memory pointed to by ptr.” [5.2.2]. That is, when a blocking call to clEnqueueReadBuffer() has returned to the application, the event associated with that command is already complete.
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer.

  3. #3
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: Wondering when I should use clFlush or clFinish.

    Quote Originally Posted by mercuryknight
    My kernels take about 5 seconds to run with clFinish() after each of them is enqueued. When I removed all the clFinish(), it takes only 2.2 seconds while the results are exactly the same. I only used a single command queue, and in this case do I have to call clFinish or clFlush?
    Think of the command queue as a shopping list.

    You don't write one item down, go to the shop, find it, buy it, bring it back, pack it in the fridge, then sit down, add another item to the list and repeat, do you? It would take forever, but this is precisely what your first test case is doing. You're adding the 'travelling time' on top of the 'finding and buying time' for every item.

    clEnqueue*() - adding an item to the bottom of the list.
    clFlush() - leaving the house to go to the shop.
    clFinish() - returning to the house with a basket of stuff.

    If you just write down all the items on the list, and then go to the shop, you only have to count the travel time once. This grouping happens at every stage, e.g. you use a basket in the shop so you only have to check out once, a car so you can carry it all, etc.
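
    The 'travel time' arithmetic above can be sketched as a toy cost model. This is plain C, not OpenCL: the struct, the function names and the millisecond figures are all hypothetical, and it assumes every synchronisation costs the same fixed round trip.

```c
#include <assert.h>

/* Toy cost model, NOT the OpenCL API. Assumption: every host/device
 * synchronisation pays one fixed round trip on top of the kernels'
 * own run time. All names and numbers are made up for illustration. */
typedef struct {
    double kernel_ms; /* run time of one kernel ("finding and buying") */
    double sync_ms;   /* cost of one round trip ("travelling time") */
} toy_costs;

/* Host waits after every kernel, like a clFinish() after each enqueue. */
double toy_time_sync_each(toy_costs c, int n_kernels) {
    assert(n_kernels >= 0);
    return n_kernels * (c.kernel_ms + c.sync_ms);
}

/* Host enqueues everything and waits once at the end. */
double toy_time_sync_once(toy_costs c, int n_kernels) {
    assert(n_kernels >= 0);
    return n_kernels * c.kernel_ms + (n_kernels > 0 ? c.sync_ms : 0.0);
}
```

    With a hypothetical 2 ms kernel and a 3 ms round trip, ten kernels cost 10 * (2 + 3) = 50 ms when the host waits after each one, but 10 * 2 + 3 = 23 ms when it waits once. A real driver is more complicated than this, so treat the model only as a picture of why the per-kernel clFinish() version is slower.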

    If you're doing a really large amount of work, clFinish() can send the kids down on a bicycle with a partial shopping list to get started whilst you're still busy completing the list or doing other things. If you break it up properly and have enough to do (and enough kids), you can end up with everyone being busy along the 'pipeline', fully utilising every part of the system, e.g. someone at the checkout, someone scanning the shelves, a few people in transit in each direction, etc. It might take a while to get the first result, but after that you get a steady 'stream' of 'stuff' coming back, at a rate much higher than if you did it in individual trips, even with a car, and with higher efficiency to boot.

  4. #4

    Re: Wondering when I should use clFlush or clFinish.

    clFinish() can send the kids down on a bicycle with a partial shopping list to get started whilst you're still busy completing the list or doing other things
    In that scenario you want a clFlush(), not a clFinish(). A clFinish() won't return to the app until all work has finished completely, thereby preventing the app from doing work concurrently while the GPU is busy. clFlush() does allow both the app and the device to work at the same time.

    Even then, for beginners it's better to stick to the simple rule of never calling either clFlush() or clFinish(). I've never seen an example in this forum of a clFinish() that was truly necessary or a clFlush() that was justified.
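
    To make the clFlush() versus clFinish() difference concrete, here is a small timeline simulation. It is a sketch in plain C and not the OpenCL API: the toy_* names are hypothetical, and it assumes that queued work starts only once it has been submitted and that the host and device can run at the same time. Real implementations are free to start work earlier.

```c
#include <assert.h>

/* Toy timeline, NOT the OpenCL API. Assumptions: queued work starts only
 * once it is submitted, and host and device can run concurrently. */
typedef struct {
    double host_now;    /* host clock in ms */
    double device_done; /* time at which all submitted work is finished */
    double queued_ms;   /* work enqueued but not yet submitted */
} toy_timeline;

void toy_enqueue(toy_timeline *t, double work_ms) {
    assert(work_ms >= 0.0);
    t->queued_ms += work_ms;
}

/* Modelled on clFlush(): submit queued work and return immediately. */
void toy_flush(toy_timeline *t) {
    if (t->queued_ms == 0.0) return;
    double start = t->device_done > t->host_now ? t->device_done : t->host_now;
    t->device_done = start + t->queued_ms;
    t->queued_ms = 0.0;
}

/* Modelled on clFinish(): submit, then block the host until the device is idle. */
void toy_finish(toy_timeline *t) {
    toy_flush(t);
    if (t->device_done > t->host_now) t->host_now = t->device_done;
}

void toy_host_work(toy_timeline *t, double work_ms) { t->host_now += work_ms; }

/* Flush, do host work while the device runs, then wait for the device. */
double toy_total_flush_first(double device_ms, double host_ms) {
    toy_timeline t = {0.0, 0.0, 0.0};
    toy_enqueue(&t, device_ms);
    toy_flush(&t);
    toy_host_work(&t, host_ms);
    toy_finish(&t);
    return t.host_now;
}

/* Wait for the device first, then do the host work. */
double toy_total_finish_first(double device_ms, double host_ms) {
    toy_timeline t = {0.0, 0.0, 0.0};
    toy_enqueue(&t, device_ms);
    toy_finish(&t);
    toy_host_work(&t, host_ms);
    return t.host_now;
}
```

    With a hypothetical 10 ms of device work and 8 ms of host work, the flush-first ordering ends at 10 ms because the two overlap, while the finish-first ordering ends at 18 ms because the host sits idle until the device is done.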

  5. #5

    Re: Wondering when I should use clFlush or clFinish.

    Quote Originally Posted by david.garcia
    “If blocking_read is CL_TRUE i.e. the read command is blocking, clEnqueueReadBuffer does not return until the buffer data has been read and copied into memory pointed to by ptr.” [5.2.2]. That is, when a blocking call to clEnqueueReadBuffer() has returned to the application, the event associated with that command is already complete.
    Thanks a lot.
    It seems to me that a blocking call to clEnqueueReadBuffer() only 'blocks' itself, but cannot make other tasks, which are already in the command queue, blocking. So is there any possibility that the ReadBuffer operation starts before all the other tasks are finished?

    Cheers

  6. #6

    Re: Wondering when I should use clFlush or clFinish.

    Quote Originally Posted by notzed

    clEnqueue*() - adding an item to the bottom of the list.
    clFlush() - leaving the house to go to the shop.
    clFinish() - returning to the house with a basket of stuff.
    Thanks for the analogy. What puzzles me is that I just wrote down a shopping list, without leaving the house to shop or returning with a basket, and the things I wanted to buy turned up in my house automatically.

  7. #7

    Re: Wondering when I should use clFlush or clFinish.

    Quote Originally Posted by david.garcia
    clFinish() can send the kids down on a bicycle with a partial shopping list to get started whilst you're still busy completing the list or doing other things
    In that scenario you want a clFlush(), not a clFinish(). A clFinish() won't return to the
    Ahh yeah, just a typo there.

  8. #8

    Re: Wondering when I should use clFlush or clFinish.

    It seems to me that a blocking call to clEnqueueReadBuffer() only 'blocks' itself, but cannot make other tasks, which are already in the command queue, blocking. So is there any possibility that the ReadBuffer operation starts before all the other tasks are finished?
    When the application calls clCreateCommandQueue() it selects whether the queue will have in-order or out-of-order execution. By default queues have in-order execution, which means that if you enqueue a command A and later a command B it is guaranteed that B will not start running until A is finished. This is explained in the glossary and section 5.1.

    In other words, enqueuing a sequence of commands and ending with a blocking ReadBuffer will behave in a sensible way.
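
    The in-order guarantee can be pictured with a toy FIFO queue. Again this is plain C and not the OpenCL API: toy_queue, toy_enqueue_kernel, toy_blocking_read and the two 'kernels' are hypothetical names, and the model simply runs everything at the moment of the blocking read, which is only one of many schedules a real implementation might choose.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CMDS 16
#define TOY_BUF_LEN 4

/* Toy in-order queue, NOT the OpenCL API. A "kernel" is just a function
 * that updates the toy device buffer in place. */
typedef void (*toy_kernel)(int *buf, size_t n);

typedef struct {
    toy_kernel pending[TOY_MAX_CMDS]; /* enqueued but not yet executed */
    size_t count;
    int device_buf[TOY_BUF_LEN];
} toy_queue;

/* Enqueuing only records the command. Returns 0 on success, -1 if full. */
int toy_enqueue_kernel(toy_queue *q, toy_kernel k) {
    assert(q != NULL && k != NULL);
    if (q->count == TOY_MAX_CMDS) return -1;
    q->pending[q->count++] = k;
    return 0;
}

/* Blocking read on an in-order queue: every earlier command runs to
 * completion, in enqueue order, before the data is copied to the host. */
void toy_blocking_read(toy_queue *q, int *host_dst) {
    assert(q != NULL && host_dst != NULL);
    for (size_t i = 0; i < q->count; ++i)
        q->pending[i](q->device_buf, TOY_BUF_LEN);
    q->count = 0;
    memcpy(host_dst, q->device_buf, sizeof q->device_buf);
}

void toy_add_one(int *buf, size_t n) {
    for (size_t i = 0; i < n; ++i) buf[i] += 1;
}

void toy_double(int *buf, size_t n) {
    for (size_t i = 0; i < n; ++i) buf[i] *= 2;
}
```

    Enqueuing toy_add_one, toy_double and toy_add_one on a zeroed buffer and then calling toy_blocking_read yields (0 + 1) * 2 + 1 = 3 in every element, with no explicit flush or finish in between. The order matters here, which is why the in-order property is what makes the final blocking read sufficient.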

  9. #9

    Re: Wondering when I should use clFlush or clFinish.

    Quote Originally Posted by david.garcia
    It seems to me that a blocking call to clEnqueueReadBuffer() only 'blocks' itself, but cannot make other tasks, which are already in the command queue, blocking. So is there any possibility that the ReadBuffer operation starts before all the other tasks are finished?
    When the application calls clCreateCommandQueue() it selects whether the queue will have in-order or out-of-order execution. By default queues have in-order execution, which means that if you enqueue a command A and later a command B it is guaranteed that B will not start running until A is finished. This is explained in the glossary and section 5.1.

    In other words, enqueuing a sequence of commands and ending with a blocking ReadBuffer will behave in a sensible way.

    Fair enough. This answers exactly what confused me. Thanks.

  10. #10
    Junior Member
    Join Date
    May 2012
    Posts
    1

    Re: Wondering when I should use clFlush or clFinish.

    Short question: say I want to call the same kernel multiple times, for whatever reason. Do I have to call clFinish between the enqueueNdRange calls, or am I guaranteed that the previous kernel will have finished before the next one (here, the same kernel) executes?
