Thread: How to know if the kernels are executing concurrently?

  1. #1
    Junior Member
    Join Date
    Jul 2012
    Posts
    5

    How to know if the kernels are executing concurrently?

    I have an NVIDIA GPU with Compute Capability 3.0, so it should support 16 concurrent kernels. I start 10 kernels by calling clEnqueueNDRangeKernel 10 times in a loop, and each kernel is enqueued on a different command queue. How can I tell whether the kernels are executing concurrently?

    One way I have thought of is to take the time before and after the clEnqueueNDRangeKernel call. I would probably have to use events to make sure each kernel has finished executing. But I still suspect that the loop will start the kernels sequentially. Can someone tell me whether this is the right way to start concurrent kernels?

    Also, what happens if I start more than 16 kernels (say 20)? Will they be executed in batches of 16, i.e. the first 16 in one batch and the remaining 4 in the next?

  2. #2
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: How to know if the kernels are executing concurrently?

    In reply to nikhilk (post #1):

    The start and end times of execution in each kernel's event information should show this, since they are all measured against the same reference. The profiler should also show concurrent execution, and that is easier than adding manual timing code. If nothing else, the total execution time should be shorter than when the kernels run one after another.
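
    As a rough sketch of the event-timing check: assuming each queue was created with CL_QUEUE_PROFILING_ENABLE and the start/end timestamps of every kernel event have already been read with clGetEventProfilingInfo (CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END), the overlap test itself is plain interval arithmetic. The names kernel_span and max_concurrent below are made up for illustration and are not part of OpenCL.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One kernel's execution window in nanoseconds on the device clock.
 * Assumption: in a real program these fields are filled from
 * clGetEventProfilingInfo (CL_PROFILING_COMMAND_START / _END). */
typedef struct {
    uint64_t start_ns;
    uint64_t end_ns;
} kernel_span;

/* A start (+1) or end (-1) boundary of a span. */
typedef struct {
    uint64_t t;
    int delta;
} edge;

/* Sort by time; at equal times, ends come before starts so that
 * back-to-back kernels are not counted as overlapping. */
static int cmp_edge(const void *a, const void *b) {
    const edge *x = a;
    const edge *y = b;
    if (x->t != y->t) return (x->t < y->t) ? -1 : 1;
    return x->delta - y->delta;
}

/* Returns the largest number of spans that overlap at any instant,
 * or -1 if memory could not be allocated. */
int max_concurrent(const kernel_span *spans, size_t n) {
    if (n == 0) return 0;
    edge *edges = malloc(2 * n * sizeof *edges);
    if (!edges) return -1;
    for (size_t i = 0; i < n; i++) {
        edges[2 * i]     = (edge){ spans[i].start_ns, +1 };
        edges[2 * i + 1] = (edge){ spans[i].end_ns,   -1 };
    }
    qsort(edges, 2 * n, sizeof *edges, cmp_edge);
    int current = 0, best = 0;
    for (size_t i = 0; i < 2 * n; i++) {
        current += edges[i].delta;
        if (current > best) best = current;
    }
    free(edges);
    return best;
}
```

    A result of 1 would mean the kernels ran strictly one after another; anything larger means at least that many were in flight at the same moment.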

    You would have to refer to the NVIDIA documentation for how it manages large numbers of jobs (if they deem that important enough to include). The obvious choice would be to run them in FIFO order, but that is only a guess.
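
    To make the FIFO guess concrete, here is a toy model, not a description of what the NVIDIA driver actually does. It assumes a fixed number of concurrent slots (16 in the question) and starts each kernel, in submission order, as soon as any slot is free. fifo_schedule and MAX_SLOTS are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SLOTS 64  /* arbitrary upper bound for this sketch */

/* Toy FIFO model (an assumption, not documented driver behaviour):
 * kernel i starts when the earliest-free of `slots` slots becomes free.
 * Writes each kernel's start time into start_ns and returns the time
 * at which the last kernel finishes, or 0 if `slots` is out of range. */
uint64_t fifo_schedule(const uint64_t *duration_ns, size_t n,
                       size_t slots, uint64_t *start_ns) {
    uint64_t free_at[MAX_SLOTS] = {0};
    uint64_t makespan = 0;
    if (slots == 0 || slots > MAX_SLOTS) return 0;
    for (size_t i = 0; i < n; i++) {
        /* Pick the slot that becomes free first. */
        size_t s = 0;
        for (size_t j = 1; j < slots; j++) {
            if (free_at[j] < free_at[s]) s = j;
        }
        start_ns[i] = free_at[s];
        free_at[s] += duration_ns[i];
        if (free_at[s] > makespan) makespan = free_at[s];
    }
    return makespan;
}
```

    Under this model, 20 kernels of equal length on 16 slots do look like two batches (16, then 4), because all 16 slots free up at the same time. With unequal lengths, the 17th kernel would start as soon as the shortest running kernel finishes rather than waiting for the whole first group.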
