Thread: 3-D vs 1-D Global workspace

  1. #1
    Junior Member
    Join Date
    Dec 2012
    Posts
    11

    3-D vs 1-D Global workspace

    This is yet another query about global workspace.

    Basically, I'm getting a graphics card crash when I specify a large number of threads in the global workspace.
    When I specify a 1-D global workspace, I have noticed that, in my case, I can specify a number up to 2^32, and the kernel runs absolutely fine. It takes a while to get through that many threads, but all is good. If I specify (2^32) + 1 threads, I get a crash (which I would expect).

    My problem is with a 3-D global workspace, where the number of threads is still very large but is in fact quite a bit less than 2^32.
    For example, specifying the workspace as global(837, 1098, 352) and local(1, 1, 1)
    causes a crash with an invalid command queue error.
    But (837 * 1098 * 352) < 2^32.

    I have tried removing all code from the kernel, and I still get a crash when specifying this size of global workspace.

    My max work item sizes are [1024, 1024, 64].
    I have tried using [1050, 1050, 100] and this works fine, but, say, [1050, 1050, 500] does not.

    Any ideas?

  2. #2
    Senior Member
    Join Date
    Sep 2002
    Location
    Santa Clara
    Posts
    105

    Re: 3-D vs 1-D Global workspace

    This seems like a bug in the OpenCL implementation you are running on. I suggest you contact the vendor and file a bug report.

  3. #3
    Junior Member
    Join Date
    Dec 2012
    Posts
    11

    Re: 3-D vs 1-D Global workspace

    It did sound like that to me, but I was hoping it wasn't. Thanks for your reply.

    (Using OpenCL 1.1 on an NVIDIA Quadro 2000 with driver version 311.15)

  4. #4
    Senior Member
    Join Date
    Dec 2011
    Posts
    124

    Re: 3-D vs 1-D Global workspace

    You put the reason right in your message. Your device only accepts maximum dimensions of [1024, 1024, 64], yet you are passing [837, 1098, 352]. Since 352 > 64, you are asking for something the device cannot do.

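    Furthermore, you are setting a local work-group size of [1,1,1], which means your GPU is mostly idle: each of the more than 300 million work-items (837 × 1098 × 352 = 323,497,152) runs alone in its own work-group, so most of the hardware's parallel lanes go unused. That's not going to be very fast.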
