
Thread: OpenCL Ndrange Global Size/Local Size

  1. #1
    Junior Member
    Join Date
    May 2012
    Posts
    5

    OpenCL Ndrange Global Size/Local Size

    Hello,

    As far as I understand, the NDRange global size should be a multiple of the local size.
    But if it's not, how does OpenCL handle it? (Or rather, does OpenCL handle it at all?)

    I mean, how many work-groups of size local_size will be launched?

    For example, which of these would be right for a global size of 1000 and a local size of 512?
    a) We'll have 2 groups of size 512.
    b) We'll have 1 group of size 512.
    c) We'll have 1 group of size 512 and 1 group of size 488.

    And is this strictly defined by the standard, or is it implementation-dependent?

    Thanks!

  2. #2
    Junior Member
    Join Date
    May 2012
    Posts
    5

    Re: OpenCL Ndrange Global Size/Local Size

    Well, on AMD's implementation it looks like the kernel won't even launch, so I think that answers my question :/

  3. #3
    Senior Member
    Join Date
    Oct 2012
    Posts
    107

    Re: OpenCL Ndrange Global Size/Local Size

    According to the OpenCL 1.x specification, clEnqueueNDRangeKernel should fail and return CL_INVALID_WORK_GROUP_SIZE when a local work size is specified and the global work size is not evenly divisible by it.

  4. #4
    Senior Member
    Join Date
    Dec 2011
    Posts
    160

    Re: OpenCL Ndrange Global Size/Local Size

    It is defined by the standard. You must make the global size a multiple of the local size.

    The standard way of dealing with a desired global work size that is not a multiple is to pass the rounded-up value to clEnqueueNDRangeKernel, pass the desired size as kernel parameters, and then check the global ID inside the kernel to see whether it falls inside the desired work size. For example, to process a 1920x1080 image with a 32x32 local work size, the global work size must be 1920x1088 (1080 rounded up to the next multiple of 32). The kernel might look like:

    Code :
    __kernel void Example_Kernel
    (
        __read_only   image2d_t imgSrc,
        __write_only  image2d_t imgDst,
        int       width,
        int       height
    )
    {
        int x = get_global_id(0);
        int y = get_global_id(1);
     
        if ((x < width) && (y < height))
        {
            // ... do work here
        }
    }
    To get started, you can leave the local work size unspecified (pass NULL for local_work_size) and let the runtime pick one, but if your global sizes are odd or prime, it might use 1x1, which is not optimal.
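
    On the host side, the rounding-up step can be sketched roughly as below. This is only a minimal illustration, not code from any particular runtime: round_up and padded_global_size_2d are hypothetical helper names, and the local size is assumed to be non-zero.

```c
#include <stddef.h>

/* Hypothetical helper: round `desired` up to the next multiple of `local`.
 * Assumes local > 0. */
static size_t round_up(size_t desired, size_t local)
{
    return ((desired + local - 1) / local) * local;
}

/* Hypothetical helper: compute the padded 2D global work size for a
 * width x height problem and a given 2D local work size. */
static void padded_global_size_2d(size_t width, size_t height,
                                  const size_t local[2], size_t global[2])
{
    global[0] = round_up(width,  local[0]);
    global[1] = round_up(height, local[1]);
}
```

    With local = {32, 32}, a 1920x1080 image gives global = {1920, 1088}. For the 1D case in the original question, round_up(1000, 512) gives 1024, i.e. 2 groups of 512, with the last 24 work-items skipped by the bounds check in the kernel. The global and local arrays would then be passed as the global_work_size and local_work_size arguments of clEnqueueNDRangeKernel, with the real width and height set as kernel arguments.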

  5. #5
    Junior Member
    Join Date
    May 2012
    Posts
    5

    Re: OpenCL Ndrange Global Size/Local Size

    Yeah, thanks, I did so.

    I'm working on an OpenCL "middleware", so I needed to know every possible combination, but it looks like rounding up works fine. That's good for me, I think; after all, it's the same approach as in CUDA.
