Thread: help with work items in work groups

  1. #1
    Junior Member
    Join Date
    Sep 2011
    Posts
    2

    help with work items in work groups

    I do not understand the basics of global_work_size and local_work_size. I have used CUDA and am new to OpenCL.

    I have set my work_items to 64, 1, 1 and work groups to 512, 128, 1. If I do the following:

    __kernel void testKernel(__global uint* output, uint x)
    {
        // Global id of this work-item in each dimension (the 4th component is unused).
        uint4 gid = (uint4)(get_global_id(0), get_global_id(1), get_global_id(2), 1);
        uint width = get_global_size(0);
        uint height = get_global_size(1);
        // Flatten the 3D global id into a 1D index into output.
        uint index = gid.x + (gid.y * width) + (gid.z * width * height);
        output[index] = x;
    }
    things work. I was expecting the width to be get_local_size(0) * get_global_size(0).
    I expected gid.x to go from 0 to (64 * 512), like CUDA with 64 threads in each of 512 blocks.
    What are the work items for, just shared memory grouping? Are there really (512 / 64), 128, 1 blocks containing 64 threads?

    I am sure there is a simple answer to my problem. I have searched, which leads me to the conclusion that my confusion is very basic. Please do not just direct me somewhere else, because I have been there and I still do not understand.

    I really do not understand the groups, get_num_groups and get_group_id. I do not see the use of the values they return in the kernel.

    Thanks

  2. #2
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: help with work items in work groups

    Have you tried reading section 3.2 of the OpenCL 1.1 specification? I think it answers your questions.
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer.

  3. #3
    Senior Member
    Join Date
    Aug 2011
    Posts
    271

    Re: help with work items in work groups

    Quote Originally Posted by gatodelsol
    I do not understand the basics of global_work_size and local_work_size. I have used CUDA and am new to OpenCL.

    Your query is a little confusing, so forgive me if I didn't understand it correctly.

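    You state you set the 'work groups' to 512,128,1, but you can't set the number of work-groups explicitly. You only set the global work size, which is the total number of work-items in each dimension (not the number of groups, as a CUDA grid size would be), and optionally the local work size, which is the number of work-items per work-group.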

    This is probably the confusion.

    If you set the global work size to 512,128,1 then gid will go from (0,0,0) to (511,127,0) only, and it is unique in every 'thread'.
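
    As a rough check of that range, the sketch below replays the kernel's index arithmetic on the host in plain C. The sizes are the ones from the post, and count_written_once is a hypothetical helper written only for this illustration. A real device runs the work-items in no particular order, so the loops only stand in for "every global id once".

```c
#include <stddef.h>

/* Global sizes from the post; in a real program these are the
   global_work_size values passed to clEnqueueNDRangeKernel. */
enum { GLOBAL_X = 512, GLOBAL_Y = 128, GLOBAL_Z = 1 };

/* Same arithmetic as the kernel: x + y * width + z * width * height. */
size_t flat_index(size_t x, size_t y, size_t z)
{
    return x + y * GLOBAL_X + z * GLOBAL_X * GLOBAL_Y;
}

/* Hypothetical helper: visits every global id once and returns how many
   output elements were written exactly once. */
size_t count_written_once(void)
{
    static unsigned char hits[GLOBAL_X * GLOBAL_Y * GLOBAL_Z];
    size_t once = 0;

    for (size_t i = 0; i < sizeof hits; ++i)
        hits[i] = 0;

    for (size_t z = 0; z < GLOBAL_Z; ++z)
        for (size_t y = 0; y < GLOBAL_Y; ++y)
            for (size_t x = 0; x < GLOBAL_X; ++x)
                hits[flat_index(x, y, z)]++;

    for (size_t i = 0; i < sizeof hits; ++i)
        if (hits[i] == 1)
            ++once;

    return once;
}
```

    The largest id, (511,127,0), lands on the last element of a 512 * 128 buffer, so the output buffer needs exactly 512 * 128 * 1 elements and the local size does not enter into it.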

    And basically get_local_id(0) will return get_global_id(0) mod get_local_size(0), where get_local_size(0) is 64 here.

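    So yes, there will be (512/64),128,1 work-groups, i.e. 8 x 128 x 1 groups of 64 work-items each. The grouping decides which work-items share local memory and can synchronize with barriers, and it affects register allocation, threads, and so on. In CUDA terms a work-group corresponds to a thread block; on NVIDIA hardware a single multiprocessor may run more than one work-group at a time if there are enough resources to fit them.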
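
    For someone coming from CUDA, the sizes convert roughly as in the sketch below. Both functions are hypothetical helpers, not part of the OpenCL or CUDA APIs, and they assume the global size is a multiple of the local size, which OpenCL 1.x requires when a local size is given.

```c
#include <stddef.h>

/* Hypothetical helper: turns a CUDA-style launch in one dimension
   (blocks per grid, threads per block) into the OpenCL global work
   size for that dimension, which counts work-items, not groups. */
size_t global_size_from_cuda(size_t blocks, size_t threads_per_block)
{
    return blocks * threads_per_block;
}

/* Hypothetical helper: number of work-groups in one dimension.
   Assumes global_size is a multiple of local_size. */
size_t num_groups(size_t global_size, size_t local_size)
{
    return global_size / local_size;
}
```

    So a CUDA launch of 512 blocks of 64 threads corresponds to a global size of 512 * 64 = 32768 with a local size of 64, while the global size of 512 from the original post gives only 512 / 64 = 8 work-groups in that dimension.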

    get_global_size() will be exactly the global sizes passed in, i.e. 512,128,1, and get_local_size() will be exactly the local sizes passed in, i.e. 64,1,1.
    get_num_groups(x) will be get_global_size(x) / get_local_size(x)
    get_group_id(x) will be get_global_id(x) / get_local_size(x) (integer division)
    etc.
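
    The relationships above can be sketched in plain C for a single dimension. The struct and function names are made up for this illustration, and the sketch assumes no global work offset is used (with an offset, get_global_id is shifted by it).

```c
#include <stddef.h>

/* Hypothetical pair of values that get_group_id(d) and get_local_id(d)
   would return for one work-item in one dimension. */
typedef struct {
    size_t group_id;
    size_t local_id;
} WorkItemIds;

/* Splits a global id into its group id and local id,
   assuming a zero global work offset. */
WorkItemIds split_global_id(size_t global_id, size_t local_size)
{
    WorkItemIds ids;
    ids.group_id = global_id / local_size; /* integer division */
    ids.local_id = global_id % local_size;
    return ids;
}

/* The inverse: rebuilds the global id from group id and local id. */
size_t join_ids(size_t group_id, size_t local_id, size_t local_size)
{
    return group_id * local_size + local_id;
}
```

    With a local size of 64, the work-item with global id 200 in dimension 0 is local id 8 in work-group 3, since 3 * 64 + 8 = 200.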

    See section 6.11.1 (Work Item Functions) in the spec.

  4. #4
    Junior Member
    Join Date
    Sep 2011
    Posts
    2

    Re: help with work items in work groups

    Thanks, notzed, for the reply. I get it now. I needed a simple explanation that was different from the spec and other things I found on the net to get past my confusion about globals, locals and work groups.

    Thanks
