Thread: First local memory work

  1. #11
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: First local memory work

    I guess that is what I should use for define work-groups size? Is it the only way, or the right way?
    __attribute__((reqd_work_group_size(X, Y, Z))) is basically a hint for the compiler. It doesn't hurt to have it, but at the end of the day what determines the work-group size is the local_work_size value you pass to clEnqueueNDRangeKernel().

    Note that if you use the attribute above, clEnqueueNDRangeKernel will return an error (CL_INVALID_WORK_GROUP_SIZE) if the local_work_size you pass does not match the value in the attribute. In that sense it's more than a hint.

    If I assign local work size, does work-group size define from this?
    For example, in this
    Sorry, I didn't understand the question.

    How work-group (or work-group size) are assigned?
    If the question is "how do I choose the right work-group size?", the answer is trial and error. There's no simple rule that tells you which size gives the best performance.
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer. LinkedIn profile.

  2. #12
    Junior Member
    Join Date
    May 2011
    Posts
    4

    Re: First local memory work

    Quote Originally Posted by wrx
    clEnqueueNDRangeKernel :
    ....
    The work-group size to be used for kernel can also be specified in the program source using the __attribute__((reqd_work_group_size(X, Y, Z))) qualifier. In this case the size of work group specified by local_work_size must match the value specified by the reqd_work_group_size __attribute__ qualifier.
    ...
    -> I guess that is what I should use for define work-groups size? Is it the only way, or the right way?
    depends. the attribute is meant for kernels that *require* a specific work-group size (e.g. they have hardcoded array sizes, or only operate correctly on one local work size, ...), which shouldn't be the case for normal kernels.
    your work-group size is either the one you passed to clEnqueueNDRangeKernel as local_work_size, or some value "guessed" by the driver if you passed NULL.
    note that if you use the reqd_work_group_size attribute, it's an error to enqueue the kernel with a different local work size.
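To make the attribute concrete, here is a rough sketch in plain C. The kernel source string shows where the attribute goes; the kernel name `scale` and the 256x1x1 size are made up for illustration. The helper `local_size_matches` is not an OpenCL call, it only mimics on the host the comparison that clEnqueueNDRangeKernel performs before it would return CL_INVALID_WORK_GROUP_SIZE.

```c
#include <stddef.h>
#include <string.h>

/* OpenCL C source for a hypothetical kernel that requires 256x1x1 work-groups. */
const char *const kernel_src =
    "__kernel __attribute__((reqd_work_group_size(256, 1, 1)))\n"
    "void scale(__global float *data)\n"
    "{\n"
    "    data[get_global_id(0)] *= 2.0f;\n"
    "}\n";

/* Host-side emulation (an assumption, not part of the OpenCL API) of the
 * check done at enqueue time: returns 1 when the requested local size
 * equals the size required by the attribute, 0 otherwise. */
int local_size_matches(const size_t required[3], const size_t requested[3])
{
    return memcmp(required, requested, 3 * sizeof(size_t)) == 0;
}
```

With the kernel above, a local work size of {256, 1, 1} passes the check, while {128, 1, 1} would be rejected at enqueue time.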

    Quote Originally Posted by wrx
    clEnqueueNDRangeKernel :
    -> If I assign local work size, does work-group size define from this?
    For example, in this :
    Code :
    	const size_t global_work_size[1] = {1024};
    	const size_t local_work_size[1] = {256};
    -> How work-group (or work-group size) are assigned?
    yes, the parameter you specified as local work size is the work-group size.
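As a small illustration of what the two sizes in the quoted snippet imply, the sketch below works out the number of work-groups and where a given work-item ends up. The function names are made up for this example, a global offset of zero is assumed, and the divisibility rule is the OpenCL 1.x one (the global size must be a multiple of the local size).

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-groups in one dimension. Returns 0 when the global size
 * is not a multiple of the local size, which OpenCL 1.x rejects with
 * CL_INVALID_WORK_GROUP_SIZE. */
size_t num_work_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    if (global_size % local_size != 0)
        return 0;
    return global_size / local_size;
}

/* Index of the work-group that a global id falls into (no global offset). */
size_t group_of(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    return global_id / local_size;
}

/* Position of that work-item inside its work-group. */
size_t local_id_of(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    return global_id % local_size;
}
```

For global_work_size = 1024 and local_work_size = 256 this gives 4 work-groups of 256 work-items each; the work-item with global id 300 sits in group 1 at local id 44.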

    something you may also want to have a look at is __local arguments to kernels. from clSetKernelArg (arg_size):
    For arguments declared with the __local qualifier, the size specified will be the size in bytes of the buffer that must be allocated for the __local argument.
    e.g. if your buffer needs 1 float per work-item and your local size is 256, you'd set arg_size in clSetKernelArg to 256*sizeof(cl_float), pass NULL as arg_value, and use "__local float*" as the argument type in your kernel.
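The size calculation for a __local argument can be sketched like this. `local_arg_bytes` is a hypothetical helper, and it uses sizeof(float) on the assumption that it equals sizeof(cl_float) on the host, so that the snippet compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes to pass as arg_size for a "__local float *" kernel argument,
 * given the work-group size and the number of floats each work-item needs.
 * Assumes sizeof(float) == sizeof(cl_float). */
size_t local_arg_bytes(size_t local_size, size_t floats_per_item)
{
    assert(local_size > 0 && floats_per_item > 0);
    return local_size * floats_per_item * sizeof(float);
}

/* Intended use on the host (not compiled here; "kernel" and the argument
 * index 2 are placeholders):
 *
 *     clSetKernelArg(kernel, 2, local_arg_bytes(256, 1), NULL);
 *
 * arg_value is NULL because the host only reserves __local memory,
 * it cannot fill it with data. */
```

With a local size of 256 and one float per work-item, this is the same 256*sizeof(cl_float) mentioned above.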

    and finally, for copying data to/from global memory you may want to have a look at async_work_group_copy/async_work_group_strided_copy and wait_group_events, as they make that copy a bit easier.
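Roughly, a kernel using these functions could look like the source string below. This is only a sketch: the kernel name `scale_tile`, its arguments, and the "multiply by 2" step are invented for illustration, and it assumes the __local buffer `tile` was sized to hold one float per work-item in the group. It is stored as a C string, the way you would hand it to clCreateProgramWithSource.

```c
/* OpenCL C source for a hypothetical kernel: each work-group copies its
 * slice of "in" into local memory, every work-item doubles one element,
 * and the group copies the slice back out to "out". */
const char *const copy_kernel_src =
    "__kernel void scale_tile(__global const float *in,\n"
    "                         __global float *out,\n"
    "                         __local float *tile)\n"
    "{\n"
    "    size_t lsz  = get_local_size(0);\n"
    "    size_t base = get_group_id(0) * lsz;\n"
    "\n"
    "    /* global -> local; every work-item in the group must make this call */\n"
    "    event_t e = async_work_group_copy(tile, in + base, lsz, 0);\n"
    "    wait_group_events(1, &e);\n"
    "\n"
    "    tile[get_local_id(0)] *= 2.0f;\n"
    "\n"
    "    /* make all local writes visible before the group copies them out */\n"
    "    barrier(CLK_LOCAL_MEM_FENCE);\n"
    "\n"
    "    /* local -> global */\n"
    "    e = async_work_group_copy(out + base, tile, lsz, 0);\n"
    "    wait_group_events(1, &e);\n"
    "}\n";
```

The copy calls have to be reached by all work-items of the group with the same arguments, so they should not sit inside a branch that only some work-items take.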
