Thread: CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT = 1

  1. #1
    Junior Member
    Join Date
    Oct 2013
    Posts
    22

    Hello,

    I ran:
    cl_uint float_size;
    size_t param_size;
    clGetDeviceInfo(dev,                                    /* device */
                    CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT, /* param_name */
                    sizeof(cl_uint),                        /* param_value_size */
                    &float_size,                            /* param_value */
                    &param_size);                           /* param_value_size_ret */

    This was on an Intel Core i7 (Ivy Bridge), and I got float_size = 1.

    Does that make sense?
    I know for sure that the CPU in the chip can load four 32-bit words (e.g. floats) in one clock cycle.
    Is it possible that the GPU cannot do that?

    Thanks,
    Zvika

  2. #2
    The Intel OpenCL C compiler tries to vectorise your kernel automatically, which is why the preferred vector width it reports is always 1. If I remember one of their webinars correctly, the compiler tries to group work items along dimension 0 of the N-D range used to launch the kernel. The build log generated when building the kernel will tell you whether it was actually vectorised. You can use the Intel Kernel Builder to build your kernels offline and examine the build log easily.

    As a side note, AMD's CPU runtime on a Sandy Bridge Core i7 reports a preferred vector width of 4 for float, so it probably doesn't vectorise kernels automatically.
