
Type: Posts; User: utnapishtim

Page 1 of 5


  1. Replies: 2, Views: 504

    You are using unnormalized integer coordinates...

    You are using unnormalized integer coordinates with read_imagef(), so your sampler should be

    const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE |
    ...
  2. I've checked on NVIDIA GPU, AMD GPU and Intel CPU...

    I've checked on NVIDIA GPU, AMD GPU and Intel CPU and your kernel is fine.

    How do you get the result from the device buffer on the host side?
  3. Try to cast plain to unsigned int instead of...

    Try to cast plain to unsigned int instead of unsigned char, such as:

    W[t] = ((unsigned int) plain[t * 4]) << 24;

    and so on...
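A minimal sketch of why the wider cast matters, written as plain C so it can be tried on the host. The helper name `pack_be32` and the big-endian byte order of the remaining three bytes are assumptions; the original post only shows the first shift.

```c
#include <stdint.h>

/* Pack four bytes into one 32-bit word, most significant byte first.
   Each byte is widened to a 32-bit unsigned type BEFORE shifting, so the
   shift by 24 can never overflow a narrower or signed intermediate type.
   pack_be32 is a hypothetical helper, not part of any OpenCL API. */
uint32_t pack_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

Inside a kernel the same pattern would read `W[t] = ((unsigned int) plain[t * 4]) << 24 | ...`, with one term per byte.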
  4. Replies: 6, Views: 744

    The host buffer is not necessarily up-to-date...

    The host buffer is not necessarily up-to-date when your kernel ends because its content can be cached in device memory.

    You have to use clEnqueueMapBuffer / clEnqueueUnmapBuffer to ensure that the...
  5. Replies: 3, Views: 810

    Check whether the extension is present in the...

    Check whether the extension is present in the string returned by clGetDeviceInfo() with CL_DEVICE_EXTENSIONS.
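One way to do that check, sketched in plain C. It assumes the extension string has already been fetched with clGetDeviceInfo() and CL_DEVICE_EXTENSIONS; the helper name `has_extension` is hypothetical. The list is space-separated, so the sketch matches whole tokens rather than substrings.

```c
#include <string.h>

/* Return 1 if `name` appears as a complete space-separated token in
   `ext_list`, 0 otherwise. A plain strstr() alone would also accept
   longer names that merely start with `name`. */
int has_extension(const char *ext_list, const char *name)
{
    size_t len = strlen(name);
    const char *p = ext_list;

    if (len == 0)
        return 0;

    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == ext_list) || (p[-1] == ' ');
        int ends   = (p[len] == ' ') || (p[len] == '\0');
        if (starts && ends)
            return 1;
        p += len;   /* keep scanning after this partial match */
    }
    return 0;
}
```

Typical use would be `has_extension(extensions, "cl_khr_3d_image_writes")` before building a program that relies on the extension.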
  6. Replies: 3, Views: 810

    Are you sure that your device has support for the...

    Are you sure that your device has support for the cl_khr_3d_image_writes extension?

    Also use clGetProgramBuildInfo() with CL_PROGRAM_BUILD_LOG to get more info about the reason why the build...
  7. Your kernels could be optimized, but the most...

    Your kernels could be optimized, but the most important parameter when using a GPU is the local work size.

    NVIDIA GPUs for instance are optimized for a local work size of 128, so you should try...
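When an explicit local work size is passed, OpenCL 1.x requires the global work size to be a multiple of it, so the global size is usually rounded up. A small sketch of that rounding follows; the helper name is hypothetical, and 128 is simply the figure quoted above.

```c
#include <stddef.h>

/* Round the number of work-items `n` up to the next multiple of the
   local work size `local` (which must be non-zero). */
size_t round_up_global(size_t n, size_t local)
{
    return ((n + local - 1) / local) * local;
}
```

With padding like this, the kernel needs a guard such as `if (get_global_id(0) >= n) return;` so the extra work-items do not touch memory out of range.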
  8. Replies: 7, Views: 804

    The CL_MEM_READ_WRITE flag will create a buffer in...

    The CL_MEM_READ_WRITE flag will create a buffer in device memory. CL_MEM_HOST_NO_ACCESS is just an optional hint.
  9. Replies: 7, Views: 804

    Just use clCreateBuffer() with the CL_MEM_READ_WRITE...

    Just use clCreateBuffer() with the CL_MEM_READ_WRITE flag. You can also add the hint flag CL_MEM_HOST_NO_ACCESS if your device supports OpenCL 1.2.
  10. Replies: 5, Views: 583

    Note that buffers use the endianness of the...

    Note that buffers use the endianness of the device, so a buffer should be read or written taking this into account.

    You can change this behavior with __attribute__((endian(host))) to declare that...
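If host and device endianness differ, the host can swap bytes itself before writing or after reading. A minimal sketch in plain C, with hypothetical helper names; in a real program the device side of the comparison would come from clGetDeviceInfo() with CL_DEVICE_ENDIAN_LITTLE.

```c
#include <stdint.h>

/* Return 1 on a little-endian host, 0 on a big-endian one, by looking
   at the first byte of a known 16-bit value. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Reverse the byte order of a 32-bit value. */
uint32_t bswap32(uint32_t v)
{
    return  (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
            (v << 24);
}
```

Each 32-bit element would be passed through `bswap32()` only when the two sides disagree; when they match, the buffer can be copied as-is.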
  11. Replies: 32, Views: 14,661

    Sticky: Illegal cast in Appendix B - Portability

    The example at the bottom of page 363 in appendix B uses illegal casts:



    float4 v = vload4( 0, x );
    uint4 y = (uint4) v; // legal, portable
    ushort8 z = (ushort8) v; // legal, not portable

    ...
  12. Replies: 3, Views: 704

    You have to install an OpenCL driver for a...

    You have to install an OpenCL driver for a supported device. Since you have an Intel CPU but no GPU, you should install the Intel OpenCL driver instead.
  13. Replies: 5, Views: 583

    One of the jobs of the OpenCL runtime is to...

    One of the jobs of the OpenCL runtime is to marshal data between host and device transparently. Alignment and packing are defined in the OpenCL specification and are compatible with standard C usage...
  14. Replies: 7, Views: 804

    "Recent GPU" probably means less than 10 years old...

    "Recent GPU" probably means less than 10 years old here...

    Gather means that the GPU can do random-access loads, while scatter means that the GPU can do random-access stores.

    It dates from the...
  15. You should read the section "3.1.1 Platform Mixed...

    You should read the section "3.1.1 Platform Mixed Version Support" in the OpenCL Specification.

    1. There are three kinds of version:

    - Platform version: this gives the version of the OpenCL API...
  16. Your kernel and the global and local ranges look...

    Your kernel and the global and local ranges look fine.

    What do you mean by "not working"? Does OpenCL return an error, or are the results numerically different from expected?
  17. Replies: 5, Views: 583

    If you have only one instance of the struct to...

    If you have only one instance of the struct to send to the kernel, you can pass it by copy:

    __kernel void ker(..., struct Params test)
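On the host side, passing a struct by copy means handing its size and address to clSetKernelArg(), so the host layout has to match what the kernel expects. The sketch below only shows the layout part in plain C. The fields of `struct Params` are hypothetical (the original post does not list them), and standard fixed-width types stand in for `cl_int` / `cl_float` so the fragment compiles without the OpenCL headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter block; the field names are illustrative only.
   The kernel-side struct would declare int, int, float in the same order. */
struct Params {
    int32_t width;
    int32_t height;
    float   scale;
};

/* Size that would be passed as the arg_size of clSetKernelArg(). */
size_t params_arg_size(void)
{
    return sizeof(struct Params);
}
```

Checking `sizeof` and `offsetof` on the host is a cheap way to catch a layout mismatch before it shows up as wrong values inside the kernel.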
  18. If the buffer is allocated in device memory,...

    If the buffer is allocated in device memory, there is no such thing as a pointer to host memory.

    Even when the buffer is allocated in host memory, if it resides in pinned memory, a call to...
  19. The constructor of Buffer allocates memory....

    The constructor of Buffer allocates memory. enqueueMapBuffer doesn't.

    enqueueMapBuffer maps the buffer object into host memory. Calling it several times will not allocate several buffers.
    ...
  20. Replies: 10, Views: 1,097

    In Java, a char is 2 bytes long, so you should...

    In Java, a char is 2 bytes long, so you should replace cl_char by cl_ushort in your Java code, and char by ushort in your kernel.

    Furthermore, the barrier is unnecessary in your kernel.
    ...
  21. Replies: 10, Views: 1,097

    To fully understand your problem, you should...

    To fully understand your problem, you should explain:

    - how you create your output buffer
    - how you read data from your output buffer after the execution of the kernel
    - how you define...
  22. Replies: 10, Views: 1,097

    You should give more information, such as the way...

    You should give more information, such as the way you allocate and fill your buffers, and the work sizes of clEnqueueNDRangeKernel().
  23. Replies: 4, Views: 735

    This is quite a complex matter. An NVIDIA...

    This is quite a complex matter.

    An NVIDIA multiprocessor is the hardware unit which corresponds to an OpenCL compute unit. Each multiprocessor can independently run concurrent threads.

    In...
  24. Replies: 10, Views: 1,087

    Note that to take advantage of "zero-copy...

    Note that to take advantage of "zero-copy sharing" with CL_MEM_USE_HOST_PTR on integrated GPU devices, you generally have to use pointers aligned to CL_DEVICE_MEM_BASE_ADDR_ALIGN.
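A sketch of allocating such a pointer in plain C11. CL_DEVICE_MEM_BASE_ADDR_ALIGN is reported in bits, so it is converted to bytes first. The helper name is hypothetical, and the alignment value is assumed to have been queried beforehand with clGetDeviceInfo() and to be a power of two.

```c
#include <stdlib.h>

/* Allocate `size` bytes aligned for use with CL_MEM_USE_HOST_PTR.
   align_bits: the value reported for CL_DEVICE_MEM_BASE_ADDR_ALIGN (bits).
   The size is padded to a multiple of the alignment, as aligned_alloc()
   requires. Release the result with free(). */
void *alloc_for_use_host_ptr(size_t size, size_t align_bits)
{
    size_t align = align_bits / 8;          /* bits -> bytes */
    if (align < sizeof(void *))
        align = sizeof(void *);

    size_t padded = (size + align - 1) / align * align;
    return aligned_alloc(align, padded);
}
```

Whether the driver then actually avoids a copy still depends on the implementation; the alignment is a precondition, not a guarantee.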
  25. Replies: 10, Views: 1,087

    It depends on the latitude you have with...

    It depends on the latitude you have with pointers. If you write a library, you may have to accept pointers to host memory as arguments. Then CL_MEM_USE_HOST_PTR could be the only choice you have.
    ...
Results 1 to 25 of 106