Type: Posts; User: Dithermaster

Page 1 of 8 1 2 3 4

  1. Is there even a use case for pipes on CPU or GPU...

    Replies: 4, Views: 39

    Is there even a use case for pipes on CPU or GPU devices (that is more efficient or less code than just using global memory or images between kernels), or do they exist just for FPGA devices?
  2. To clarify, never install OpenCL.dll in the...

    To clarify, never install OpenCL.dll in the system directory. If you install it in your application folder or bundle, that's less of a system-wrecking technical issue. Legally you might not have the...
  3. Do (2). It works great. Never do (4) or (5),...

    Do (2). It works great.

    Never do (4) or (5), only drivers should install the ICD and you could hurt other applications if you do it wrong.
  4. I've heard that pipes benefit OpenCL on FPGA...

    Replies: 5, Views: 201

    I've heard that pipes benefit OpenCL on FPGA since they fit better into the pipelined hardware nature of those devices. They also seem to have some benefit in dynamic parallelism for...
  5. My understanding is that OpenCL 1.2 is available...

    Replies: 2, Views: 200

    My understanding is that OpenCL 1.2 is available for Tegra but you have to contact NVIDIA to get it.
  6. I've seen 1024,1,1 only for the Apple CPU device,...

    I've seen 1024,1,1 only for the Apple CPU device, so I agree with your guess that it was that device. Switch to the GPU device for better dimensions.
  7. Because the runtime may choose to run some...

    Replies: 1, Views: 352

    Because the runtime may choose to run some workgroups to completion before starting others (when the number of workgroups far exceeds the hardware capabilities) there are therefore no global...
  8. On Windows, OpenCL.dll _is_ the ICD, but you...

    On Windows, OpenCL.dll _is_ the ICD, but you still don't want to ship it. It varies by version, for one thing (what if you ship a version 1.2 one, but the vendor driver updated the system ICD...
  9. You do NOT want to ship this DLL with your...

    You do NOT want to ship this DLL with your project. The one installed on the system is the one you want to use. What problem are you trying to solve?
  10. clFlush can certainly block the CPU; it won't...

    Replies: 2, Views: 562

    clFlush can certainly block the CPU; it won't return until the command queue has completely been flushed to the hardware, and if the hardware queue is full, the CPU will block.

    Except for CL/GL...
  11. I answered on SO (before I saw this).

    I answered on SO (before I saw this).
  12. That kernel looks like it was code-generated, not...

    That kernel looks like it was code-generated, not hand coded. In any case, one source of slowdown is that each work item reads 16 doubles from global memory. While they can be broadcast within each...
  13. The API is designed to be async -- all of the...

    The API is designed to be async -- all of the clEnqueue calls are designed to return quickly. The OpenCL driver uses a separate thread to push work to the GPU. So once you've queued up work to the...
  14. If you have an OpenCL driver for CPU installed...

    If you have an OpenCL driver for CPU installed then CL_DEVICE_TYPE_CPU devices appear, so yes, it is a useful flag to have.

    You might, for example, try for a GPU device, and only if one is not...
  15. clBuildProgram is _required_ regardless of...

    Replies: 1, Views: 690

    clBuildProgram is _required_ regardless of whether you created the program using clCreateProgramWithSource or clCreateProgramWithBinary. It will be faster with binaries.
  16. OpenCL C is based on C99, so if it is ill-defined...

    Replies: 3, Views: 1,909

    OpenCL C is based on C99, so if it is ill-defined in C99, it's ill-defined in OpenCL C.
  17. No such limitation. You can do multiple reads and...

    Replies: 3, Views: 1,909

    No such limitation. You can do multiple reads and writes to global memory from within a kernel. You should go back and ask your past self what they meant in the comment.
  18. My cursory understanding is that it's up to the...

    My cursory understanding is that it's up to the vendor's driver and how it's implemented. From what I'm reading above, AMD's driver supports it. I think NVIDIA Tesla cards run in the non-graphics mode...
  19. > 256 is the work group size and 700 is the...

    > 256 is the work group size and 700 is the global size so it is evenly divisible.
    Um, no it's not. 256 goes into 768 but not 700.
    The common solution is to "round up" the global size to be an...
  20. You have old knowledge. Intel and AMD are both...

    You have old knowledge. Intel and AMD are both shipping OpenCL 2.0 drivers.

    Intel: https://software.intel.com/en-us/articles/opencl-drivers (2014 r2 is OpenCL 2.0)

    AMD:...
  21. Does the device report slightly less local memory...

    Does the device report slightly less local memory for CL_DEVICE_LOCAL_MEM_SIZE when you're running the r340.xx driver? It had better!

    I did notice a while back that some older NVIDIA OpenCL 1.0...
  22. OpenCL 2.0 adds support for images with the...

    OpenCL 2.0 adds support for images with the read_write qualifier. It is not possible in OpenCL 1.x, you'll need to use two different images. Note: It might just be faster that way anyway.
  23. OpenCL 1.x supports 2D and 3D images and OpenCL...

    OpenCL 1.x supports 2D and 3D images and OpenCL 1.2 adds 1D images, and clEnqueueNDRangeKernel supports 1D, 2D, and 3D workgroups. Of course all of these ultimately map to linear memory, so it's just...
  24. Do you have any constants with a decimal point...

    Do you have any constants with a decimal point and no "f" on the end? Those are doubles, and anything they do math with will get promoted to a double.
  25. FPGA would be more applicable to a vertical...

    Replies: 3, Views: 1,508

    FPGA would be more applicable to a vertical market solution (where FPGAs have typically been used). OpenCL is now an alternative programming environment that may be more productive than learning other FPGA...
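
The "round up" arithmetic from result 19 (700 work items, work group size 256) can be sketched in plain C. This is only an illustration: `round_up_global_size` is a hypothetical helper name, not part of the OpenCL API, and the post itself is cut off before it says what the rounded size should be a multiple of. I'm assuming it is the work group size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL call): round global_size up to the
   next multiple of local_size so the work group size divides it evenly. */
size_t round_up_global_size(size_t global_size, size_t local_size)
{
    assert(local_size > 0); /* a zero work group size makes no sense here */
    size_t remainder = global_size % local_size;
    if (remainder == 0) {
        return global_size; /* already evenly divisible */
    }
    return global_size + (local_size - remainder);
}
```

With the numbers from the post, `round_up_global_size(700, 256)` gives 768. The extra 68 work items would then typically need a bounds check inside the kernel so they don't touch memory past the real data; that last part is my assumption about where the truncated post was heading.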
Results 1 to 25 of 198
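
The promotion described in result 24 (a constant with a decimal point and no "f" is a double) can be seen in ordinary host-side C. This is a small sketch, not OpenCL C itself. It assumes the same usual arithmetic conversions apply there, which seems reasonable since OpenCL C is based on C99. The function names are made up for the illustration.

```c
#include <stddef.h>

/* sizeof looks only at the type of the expression; nothing is evaluated. */

/* 1.5 has type double, so float * double is promoted to double. */
size_t size_of_product_with_double_literal(float x)
{
    return sizeof(x * 1.5);
}

/* 1.5f has type float, so float * float stays float. */
size_t size_of_product_with_float_literal(float x)
{
    return sizeof(x * 1.5f);
}
```

The first function reports `sizeof(double)` and the second `sizeof(float)`, which is the same effect that can quietly pull double-precision math into a kernel meant to be all float.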