Search:

Type: Posts; User: boxerab

  1. Bank conflicts in 2D kernel

    Replies: 0
    Views: 813

    Suppose our hardware has 32 banks, each 4 bytes wide, and we have a 1D kernel
    of size 32 and a local 1D array of ints.

    Then, ensuring that each consecutive thread accesses consecutive
    memory...
  2. Thread: GPU vs FPGA

    by boxerab
    Replies: 3
    Views: 1,567

    Thanks, Dithermaster. One interesting development: Intel is planning a Xeon chip with integrated FPGA. Should be interesting.
  3. How to optimize kernel with mixture of parallel and serial code?

    I have a kernel that performs two tasks (A followed by B) - the first is quite parallel, and the second task cannot be parallelized.

    Task A is performed by all work items, and task B is only...
  4. Thread: GPU vs FPGA

    by boxerab
    Replies: 3
    Views: 1,567

    GPU vs FPGA

    So far, I have only been thinking of GPU platforms when developing my kernel.
    But I just learned that the two largest FPGA manufacturers, Xilinx and Altera,
    now have OpenCL SDKs.

    Can anyone...
  5. Thanks, Dithermaster. Makes sense.
  6. Impact of PCI bus speed on OpenCL performance

    PCIe 4 is expected in 2016. Can anyone comment on the impact this will have
    on GPGPU performance? For gaming, I have read that PCIe 3 has about the same
    performance as PCIe 2.
    Thanks.
  7. Tried this out on HD 7700 series GPU: best performance was from individual loads, not vloadn.
  8. Thanks kunze. Now, what about bank conflicts? If work item one issues memory reads from address 0 to address 4, and
    the next work item reads from address 1 to address 5, then the individual reads...
  9. vload4 vs four buffer accesses for local memory buffer

    Does vload4 have any advantage over four individual buffer accesses for a local memory buffer?

    i.e.

    ////////////////////////////////////////////////////////////
    __local int FOO[256];

    // case...
  10. Sticky: Would like to target system with Intel CPU and AMD GPU

    Replies: 25
    Views: 9,627

    Hello List,
    I would like to be able to load-balance my algorithm onto both an Intel CPU and an AMD GPU
    at the same time.

    Now, the Intel SDK supports Intel hardware, and the AMD SDK supports AMD hardware.
    ...
Results 1 to 10 of 10
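
The bank-conflict setup from result 1 (32 banks, 4 bytes wide, 32 work items reading a local array of ints) can be checked on the host with a small sketch. This is plain C, not OpenCL, and it rests on assumptions the posts do not state: consecutive 4-byte words are assumed to map to consecutive banks, broadcast of identical addresses is not modeled, and the names `bank_of` and `max_conflict_degree` are hypothetical helpers made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Figures taken from the post: 32 banks, 4 bytes wide, 32 work items. */
enum { NUM_BANKS = 32, BANK_WIDTH_BYTES = 4, WORK_ITEMS = 32 };

/* Assumption: consecutive 4-byte words live in consecutive banks,
   wrapping around after NUM_BANKS words. */
static int bank_of(size_t byte_address) {
    return (int)((byte_address / BANK_WIDTH_BYTES) % NUM_BANKS);
}

/* Work item i reads the 32-bit element at index i * stride_in_ints.
   Returns the largest number of work items that land in one bank.
   1 means conflict-free; N means an N-way conflict under this model.
   Reads of the very same address (broadcast) are not treated specially. */
static int max_conflict_degree(size_t stride_in_ints) {
    int hits[NUM_BANKS] = {0};
    int worst = 0;
    for (size_t i = 0; i < WORK_ITEMS; ++i) {
        int bank = bank_of(i * stride_in_ints * sizeof(int32_t));
        hits[bank] += 1;
        if (hits[bank] > worst) {
            worst = hits[bank];
        }
    }
    return worst;
}
```

Under this model a stride of 1 (consecutive work items reading consecutive ints) touches each bank once, a stride of 2 gives a 2-way conflict, and a stride of 32 sends every work item to bank 0. Padding the row length to 33 ints brings the worst case back to 1, which is the usual reason for padding 2D local arrays.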