Search:

Type: Posts; User: neFAST

  1. Re: Optimisation tips for fetch intensive kernel on ATI

    My 7850 (Pitcairn) is using GCN, right?
  2. Re: Optimisation tips for fetch intensive kernel on ATI

    I run a few different work sizes and take the best score.
    http://i.imgur.com/PMbxY.png
  3. Re: Optimisation tips for fetch intensive kernel on ATI

    I packed my fetches in groups of 8 and saw a 50% increase in performance.
    However, if I pack 12 or 16 of them, it's slower.

    for( int l=0; l<loclsize/2; l+=8 )
    {
    ...
  4. Re: Optimisation tips for fetch intensive kernel on ATI

    Here is the result of the profiler.
    I can't get the profiler to output occupancy. The option is checked, but the column is not present in the results table!

    http://i.imgur.com/ZW83v.png
  5. Re: Optimisation tips for fetch intensive kernel on ATI

    Thanks for your answer.
    I will definitely run the profiler asap.
    I'm still compiling with the CUDA SDK. Do I need to compile with the AMD SDK if I want to use the profiler?

    Regarding fetch/ALU switches, is...
  6. Re: Optimisation tips for fetch intensive kernel on ATI

    Thanks for your answer.
    So what would be your explanation in terms of hardware differences? Do the NVIDIA cards take fewer cycles per global read? Or is their cache system better?
  7. Optimisation tips for fetch intensive kernel on ATI

    Dear OpenCL users, I recently ported a kernel from CUDA to OpenCL.
    This kernel processes a 2D image (~512) and, for each pixel, fetches ~8000 coordinates from global memory.
    Then for each pixel it will...
Results 1 to 7 of 8