
Thread: What is faster?


  1. #1
    Newbie
    Join Date
    Mar 2014
    Posts
    3

    What is faster?

    Hi OpenCL community,

    I'd like your opinion on which of the following two approaches is faster.

    context: image filtering (convolution)

    load from constant global variable (image vector) to private
    process data in private memory
    write in global

    or

    load from constant global variable to local memory
    barrier so the whole work-group waits until local memory is filled
    process data from local
    write result in global

    I know that loading from global memory should be much slower, and that in the first option every work-item loads the same data over and over, but the processing is done in private memory, which is much faster. On the other hand, I don't know whether waiting at the barrier will hurt performance, and I also don't know the rough ratio between the read/write speeds of global and local memory.

    I would appreciate it if anyone could answer.

    Thanks

    LC

  2. #2

    manual caching

    The second option is what I call "manual caching". So when the accelerator has no cache, the second one runs faster. When the GPU has enough cache, it won't really matter. I haven't tested this in a while, so I'm not sure it still holds, but on the CPU the second version runs slower.

    In most cases I have found other reasons for using local memory to be more important.
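
    To get a rough feel for what the manual caching saves, here is a plain C sketch (not OpenCL, and not code from this thread) that emulates both options for a 1D filter on the CPU and counts reads from "global" memory. RADIUS, GROUP, the clamp-to-edge border handling, and all function names are assumptions made up for illustration. It only counts memory traffic; it says nothing about actual timing on a given device.

```c
#include <assert.h>

#define RADIUS 2                  /* filter half-width (assumed) */
#define TAPS   (2 * RADIUS + 1)   /* number of filter coefficients */
#define GROUP  8                  /* emulated work-group size (assumed) */

/* Clamp-to-edge read from "global" memory; every call is counted. */
static float load_global(const float *img, int n, int i, long *reads)
{
    if (i < 0) i = 0;
    if (i >= n) i = n - 1;
    ++*reads;
    return img[i];
}

/* Option 1: each work-item reads its whole neighbourhood from global
 * memory and accumulates in a private variable. Returns the read count. */
long convolve_direct(const float *img, const float *w, float *out, int n)
{
    long reads = 0;
    for (int gid = 0; gid < n; ++gid) {
        float acc = 0.0f;  /* private accumulator */
        for (int k = -RADIUS; k <= RADIUS; ++k)
            acc += w[k + RADIUS] * load_global(img, n, gid + k, &reads);
        out[gid] = acc;
    }
    return reads;
}

/* Option 2: each work-group first stages its tile plus a halo of RADIUS
 * elements on each side in "local" memory, then filters from the tile. */
long convolve_tiled(const float *img, const float *w, float *out, int n)
{
    long reads = 0;
    assert(n > 0 && n % GROUP == 0);  /* sketch only handles whole groups */
    for (int base = 0; base < n; base += GROUP) {
        float tile[GROUP + 2 * RADIUS];  /* stands in for __local memory */

        /* Load phase: in a kernel the work-items would share this loop. */
        for (int t = 0; t < GROUP + 2 * RADIUS; ++t)
            tile[t] = load_global(img, n, base - RADIUS + t, &reads);

        /* A real kernel needs barrier(CLK_LOCAL_MEM_FENCE) at this point. */

        for (int lid = 0; lid < GROUP; ++lid) {
            float acc = 0.0f;
            for (int k = -RADIUS; k <= RADIUS; ++k)
                acc += w[k + RADIUS] * tile[lid + RADIUS + k];
            out[base + lid] = acc;
        }
    }
    return reads;
}
```

    Under these assumptions, with n = 32 the direct version does 32 * 5 = 160 global reads and the tiled version 4 * (8 + 4) = 48, so about 3.3x less global traffic for the same output. Whether that becomes a real speedup depends on whether the hardware cache was already absorbing the repeated reads.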

  3. #3
    Newbie
    Join Date
    Mar 2014
    Posts
    3
    Quote Originally Posted by VincentH View Post
    The second option is what I call "manual caching". So when the accelerator has no cache, the second one runs faster. When the GPU has enough cache, it won't really matter. I haven't tested this in a while, so I'm not sure it still holds, but on the CPU the second version runs slower.

    In most cases I have found other reasons for using local memory to be more important.
    Thanks VincentH

    LC
