
Thread: Sharing host memory with clSetKernelArg!

  1. #11
    Junior Member, Join Date: May 2011, Posts: 24

    Re: Sharing host memory with clSetKernelArg!

    1.) Using only create/free buffer:

    platform[0]=AMD Accelerated Parallel Processing
    device[0]=Juniper
    end-start time 15.666632 usec

    device[1]=Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz
    end-start time 4.736423 usec

    platform[1]=NVIDIA CUDA
    device[0]=GeForce 8600 GT
    end-start time 8.015486 usec

    platform[2]=Intel(R) OpenCL
    device[0]=Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz
    end-start time 24.046458 usec

    2.) Create/free buffer and queue:

    platform[0]=AMD Accelerated Parallel Processing
    device[0]=Juniper
    end-start time 12.023229 usec

    device[1]=Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz
    end-start time 10.201528 usec

    platform[1]=NVIDIA CUDA
    device[0]=GeForce 8600 GT
    end-start time 13.844930 usec

    platform[2]=Intel(R) OpenCL
    device[0]=Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz
    end-start time 23.317777 usec

    3.) Create/free buffer and queue, plus map/unmap:

    platform[0]=AMD Accelerated Parallel Processing
    device[0]=Juniper
    end-start time 740.339427 usec

    device[1]=Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz
    end-start time 22.224756 usec

    platform[1]=NVIDIA CUDA
    device[0]=GeForce 8600 GT
    end-start time 9735.171989 usec

    platform[2]=Intel(R) OpenCL
    device[0]=Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz
    end-start time 101.650935 usec

    I moved the malloc outside of the loop and added 128-byte alignment. Thanks to your example, I also found some timing overhead in my own code. AMD does show a relatively low overhead for the CPU device, but it is still a lot more than a pointer copy or a call to clSetKernelArg. Anyhow, you did say that using the pointer without map/unmap for CPU device is fine. So I guess that solves the (overhead) problem. Usually when you copy data you need to wait for the queue to finish anyway.
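
For readers who want to reproduce the measurement setup, here is a minimal sketch of the two changes mentioned above: one 128-byte-aligned allocation made outside the timed loop, and a per-iteration average taken over that loop. It is not the code from this thread. The names `run_benchmark`, `per_iteration_work`, and `is_aligned`, the 1 MiB buffer size, and the use of `posix_memalign` and `clock_gettime` are all assumptions for illustration, and `per_iteration_work` is only a stand-in for the OpenCL create/map/unmap/release calls.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for posix_memalign and clock_gettime */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ALIGNMENT 128u        /* bytes, the alignment mentioned in the post */
#define BUF_BYTES (1u << 20)  /* assumed buffer size: 1 MiB */

/* Monotonic wall-clock time in microseconds. */
static double now_usec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e6 + (double)ts.tv_nsec / 1e3;
}

/* Returns 1 if the address p is a multiple of ALIGNMENT, else 0. */
int is_aligned(const void *p) {
    return ((uintptr_t)p % ALIGNMENT) == 0;
}

/* Hypothetical stand-in for the per-iteration OpenCL work (create buffer,
   map, unmap, release). Here it only writes one byte per 128-byte block. */
static void per_iteration_work(volatile unsigned char *buf, size_t bytes) {
    for (size_t i = 0; i < bytes; i += ALIGNMENT)
        buf[i] = (unsigned char)i;
}

/* Allocates the host buffer once, times `iters` iterations, and prints the
   average time per iteration. Returns 0 on success, 1 on failure. */
int run_benchmark(int iters) {
    void *buf = NULL;
    if (iters <= 0)
        return 1;

    /* Allocation happens once, outside the timed loop. */
    if (posix_memalign(&buf, ALIGNMENT, BUF_BYTES) != 0)
        return 1;
    assert(is_aligned(buf));

    double start = now_usec();
    for (int i = 0; i < iters; ++i)
        per_iteration_work((volatile unsigned char *)buf, BUF_BYTES);
    double end = now_usec();

    printf("average %f usec per iteration\n", (end - start) / iters);
    free(buf);
    return 0;
}
```

Calling `run_benchmark(1000)` from `main` prints one average. Keeping the allocation out of the loop means the figure reflects only the work being measured, which is the point of the change described above.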

    Thanks!
    Atmapuri

  2. #12

    Re: Sharing host memory with clSetKernelArg!

    "Anyhow, you did say that using the pointer without map/unmap for CPU device is fine."

    Actually I didn't say that. But good luck in your efforts!
