Type: Posts; User: MaximS

  1. Replies
    6
    Views
    1,037

    Can you shrink your code to the simplest case...

    Can you shrink your code to the simplest case where the error still pops up? This will help us, and especially you, find the source of the error much more quickly.
  2. This is how I did it: #define nx ...

    This is how I did it:



    #define nx get_global_id(0)
    #define ny get_global_id(1)
    #define nz get_global_id(2)
    #define Nx get_global_size(0)...
  3. How to avoid unnecessary memory copying if using a CPU as device?

    I'm planning a program which should be runnable on GPU and CPU. The time-critical operation is a large FFT (up to 1 million data points). I want the program to recognize which device type is currently...
  4. An update to the newest driver version didn't help....

    An update to the newest driver version didn't help. I'm experiencing the same problem on another machine with a GTX 285.
    Probably Nvidia is neglecting OpenCL support for older cards. That's sad ...
  5. Example code

    I've added example code which reproduces the behavior. Can somebody try the code on a GTX 480 please?
  6. clBuildProgram returns CL_INVALID_BINARY for double data types on a GTX 480

    Today I tried to compile my program on a CUDA machine:

    * platform: NVIDIA CUDA [0]
    * version: OpenCL 1.1 CUDA 4.2.1
    * device: GeForce GTX...
  7. Replies
    1
    Views
    1,177

    What is vstoren for?

    Are there any advantages compared to the usual way of storing variables?
  8. Replies
    2
    Views
    1,627

    Re: fast buffer initialization with zeros

    Thanks for the answer. Option a is the best, I think.
  9. Replies
    2
    Views
    1,627

    fast buffer initialization with zeros

    What is the fastest way to initialize a buffer with zeros? I need a buffer which is 1/8 filled with data and 7/8 with zeros. The data is a copy from another buffer. The buffers can be several 100MB...
  10. Re: Pass scalars to kernel and avoid creating CLBuffer each

    Ok, I found a solution and it didn't speed up anything. :)

    Anyway: to pass a single variable instead of a whole array when using pyopencl, one has to use the numpy datatypes. Example:


    __kernel...
  11. Pass scalars to kernel and avoid creating CLBuffer each time

    I have to execute a kernel many times where the only change is that two buffers are exchanged and one is updated before the call. For example, the first call would be


    myContext.program.myKernel(
    ...
  12. Intel CPU performs better with AMDAPP than Intel OpenCL SDK

    Hi all.

    I just compared the Intel implementation against that from AMD and it turned out that my code runs faster with the AMDAPP SDK, even though I have Intel CPUs (PC and Laptop). The Intel...
  13. Re: A simple technical question about memory access.

    You're right. It doesn't work like I thought. Thanks.
  14. Re: A simple technical question about memory access.

    Actually it does. I thought OpenCL takes care of synchronization in such cases. Am I wrong?
  15. A simple technical question about memory access.

    Hi all,

    let's assume the following kernel:



    #define nx (signed)get_global_id(0)
    #define ny (signed)get_global_id(1)
    #define Nx (signed)get_global_size(0)
    #define Ny ...
  16. Replies
    20
    Views
    3,963

    Re: How to synchronize iterations?

    Hm, okay. But have I understood correctly that, inside a kernel, the work items of a work group of not more than 1024x1024x64 (this is MAX_WORK_ITEM_SIZES) can be synchronized?

    edit: The code is not...
  17. Replies
    20
    Views
    3,963

    Re: How to synchronize iterations?

    Well, I was wrong. Actually, I do use the if statement three times. But I think that I use it in a proper way.



    #define Nx 0
    #define Ny 1
    #define Nz 2

    __kernel void...
  18. Replies
    20
    Views
    3,963

    Re: How to synchronize iterations?

    Thanks for the link. In the current kernel version I don't use if, switch, do, for or while at all.
  19. Replies
    20
    Views
    3,963

    Re: How to synchronize iterations?

    Yes, I have. However, the code is not finished, and it may be that further changes will affect the problem in such a way that the overhead becomes negligible or even more significant.

    What...
  20. Replies
    20
    Views
    3,963

    Re: How to synchronize iterations?

    "Not especially. Just minimise the amount of data transfer you do at each iteration and do as few kernel calls as possible ;)"
    Yeah, I was expecting an answer like this. ;)
    Is anybody...
  21. Replies
    20
    Views
    3,963

    Re: How to synchronize iterations?

    Is there a way to minimize the overhead when calling the same kernel many times?
  22. Replies
    20
    Views
    3,963

    Re: How to synchronize iterations?

    Message deleted.
  23. Replies
    20
    Views
    3,963

    Re: How to synchronize iterations?

    How much overhead is there if I move the iteration loop out of the kernel and put the kernel calls in a host-side iteration loop? I tried this and the performance dropped so much that it makes no...
  24. Replies
    20
    Views
    3,963

    Re: How to synchronize iterations?

    OK, I've read section 3. Now I have a question about the synchronization. Currently I'm using the AMD APP SDK and an Intel Core 2, but in the near future I will switch to an Nvidia GTX 560. The device info...
  25. Replies
    20
    Views
    3,963

    Re: How to synchronize iterations?

    Thanks a lot for the hints!
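
A note on result 10 above, which says that pyopencl wants numpy datatypes for scalar kernel arguments. The snippet there is cut off, so what follows is only a minimal sketch of why that helps, not the poster's code. A kernel parameter such as `int` or `float` has a fixed width, and a numpy scalar like `np.int32` carries exactly that many bytes, while a plain Python `int` has no fixed width. The helper `scalar_arg_bytes` and every name in the commented-out call (`program`, `my_kernel`, `queue`, `global_size`, `data_buf`) are hypothetical and chosen for illustration.

```python
import numpy as np


def scalar_arg_bytes(value):
    """Return the raw bytes of a fixed-width numpy scalar.

    Raises TypeError for plain Python numbers, which carry no fixed width.
    """
    if not isinstance(value, np.generic):
        raise TypeError("expected a numpy scalar such as np.int32 or np.float32")
    return value.tobytes()


n = np.int32(1024)    # same width as an OpenCL `int` parameter (32 bits)
dt = np.float32(0.5)  # same width as an OpenCL `float` parameter (32 bits)

print(len(scalar_arg_bytes(n)), len(scalar_arg_bytes(dt)))  # prints: 4 4

# Hypothetical pyopencl call, left as a comment because it needs a device:
# program.my_kernel(queue, global_size, None, data_buf, n, dt)
```

With scalars passed like this, only the buffers need to exist as device objects. The scalars travel with each kernel call, so no extra buffer has to be created for them.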
Results 1 to 25 of 41