Search: Type: Posts; User: linyufly
Page 1 of 2

  1. Re: Is floating-point arithmetic in OpenCL stochastic?

    I see. Thanks!



    For example, if you add a random sequence:
    2^-30, 2^30, 2^-30, 2^30 ...
    you'll end up getting the same as:
    0, 2^30, 0, 2^30 ...

    because the mantissa of a float isn't big...
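The rounding effect described in this reply can be reproduced outside OpenCL. Here is a minimal sketch in plain Python (an illustration, not OpenCL code): Python floats are 64-bit doubles with a 53-bit significand, so 1.0 and 2^53 play the roles that 2^-30 and 2^30 play for 32-bit floats.

```python
# Python floats are doubles; near 2^53 the gap between adjacent
# representable values (the ulp) is 2.0, so adding 1.0 rounds away.
huge = 2.0 ** 53

print(huge + 1.0 == huge)   # True: the 1.0 vanishes entirely

# Summation order therefore changes the result:
a = (1.0 + 1.0) + huge      # small terms combine first -> huge + 2.0
b = (1.0 + huge) + 1.0      # each small term vanishes  -> huge
print(a == b)               # False
```

This is exactly why reordering additions across work-items can change a reduction's result even though every individual operation is deterministic.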
  2. Re: Is floating-point arithmetic in OpenCL stochastic?

    Thanks LeeHowes!

    However, why does the execution order of the work-items influence the result of the operation?

    Thanks!
  3. Re: Is floating-point arithmetic in OpenCL stochastic?

    Hi notzed,

    Is it possible that one processor in my card malfunctions?

    Thanks!
  4. Re: Is floating-point arithmetic in OpenCL stochastic?

    Thanks Notzed. What are boundary over-runs?

    Best regards,
    Mingcheng
  5. Is floating-point arithmetic in OpenCL stochastic?

    Hi guys,

    I have some floating-point calculation code, and sometimes it gives results that differ from what it produces in most runs.

    Sometimes I use "-cl-opt-disable" and sometimes I just use "" for...
  6. Re: When I want to use atomic_add, what should I do?

    Mine is OpenCL 1.1; is it also supposed to include atomics without the #pragma?

    Thanks!

  7. Re: When I want to use atomic_add, what should I do?

    I was confused because even if I use #pragma OPENCL EXTENSION cl_khr_int32_base_atomics : disable or #pragma OPENCL EXTENSION asdfsadfads : require, it still compiles.... And I seem to receive some...
  8. When I want to use atomic_add, what should I do?

    Hi guys,

    (1) I should declare the array to be volatile, right?
    (2) I should put #pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable in the top of the *.cl file, right?

    What else...
  9. A very confusing problem of kernels with "-cl-opt-disable"

    Hi guys,

    That problem has tortured me for months.

    My kernel has a lot of floating-point calculations and it should give the same answer all the time. However, about 2 out of 10 runs will give...
  10. Re: Do I need atomic_add in that case? (Replies: 6, Views: 2,314)

    Hi ofer_rose,

    Thanks!

    I should have described my application directly. There are N properties, 0..N-1, and M objects. I have an int array A[0..N-1] for the properties, and I use one thread for...
  11. Re: Do I need atomic_add in that case? (Replies: 6, Views: 2,314)

    No, you have to use some atomic mechanism to sync the reads and writes to histo[].

    The operation of "histo[arr[id]]++;" (or any increment of global memory) is actually composed of three...
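The three-step decomposition described above can be made concrete with a deterministic sketch. This is plain Python with a hand-scheduled interleaving (an illustration of the race only; real GPU scheduling is nondeterministic):

```python
# Two "work-items" both execute histo[k]++ as read -> add -> write.
# Pausing each one between the add and the write forces the bad
# interleaving, and one increment is lost.

histo = {0: 0}

def increment_steps(histo, key):
    """Yield between the machine-level steps of histo[key]++."""
    tmp = histo[key]      # step 1: read global memory
    tmp = tmp + 1         # step 2: add in a register
    yield                 # (pause so the other work-item can interleave)
    histo[key] = tmp      # step 3: write back

a = increment_steps(histo, 0)
b = increment_steps(histo, 0)
next(a)                   # work-item A reads 0, computes 1
next(b)                   # work-item B also reads 0, computes 1

for g in (a, b):          # both now write back their stale value 1
    try:
        next(g)
    except StopIteration:
        pass

print(histo[0])           # 1, not 2: one increment was lost
```

An atomic increment fuses the three steps into one indivisible operation, which is exactly what atomic_add/atomic_inc provide.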
  12. Re: Do I need atomic_add in that case? (Replies: 6, Views: 2,314)

    Thanks ibbles!

    Is it correct not to use atomic_add in this case?
  13. Do I need atomic_add in that case? (Replies: 6, Views: 2,314)

    Hi,

    For example, given an integer array where every element is within [0..255], I want to get the number of repetitions of each value in [0..255].

    __kernel void Histogram(__global int *arr,...
  14. Re: Is there any OpenCL library having Prefix Sum?

    I am learning it right now.

    In what order do I call those kernels?

    Thanks again!
  15. Re: Is there any OpenCL library having Prefix Sum?

    Oh, thank you very much!
  16. Is there any OpenCL library having Prefix Sum?

    Hi,

    I need a Prefix Sum implementation. If I implement it myself, I cannot be sure it is the best implementation available.

    Is there any OpenCL library for that?

    Thanks!
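For reference, most OpenCL prefix-sum libraries implement the Blelloch work-efficient scan, which also answers the kernel-ordering question asked in this thread: all up-sweep (reduce) passes run first, then all down-sweep passes. A plain-Python sketch of the algorithm (an illustration, not library code; it assumes a power-of-two input length, and each pass of the two loops corresponds to one kernel launch):

```python
def exclusive_scan(values):
    """Blelloch exclusive prefix sum; len(values) must be a power of two."""
    data = list(values)
    n = len(data)

    # Up-sweep: build partial sums in a balanced-tree pattern.
    d = 1
    while d < n:
        for i in range(0, n, 2 * d):
            data[i + 2 * d - 1] += data[i + d - 1]
        d *= 2

    # Down-sweep: clear the root, then push prefixes back down the tree.
    data[n - 1] = 0
    d = n // 2
    while d >= 1:
        for i in range(0, n, 2 * d):
            t = data[i + d - 1]
            data[i + d - 1] = data[i + 2 * d - 1]
            data[i + 2 * d - 1] += t
        d //= 2
    return data

print(exclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
# [0, 3, 4, 11, 11, 15, 16, 22]
```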
  17. Re: How much waste can the warp divergence bring?

    Thanks notzed!
  18. Re: How much waste can the warp divergence bring?

    Thanks krocki!

    Do you think those imposed operations are actually executed, or are they just skipped over quickly? For example, if it is a read operation, does a memory read really take place?

    Thanks...
  19. How much waste can the warp divergence bring?

    Hi,

    I know that if, within one warp, the threads take different branches or different numbers of loop iterations, all the branches (or the maximum number of iterations) are executed for all of them.

    However, I am...
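The divergence cost asked about here can be modeled with per-lane predication: the warp executes both branch bodies, and a mask selects which result each lane keeps. A plain-Python sketch of this SIMT model (illustrative only; the branch condition and bodies here are made up):

```python
def warp_branch(values):
    """Each lane takes the 'then' path if its value is even, else 'else'."""
    mask = [v % 2 == 0 for v in values]

    # Pass 1: every lane executes the 'then' body,
    # even lanes whose result will be thrown away.
    then_results = [v // 2 for v in values]

    # Pass 2: every lane executes the 'else' body too.
    else_results = [3 * v + 1 for v in values]

    # Commit: the mask selects per lane, but the warp has already
    # paid for both passes -- that is the divergence cost.
    return [t if m else e for m, t, e in zip(mask, then_results, else_results)]

print(warp_branch([4, 7, 2, 9]))  # [2, 22, 1, 28]
```

In this model the time cost is the sum of both branch bodies, which is why divergent warps can run up to 2x slower for a two-way branch.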
  20. Re: I cannot use 16K shared memory in GTX9800 (Replies: 6, Views: 1,571)

    Oh I see. So that is why I cannot use all of the 16K memory for my local memory?

    Thanks!
  21. Re: I cannot use 16K shared memory in GTX9800 (Replies: 6, Views: 1,571)

    Thanks a lot!
  22. Re: I cannot use 16K shared memory in GTX9800 (Replies: 6, Views: 1,571)

    Oh really?!

    Is it true that when my shared memory usage is close to the limit, the execution will be slower because the L2 cache has only a little memory left?

    Usually, what percentage of...
  23. I cannot use 16K shared memory in GTX9800 (Replies: 6, Views: 1,571)

    Hi,

    It seems that I can only use 15K in such a 1.0 architecture.

    Is this normal, or is something wrong in my code?

    Thanks!
  24. Re: clEnqueueReadBuffer blocking always (Replies: 6, Views: 3,066)

    Hi Asgard,

    Thank you very much for such a detailed explanation!

    I have observed the same result as you have.

    Thanks again!

    Best regards,
    Mingcheng Chen
  25. Re: clEnqueueReadBuffer blocking always (Replies: 6, Views: 3,066)

    Hi Asgard,



    I have a silly question. Please do not mind if it is too silly.

    In your code you only enqueue commands; at what point do they start being issued?

    Why don't you need clFlush()...
Results 1 to 25 of 30