Type: Posts; User: cantallo

  1. Replies
    1
    Views
    914

    vector component duplication

    Hi,

    I need to build a vector B from vector A, possibly duplicating some of the elements (i.e. arbitrary sampling rather than a permutation).

    does

    float8 A,B;
    B=shuffle(A,(int...
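
    As far as the OpenCL C specification reads, `shuffle` takes an unsigned integer mask vector with as many elements as the result (so `uint8` for a `float8` result), and nothing in its definition requires the mask entries to be distinct, so repeating an index should duplicate that element. The sketch below emulates that behaviour in plain C so it can run without an OpenCL device; `shuffle8_emulated` is a hypothetical helper, not an OpenCL API. The corresponding kernel expression would be along the lines of `B = shuffle(A, (uint8)(0, 0, 1, 1, 2, 2, 3, 3));`.

    ```c
    #include <stddef.h>

    /* Hypothetical plain-C stand-in for OpenCL's shuffle() on a float8:
       out[i] = in[mask[i] & 7]. Only the low 3 bits of each mask entry
       are used, mirroring how shuffle treats an 8-element source. */
    void shuffle8_emulated(const float in[8], const unsigned mask[8], float out[8])
    {
        for (size_t i = 0; i < 8; ++i) {
            out[i] = in[mask[i] & 7u];
        }
    }
    ```

    With the mask {0, 0, 1, 1, 2, 2, 3, 3}, the first four elements of the input each appear twice in the output, which is sampling with repetition rather than a permutation.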
  2. Replies
    3
    Views
    3,551

    Re: CL_OUT_OF_RESOURCES on clEnqueueReadBuffer

    By invalid access, do you mean an invalid read or an invalid write?

    It seems to me that invalid reads in a kernel pose no problem to the kernel itself (it terminates correctly, with full profiling info...
  3. Replies
    8
    Views
    4,218

    Re: maximum pinned memory

    For Linux users,
    ulimit -l
    gives the current per-process limit on locked memory (in bash), in kilobytes.
    You may try to increase it with:
    ulimit -S -l 16384
    for example. But that way you are...
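
    The same per-process limit can also be read from inside a program. This is a minimal sketch, assuming a Linux system where `getrlimit` supports `RLIMIT_MEMLOCK`; the function name `locked_memory_limits` is made up for illustration. Note that `getrlimit` reports bytes, whereas `ulimit -l` prints kilobytes.

    ```c
    #include <sys/resource.h>

    /* Reads the soft and hard locked-memory limits of the calling process.
       Returns 0 on success, -1 if getrlimit() fails. Values are in bytes;
       RLIM_INFINITY means "unlimited". */
    int locked_memory_limits(rlim_t *soft, rlim_t *hard)
    {
        struct rlimit lim;
        if (getrlimit(RLIMIT_MEMLOCK, &lim) != 0) {
            return -1;
        }
        *soft = lim.rlim_cur;  /* the value that `ulimit -S -l` changes */
        *hard = lim.rlim_max;  /* ceiling for the soft limit */
        return 0;
    }
    ```

    A host program could call this before allocating pinned buffers and compare the soft limit with the amount of memory it intends to lock.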
  4. Replies
    8
    Views
    4,218

    Re: maximum pinned memory

    Just a comment on pinned memory usage: for my code (which reads a lot of input data), using pinned memory increased the read rate by a factor of nearly 3.

    However, I soon noticed that if anyone on the...
  5. Re: Huge circular buffer beyond CL_DEVICE_MAX_MEM_ALLOC_SIZE

    Okay, the code is operational (except for a few rare operating modes and one auxiliary output), and computing the address on the fly is definitely the fastest code (even more so for the row processing...
  6. Re: Huge circular buffer beyond CL_DEVICE_MAX_MEM_ALLOC_SIZE

    Okay, thanks for the 3rd point... __private float* would not have worked, and I would have spent days figuring out why... (it is the archetype of a horrible pitfall for beginners)

    I will stick to your...
  7. Re: Huge circular buffer beyond CL_DEVICE_MAX_MEM_ALLOC_SIZE

    Yes, perhaps I need to explain the whole issue in more detail.

    In fact, I process my data in overlapping rows; for example, with my 16 rows:

    I run the computation with rows 0 to 15 of my stream,

    then I...
  8. Re: Huge circular buffer beyond CL_DEVICE_MAX_MEM_ALLOC_SIZE

    Thank you for your fast answer!

    I think you understood my problem; I have a few questions, though:

    1st:
    The first lines
    a[0]=a0;
    a[1]=a1;
    a[2]=a2;
    a[3]=a3;
  9. Huge circular buffer beyond CL_DEVICE_MAX_MEM_ALLOC_SIZE

    Hi

    I am porting code to the GPU that uses a huge circular buffer (typically 3 GB) organized as rows (of typically 64k).
    In the CPU code it is implemented as an array of pointers, each pointing to a...
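
    One way to picture a row-organized ring that is too large for a single allocation is to split it into several equally sized chunks and compute, for each row, which chunk holds it. The sketch below is only an illustration under that assumption, not the approach settled on in the thread; `row_location`, `locate_row`, and all parameter names are hypothetical.

    ```c
    #include <stddef.h>

    /* Hypothetical layout: a ring of total_rows rows stored in several
       chunks of rows_per_chunk rows each, every chunk small enough to be
       allocated on its own. */
    typedef struct {
        size_t chunk;        /* which chunk (allocation) holds the row */
        size_t row_in_chunk; /* row index inside that chunk */
    } row_location;

    /* Maps an ever-increasing stream row number to its place in the ring. */
    row_location locate_row(size_t stream_row, size_t total_rows, size_t rows_per_chunk)
    {
        size_t ring_row = stream_row % total_rows; /* wrap around the ring */
        row_location loc;
        loc.chunk = ring_row / rows_per_chunk;
        loc.row_in_chunk = ring_row % rows_per_chunk;
        return loc;
    }
    ```

    For example, with 48 rows in the ring and 16 rows per chunk, stream row 17 maps to row 1 of chunk 1, and stream row 50 wraps around to row 2 of chunk 0.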
  10. Re: Multiple host threads with single command queue and devi

    It is simply that I (sometimes) debug my programs at home, and the only OpenCL-capable card there is my wife's GeForce 8400M GS (in an old laptop), for which the SDK I installed supports only OpenCL 1.0.
    ...
  11. Re: Multiple host threads with single command queue and devi

    In OpenCL 1.0, is it still possible to bracket the calls to clEnqueueNDRange.../clEnqueueWriteBuffer... with a single shared mutex lock (thus allowing only one thread at a time to enqueue commands)?

    I...
Results 1 to 11 of 11