Type: Posts; User: Dithermaster

  1. Replies
    1
    Views
    203

    In modern hardware, the case of all work items...

    In modern hardware, the case of all work items reading the same location (known as a "broadcast") does not cause a conflict and is therefore not serialized (and so fast).
  2. If you're using a modern C++ compiler with the...

    If you're using a modern C++ compiler with the standard library ("std"), you should use std::string instead of cl::string. I think they only put cl::string and cl::vector in there for older compilers.
  3. Replies
    2
    Views
    227

    To save folks time, I see this is already being...

    To save folks time, I see this is already being discussed elsewhere:

    http://stackoverflow.com/questions/25315102/opencl-compiler-error-c4996
  4. This is not valid in OpenCL 1.x; you can't have a...

    This is not valid in OpenCL 1.x; you can't have a buffer accessed from both CPU and device. You must use clEnqueueMapBuffer to get CPU access and clEnqueueUnmapMemObject to give access back to the...
  5. Try putting clFinish calls after every clEnqueue...

    Try putting clFinish calls after every clEnqueue call to narrow down the specific call that is causing a problem.
  6. There is an OpenCL 1.2 extension to create images...

    There is an OpenCL 1.2 extension to create images from buffers (cl_khr_image2d_from_buffer). However, images created this way perform slightly differently compared to regular images due to the linear...
  7. On any recent Mac OS X (10.6+) the runtime will...

    On any recent Mac OS X (10.6+) the runtime will always exist.

    On Windows you can use /DELAYLOAD and an alternate `clGetPlatformIDs` implementation that returns 0 for when the OS can't find...
  8. It's the second. __local is shared local...

    It's the second.

    __local is shared local memory, so it's a single piece of memory that every work item in a work group can access. It's essentially a programmer-managed cache, and frequently used...
  9. No, swap steps 1 and 2 so it becomes: 1. run...

    No, swap steps 1 and 2 so it becomes:

    1. run the kernel
    2. map the clbuffer
    3. use the pointer returned from mapping to go through the buffer and print each item
    4. unmap the buffer

    Mapping...
  10. You don't use the contents after...

    You don't use the contents after clEnqueueUnmapMemObject.

    Instead of clEnqueueReadBuffer, do a clEnqueueMapBuffer (with blocking), use the pointer returned to access the buffer, then...
  11. For many applications, yes. You can certainly try...

    For many applications, yes. You can certainly try to write a function that calculates an optimal work group size, but it will be a challenge. Alternatively, you can benchmark all sizes on the user's...
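
The benchmarking idea can be sketched roughly as follows. This is only a minimal illustration: `pick_fastest_size`, `time_fn`, and `fake_measure` are hypothetical names, and the timing callback is a stand-in for code that would launch the real kernel with the given work group size and return the elapsed time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: runs the kernel with the given work group size
 * and returns the measured time (any consistent unit). */
typedef double (*time_fn)(size_t work_group_size, void *user);

/* Try every candidate size once and return the one with the lowest time. */
size_t pick_fastest_size(const size_t *candidates, size_t count,
                         time_fn measure, void *user)
{
    assert(count > 0);
    size_t best = candidates[0];
    double best_time = measure(best, user);

    for (size_t i = 1; i < count; ++i) {
        double t = measure(candidates[i], user);
        if (t < best_time) {
            best_time = t;
            best = candidates[i];
        }
    }
    return best;
}

/* Stand-in measurement for illustration only: pretends that 64 is the
 * ideal size and that the cost grows with the distance from it. */
double fake_measure(size_t size, void *user)
{
    (void)user;
    return size > 64 ? (double)(size - 64) : (double)(64 - size);
}
```

In a real application the candidates would be limited to sizes the device and kernel actually support, and the result could be cached per device so the benchmark runs only once.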
  12. I'd love to be proven wrong, but in my opinion...

    I'd love to be proven wrong, but in my opinion and based on my experience, it's a black art.

    It varies by hardware vendor, and I've even seen where non-multiples of...
  13. Replies
    5
    Views
    574

    As I said, look it up in cl.h: #define...

    As I said, look it up in cl.h:

    #define CL_INVALID_VALUE -30

    Then look in the OpenCL specification for the API that is returning that error to see what it means.

    On...
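
A small lookup helper can save the trip to cl.h for the most common codes. The sketch below is an assumption-laden illustration: `cl_error_name` is a hypothetical helper, not part of OpenCL, and the table holds only a handful of values as they appear in the Khronos cl.h header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A few error codes and their names as defined in cl.h (partial list). */
struct cl_error_entry {
    int code;
    const char *name;
};

static const struct cl_error_entry cl_error_table[] = {
    {   0, "CL_SUCCESS" },
    {  -1, "CL_DEVICE_NOT_FOUND" },
    {  -2, "CL_DEVICE_NOT_AVAILABLE" },
    {  -5, "CL_OUT_OF_RESOURCES" },
    {  -6, "CL_OUT_OF_HOST_MEMORY" },
    { -30, "CL_INVALID_VALUE" },
    { -31, "CL_INVALID_DEVICE_TYPE" },
    { -32, "CL_INVALID_PLATFORM" },
    { -33, "CL_INVALID_DEVICE" },
    { -54, "CL_INVALID_WORK_GROUP_SIZE" },
};

/* Hypothetical helper: return the symbolic name for an OpenCL error code,
 * or a fallback string for codes missing from the partial table above. */
const char *cl_error_name(int code)
{
    size_t count = sizeof cl_error_table / sizeof cl_error_table[0];
    for (size_t i = 0; i < count; ++i) {
        if (cl_error_table[i].code == code)
            return cl_error_table[i].name;
    }
    return "unknown (look it up in cl.h)";
}
```

The name alone is not the whole story: the same code can mean different things for different API calls, so the specification entry for the call that returned it is still the place to find the actual cause.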
  14. Replies
    5
    Views
    574

    It's always helpful to look at the error code you...

    It's always helpful to look at the error code you get back from OpenCL APIs. For example, what code do you get back from clGetDeviceIDs? Look it up in cl.h to get a clue as to what is happening.
  15. Replies
    4
    Views
    565

    OpenCL 1.x doesn't have a continuous data...

    OpenCL 1.x doesn't have a continuous data streaming mode so you'll need to chop up your data into blocks and upload them one by one, process them, and download results. On modern hardware you'll be...
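
The block-by-block approach might look like the sketch below. It shows only the host-side chunking loop: `for_each_block`, `block_fn`, and `sum_block` are hypothetical names, and the callback is a placeholder for the real upload, kernel launch, and download of one block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: stands in for uploading one block, running the
 * kernel on it, and downloading the results. */
typedef void (*block_fn)(const float *block, size_t count, void *user);

/* Walk `total` elements in blocks of at most `block_size` elements.
 * The last block may be shorter. Returns the number of blocks processed. */
size_t for_each_block(const float *data, size_t total, size_t block_size,
                      block_fn process, void *user)
{
    assert(block_size > 0);
    size_t blocks = 0;

    for (size_t offset = 0; offset < total; offset += block_size) {
        size_t remaining = total - offset;
        size_t count = remaining < block_size ? remaining : block_size;
        process(data + offset, count, user);
        ++blocks;
    }
    return blocks;
}

/* Example callback for illustration: adds the block's elements to a
 * running double-precision total passed through `user`. */
void sum_block(const float *block, size_t count, void *user)
{
    double *total = user;
    for (size_t i = 0; i < count; ++i)
        *total += block[i];
}
```

With 10 elements and a block size of 4, the loop issues three blocks of 4, 4, and 2 elements.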
  16. Replies
    6
    Views
    761

    > If you specify a work group size larger than...

    > If you specify a work group size larger than your hardware or kernel supports, the clEnqueueNDRange call should fail and return an error code.
    That would be nice but I don't think you can reliably...
  17. Replies
    7
    Views
    728

    Just an observation: The GlobalWorkSize is not an...

    Just an observation: The GlobalWorkSize is not an integer multiple of the WorkGroupSize (216 is not evenly divisible by 16). In OpenCL 1.x, if you specify the work group size then the global size...
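
One common way to satisfy the divisibility requirement is to round the global size up to the next multiple of the work group size. The helper below is a minimal sketch; `round_up_global_size` is a hypothetical name, not an OpenCL function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round `global` up to the next multiple of `local`
 * so the global work size divides evenly into work groups. */
size_t round_up_global_size(size_t global, size_t local)
{
    assert(local > 0);
    size_t remainder = global % local;
    return remainder == 0 ? global : global + (local - remainder);
}
```

For the sizes mentioned here, `round_up_global_size(216, 16)` gives 224. The padded launch contains 8 extra work items, so the kernel then needs a bounds check (comparing `get_global_id(0)` against the real element count) to keep them from touching memory past the end of the data.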
  18. Replies
    2
    Views
    409

    Use a "buffer" in OpenCL global device memory.

    Use a "buffer" in OpenCL global device memory.
  19. Not yet because AMD has not shipped an OpenCL 2.0...

    Not yet because AMD has not shipped an OpenCL 2.0 driver. When (or if) they do, it is up to AMD as to whether it will support the Radeon HD 6670 or only newer hardware.
  20. Replies
    1
    Views
    467

    Use an OpenCL Buffer object. It will retain value...

    Use an OpenCL Buffer object. It will retain value between kernel invocations and can be accessed with a pointer.
  21. Replies
    1
    Views
    670

    Not yet. The conformance tests were only recently...

    Not yet. The conformance tests were only recently completed.
  22. Replies
    7
    Views
    904

    Oh, that's Java? Sorry, didn't catch that. Well,...

    Oh, that's Java? Sorry, didn't catch that. Well, I can't help you much then except to still say that something is leaking host memory. Do you see Task Manager memory usage grow as your application...
  23. Replies
    7
    Views
    904

    You are leaking host memory in initGPU. If you...

    You are leaking host memory in initGPU. If you run Task Manager you'll see that your memory usage just keeps growing. Everything that has a "new" must have a matching "delete".
  24. Is this a quiz or a mind reading exercise? What...

    Is this a quiz or a mind reading exercise? What pre-release OpenCL platform are you using (since none have shipped yet), what results do you expect, and what errors or incorrect results are you...
  25. Replies
    3
    Views
    588

    AJ's suggestion is great. What I've done is...

    AJ's suggestion is great. What I've done is comment out various parts of the kernel, along the lines of "if this part was 'free' how fast would it run?". You can do this separately for reads, compute...
Results 1 to 25 of 160