Type: Posts; User: chippies

Page 1 of 4 1 2 3 4

  1. Re: copying a variable from host memory to device memory

    Since you want to copy the result back to host memory, you will have to store the value in a cl_mem object. You will have to allocate an array of one element and an OpenCL buffer with enough bytes...
  2. Re: 2D Arrays in local functions (Replies: 2, Views: 1,251)

    There is no new keyword in OpenCL. So the syntax will always be int oneDArray[100] for a 1D array, int twoDArray[5][5] for a 2D array, etc. Hence you would want to use uint z[5][5][5] I think. You...
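
As a minimal sketch of the fixed-size declaration syntax described in this reply, the following plain C function declares and fills a 5x5x5 array without any `new` keyword. The name `fill_and_sum`, the `DIM` macro and the fill pattern are hypothetical, and this is host-side C used only to illustrate the declaration form, not code from the thread.

```c
#include <stddef.h>

#define DIM 5

/* Hypothetical example: declare a 3D array with plain C syntax (no `new`),
   fill it with consecutive values, and return the sum of all elements. */
unsigned int fill_and_sum(void)
{
    unsigned int z[DIM][DIM][DIM]; /* size is fixed at compile time */
    unsigned int sum = 0;

    for (size_t i = 0; i < DIM; ++i)
        for (size_t j = 0; j < DIM; ++j)
            for (size_t k = 0; k < DIM; ++k)
                z[i][j][k] = (unsigned int)(i * DIM * DIM + j * DIM + k);

    for (size_t i = 0; i < DIM; ++i)
        for (size_t j = 0; j < DIM; ++j)
            for (size_t k = 0; k < DIM; ++k)
                sum += z[i][j][k];

    return sum;
}
```

Inside an OpenCL C kernel the element type could be spelled `uint` instead of `unsigned int`; the declaration shape stays the same.
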
  3. Re: enequeueNDRangeKernel - parallel execution on OpenCL dev

    You are right that you can enqueue the kernels in a loop over all the devices. This will result in the devices processing the data concurrently.

    You can find out when a kernel has finished by using...
  4. Re: OpenCL not finding devices as dll (Replies: 2, Views: 1,051)

    I don't have experience with your problem, but perhaps it would be easier if you used the existing .NET wrappers:
    Cloo: http://sourceforge.net/projects/cloo/
    The Open Toolkit library:...
  5. Re: System freeze on kernel execution (Replies: 3, Views: 1,480)

    It is absolutely normal for this to cause your system to freeze. Your original code would access some part of GPU memory and write to it. This could overwrite some piece of data that is used by a...
  6. Re: System freeze on kernel execution (Replies: 3, Views: 1,480)

    Your problem lies here:

    character = (unsigned char *)odata[0];

    This code takes the integer stored at location 0 in odata and casts it to a pointer to an unsigned char. I think you wanted...
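
The reply is cut off before it states the intended fix, so the sketch below is only an assumption about what was meant: taking the address of element 0 instead of casting its value. It assumes `odata` is an array of `int`; the helper name `bytes_of` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: view the storage of an int array as raw bytes.
   (unsigned char *)&odata[0] points INTO the array, whereas
   (unsigned char *)odata[0] would reinterpret the stored integer value
   as an address, which is almost never a valid location. */
unsigned char *bytes_of(int *odata)
{
    return (unsigned char *)&odata[0];
}
```

With this form, the returned pointer equals the address of the array itself, so reads and writes through it stay inside memory the program owns.
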
  7. Re: subbuffes + 1.0 (Replies: 2, Views: 981)

    That sounds like a bug in the OpenCL implementation. Perhaps filing a bug report on the Khronos bug tracker will help, but since it is Nvidia, there is little hope for that.
  8. Re: how to run on cpu graphics card (Replies: 12, Views: 3,400)

    It will run on the CPU. If you want the GPU that is built into the CPU then you must specify CL_DEVICE_TYPE_GPU.
  9. Re: regarding some problems (Replies: 2, Views: 1,183)

    Why is vectorisation fast: let's say your CPU can process vectors of 4 float values in a single instruction. That means 4 operations get done at once. If you don't vectorise your code then the CPU...
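
To make the "4 operations at once" idea concrete, here is a rough sketch in plain C that contrasts a one-element-per-iteration loop with a loop that handles four floats per iteration. It is not explicit SIMD code: whether the grouped loop becomes a single vector instruction depends on the compiler and CPU, and in OpenCL C one would more likely use the built-in `float4` type. The function names are hypothetical.

```c
#include <stddef.h>

/* Scalar version: one addition per loop iteration. */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Grouped version: four independent additions per iteration, the shape
   that a 4-wide vector unit can execute as one instruction. */
void add_by_fours(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        out[i]     = a[i]     + b[i];
        out[i + 1] = a[i + 1] + b[i + 1];
        out[i + 2] = a[i + 2] + b[i + 2];
        out[i + 3] = a[i + 3] + b[i + 3];
    }

    /* Handle any leftover elements when n is not a multiple of 4. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Both functions produce the same results; the second simply exposes the groups of four that vector hardware can process together.
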
  10. Re: Can i call the same kernel function multiple times in a

    You can call a kernel function an unlimited number of times within any sort of loop structure.
  11. Re: vectorization (Thread: vectorization, by chippies; Replies: 5, Views: 1,441)

    You might find it worthwhile to look up the vload* and vstore* functions. In your case, replace * with 16.
  12. Re: OPENCL distributed computing. (Replies: 11, Views: 3,615)

    You can try VCL (http://www.mosix.org/txt_vcl.html) but that is specific to the Mosix Linux distribution.

    Alternatively, you can try using MPI to write distributed apps that run over multiple PCs...
  13. Re: Scheduling a work load that relies on sequential values

    If you can schedule your work over multiple threads when not using OpenCL then I don't see why you can't use multiple work items. That is the way to use all of the cores on your CPU.
  14. Re: OpenCL Kernel thread execution time (Replies: 3, Views: 2,561)

    There is no direct way of measuring the time taken by each thread.

    Estimating the time per thread might be possible for really simple kernels but requires assumptions about how the hardware works....
  15. Re: OpenCL slow compiling on AMD card (Replies: 4, Views: 2,261)

    Nvidia has spent many more years on the various parts of their compiler architecture than I think AMD has, hence I am not too surprised that AMD's compiler is slower.
  16. Re: Device lost possible? (Replies: 1, Views: 1,241)

    I don't see any errors that are specific to your scenario, which makes me think that you could get any random error and that this will vary by vendor. CL_INVALID_DEVICE and CL_INVALID_PLATFORM might...
  17. Re: can we use structure in opencl (Replies: 10, Views: 4,393)

    Just a note for getting the alignment right, I think you should be using the cl_* types, i.e. cl_char, cl_float, etc. for the fields in your structure. Looking at cl_platform.h shows me that...
  18. Re: clEnqueueReadBuffer gives the error CL_OUT_OF_RESOURCES

    Hi bajil, I have had the same inexplicable error on a perfectly valid clEnqueueReadBuffer call before on my GeForce GTX 560 Ti. I have always found that it was as a result of one of my kernels...
  19. Re: openCL and VC++ 2010 “front end compiler failed build”

    Without seeing your kernel source code, I can only guess, but clBuildProgram should not be giving you any error about not finding stdio.h because it should not be looking for it. Please don't put...
  20. Re: row_pitch mishandled in ATI Radeon HD 7970 (Replies: 2, Views: 2,540)

    You should post this with a complete minimal example reproducing the bug on the AMD OpenCL forums.
  21. Re: Performance of clEnqueueReadBuffer on different HW syste

    You don't mention the configuration of the system that is fast. The first thing that comes to mind is to ask what other software is running in the background on the Supermicro system? The other...
  22. Re: OpenCL slow compiling on AMD card (Replies: 4, Views: 2,261)

    Gopal_HC, are you using loop unrolling on that large loop? It seems a little odd that just changing the number of iterations would affect the compile time unless the loop is being unrolled. If it is...
  23. Re: FFT 2D kernel runtime =0 in OpenCL (Replies: 2, Views: 1,565)

    I see that your code gets the error code generated by every OpenCL function call but you don't seem to check whether that code actually equals CL_SUCCESS. I would suggest that you add such a check...
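
One possible shape for such a check is sketched below. To keep it self-contained it does not include the OpenCL headers; it defines `CL_SUCCESS` as 0 (its value in `CL/cl.h`) only if the macro is not already defined. The helper name `check_cl` is hypothetical and not part of OpenCL.

```c
#include <stdio.h>

/* Stand-in for the constant from CL/cl.h, where CL_SUCCESS is 0.
   With the real header included, this fallback is skipped. */
#ifndef CL_SUCCESS
#define CL_SUCCESS 0
#endif

/* Hypothetical helper: report a failed call and tell the caller to stop.
   Returns 1 if err equals CL_SUCCESS, 0 otherwise. */
int check_cl(int err, const char *what)
{
    if (err != CL_SUCCESS) {
        fprintf(stderr, "%s failed with error code %d\n", what, err);
        return 0;
    }
    return 1;
}
```

A caller would pass in the error code returned by each OpenCL function together with the function's name, and bail out as soon as `check_cl` returns 0, so that a failure is not silently carried into later calls.
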
  24. Re: basic question regarding get_global_id (Replies: 6, Views: 4,045)

    get_global_id returns the index of the current thread (work item). The parameter selects which dimension of the array of threads to query. When you enqueue a kernel, one of the parameters is an int array...
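
The idea can be illustrated with a small host-side simulation in plain C. This is only a sketch: it does not call OpenCL, the function name `simulate_2d_range` is hypothetical, and the loop variables `x` and `y` stand in for what `get_global_id(0)` and `get_global_id(1)` would return inside a real kernel launched over a 2D range.

```c
#include <stddef.h>

/* Hypothetical simulation of a 2D kernel launch: the body runs once per
   (x, y) pair, just as a kernel runs once per work item. global_size[0]
   is the extent of dimension 0 and global_size[1] that of dimension 1. */
void simulate_2d_range(const size_t global_size[2], int *out)
{
    for (size_t y = 0; y < global_size[1]; ++y) {
        for (size_t x = 0; x < global_size[0]; ++x) {
            /* In a real kernel: x = get_global_id(0), y = get_global_id(1). */
            out[y * global_size[0] + x] = (int)(x + 10 * y);
        }
    }
}
```

Each output element records which simulated work item wrote it, which makes it easy to see how the two dimension indices map onto a flat buffer.
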
  25. Re: GPU never works (Replies: 1, Views: 1,120)

    This link might help with disabling caching of your kernels: https://devtalk.nvidia.com/default/topic/499681/disable-caching-by-the-opencl-compiler/.

    As for the other errors, all I can advise is...
Results 1 to 25 of 86