Search:

Type: Posts; User: zvivered

  1. CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT = 1

    Hello,

    I ran:
    clGetDeviceInfo(dev /* device */,
    CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT /* param_name */,
    sizeof(cl_uint) /* param_value_size */, ...
  2. Low performance when writing results back to global memory

    Hello,

    We wrote host code that runs a very simple CL kernel.

    The CL code is:

    __kernel void id_check(__global float *in,int n,__local float *out)
    {
    for (int i = 0; i < n; i++)
    {
  3. Hello, I found the problem. I should change...

    Hello,

    I found the problem.
    I should change global_size to:
    size_t global_size[] = {4 , 256 * 32 / 4};

    Thanks,
    Zvika
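
The arithmetic behind this fix can be sketched in plain C: when each work-item handles one float4 instead of one float, the work-item count in that dimension shrinks by a factor of four. The helper name `vec_work_items` is hypothetical and not part of OpenCL; this is a minimal sketch that assumes the element count is an exact multiple of the vector width.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API): number of work-items needed
   when each one processes `vec_width` scalars packed into one vector. */
size_t vec_work_items(size_t n_scalars, size_t vec_width)
{
    assert(vec_width > 0);
    assert(n_scalars % vec_width == 0); /* assumption: no remainder to pad */
    return n_scalars / vec_width;
}
```

For the sizes in the post, `vec_work_items(256 * 32, 4)` evaluates to 2048, the same value as the second entry `256 * 32 / 4`.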
  4. Moving from float to float4: What should be changed in the host code ?

    Hello,

    My input to the kernel is four 2D matrices, each containing 256x32 floats.
    The size of the output is the same.

    So in the host code I called:


    size_t dim = 2;
    size_t...
  5. Replies
    4
    Views
    825

    This is the way to do it according to "OpenCL in...

    This is the way to do it according to "OpenCL in Action" by Matthew Scarpino:

    cl_event prof_event;
    cl_ulong time_start, time_end, total_time;
    ....
    queue = clCreateCommandQueue(context, device, ...
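
As a rough sketch of the final step of this approach: when the queue is created with `CL_QUEUE_PROFILING_ENABLE`, `clGetEventProfilingInfo` reports `CL_PROFILING_COMMAND_START` and `CL_PROFILING_COMMAND_END` as nanosecond counters, so the kernel time is their difference. The plain-C helper below only does that arithmetic. Its name `elapsed_ms` is hypothetical, and `uint64_t` stands in for `cl_ulong` so the snippet compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert two profiling timestamps (nanoseconds)
   into elapsed milliseconds. uint64_t stands in for cl_ulong. */
double elapsed_ms(uint64_t time_start, uint64_t time_end)
{
    assert(time_end >= time_start);
    return (double)(time_end - time_start) / 1.0e6;
}
```

Averaging this value over several kernel launches usually gives a steadier figure than a single run.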
  6. Replies
    4
    Views
    825

    How to measure GPU performance

    Hello,

    I want to measure the performance of the GPU on an NVIDIA GeForce 9400 GT.

    The steps in the host code are:

    clSetKernelArg

    clCreateCommandQueue
  7. You are right! The line that puts the data in...

    You are right!

    The line that puts the data in the output buffer is:
    output[index] = f;

    If the output changes to 'float4 *' it should be handled accordingly.
    Thanks,
    Zvika
  8. Replies
    1
    Views
    527

    Cache miss in kernel

    Hello,

    Should I consider the caches of a single core?

    The input data is two 3D matrices, each containing 16x256x16 elements.

    When the core accesses the data, it does so slowly.

    So I guess I...
  9. get_global_id(0) changes when moving from float * to float4 * in kernel

    Hello,

    I'm running the following code from "OpenCL in Action":


    __kernel void id_check(__global float *output) {

    /* Access work-item/work-group information */
    size_t global_id_0 =...
  10. Replies
    2
    Views
    810

    syntax of 'dot' routine

    Hello,

    The following code is taken from "OpenCL in Action".

    __kernel void matvec_mult(__global float4* matrix,
    __global float4* vector,
    ...
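
For reference, OpenCL's built-in `dot` applied to two `float4` values returns the sum of the four component-wise products. A minimal plain-C sketch of that computation follows. The `float4_t` struct and the `dot4` and `matvec4` functions are hypothetical host-side stand-ins, and the one-result-per-row loop is an assumption about how such a kernel is typically organized, not a reconstruction of the truncated listing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side stand-in for OpenCL's float4. */
typedef struct { float x, y, z, w; } float4_t;

/* Emulates dot(a, b) for float4: sum of component-wise products. */
float dot4(float4_t a, float4_t b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Assumed layout: one float4 per matrix row, one result per row.
   Each loop iteration plays the role of one work-item. */
void matvec4(const float4_t *rows, float4_t vec, float *result, size_t n_rows)
{
    assert(rows != NULL && result != NULL);
    for (size_t i = 0; i < n_rows; i++)
        result[i] = dot4(rows[i], vec);
}
```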
  11. Dear Dithermaster, In my project the input to...

    Dear Dithermaster,

    In my project the input to the kernel is an N (rows) x M (columns) float matrix.
    N and M will be passed as arguments to the kernel.

    The input also contains a vector with N elements....
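
One common way to handle such an input (an assumption here, since the excerpt does not state the memory layout) is to store the matrix row-major in a flat buffer and pass the dimensions as ordinary scalar arguments next to it. The plain-C sketch below shows only the index arithmetic; the helper name `flat_index` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: flat offset of element (row, col) in a row-major
   matrix with n_rows rows and n_cols columns. */
size_t flat_index(size_t row, size_t col, size_t n_rows, size_t n_cols)
{
    assert(row < n_rows && col < n_cols);
    return row * n_cols + col;
}
```

With N = 3 and M = 5, element (2, 4) sits at offset 14, the last slot of the 15-element buffer.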
  12. How are matrix dimensions passed to the kernel?

    Hello,

    The following kernel multiplies a matrix by a vector. It is taken from the book "OpenCL in Action".

    __kernel void matvec_mult(__global float4* matrix,
    ...
  13. Replies
    2
    Views
    854

    Dear Dithermaster, Thank you for your help! ...

    Dear Dithermaster,

    Thank you for your help!

    Regards,
    Zvika
  14. Replies
    2
    Views
    854

    for loops in kernel

    Hello,

    According to the book "OpenCL in Action":

    "Comparisons are time-consuming on the best of processors, but they’re especially slow on dedicated number-crunchers like graphic processor units...
  15. NVIDIA's GeForce 9400 GT: CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT = 1

    Hello,

    On my device (NVIDIA's GeForce 9400 GT) I ran the following command:


    clGetDeviceInfo(devices[i], CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT,
    sizeof(vec_width), &vec_width,...
  16. Hi James, Your help is highly appreciated. ...

    Hi James,

    Your help is highly appreciated.

    Thanks,
    Zvika
  17. Running with AMD's OpenCL SDK on NVIDIA's device

    Hello,

    I tried the first sample from the book "OpenCL in Action" by Matthew Scarpino.
    The code finds the installed devices and runs a matrix multiplication (4x4 * 4x1).

    The device I have is...
  18. Replies
    2
    Views
    1,370

    Hello clint3112, Thank you for your help! ...

    Hello clint3112,

    Thank you for your help!

    Zvika
  19. Replies
    2
    Views
    1,370

    OpenCL vs. CUDA for NVIDIA GPGPU

    Hello,

    Is it true that CUDA code running on an NVIDIA GPGPU is much more efficient/faster
    than OpenCL code (doing the same computations) running on the same GPGPU?

    Thanks,
    Zvika
  20. Hello, If OpenGL is not used (just WPF or...

    Hello,

    If OpenGL is not used (just WPF or GDI+), can I be sure that the GPU is free?

    Thanks,
    Zvika
  21. How can I know that the GPU is free for my OpenCL code and not used by other tasks?

    Hello,

    I want to use OpenCL to run some computations on AMD's GPU.
    But this GPU is also used to display the application's GUI, which contains images, menus, etc.
    The libraries used for...
  22. Replies
    1
    Views
    870

    OpenCL C library

    Hello,

    I'm trying to use AMD's OpenCL library.
    Is it possible to write the code in C, without C++ templates?
    I'm looking for a pure C library.

    Thanks,
    Zvika
Results 1 to 22 of 24