Search:

Type: Posts; User: dominik

Page 1 of 3 1 2 3

  1. Re: Multiple multicore CPUs and Fusion or Sandy Bridge

    Oh, I see. That's an interesting point. I didn't know there were multi-socket Fusion systems. Please let us know when you know more.
  2. Re: Multiple multicore CPUs and Fusion or Sandy Bridge

    Do you mean that the CPU cores and the GPU will be exposed as one device in OpenCL?
    I very much doubt that... Distributing work across multiple CPUs is easy because they're generally homogeneous...
  3. Re: OpenCL on Linux, which implementation to choose?

    I have to do that for the user that's logged in locally. But then I can only see the GPUs remotely if I log in as the same user, but not as a different user.
  4. Re: OpenCL on Linux, which implementation to choose?

    That doesn't work unfortunately. The CPU is still the only device I see in OpenCL.
  5. Re: OpenCL on Linux, which implementation to choose?

    Hi,

    I've got a Linux box with two ATI Radeon HD 5970 cards (2 GPUs each). I'm running Ubuntu and it all works fine. It was initially a bit tricky to set up so that I was able to use all 4 GPUs but...
  6. Re: Events usage

    Thread: Events usage; by dominik; Replies: 4; Views: 1,516

    Sorry, I thought you were asking why an event is still "valid" after the associated command has been executed.

    Have you tried using clReleaseEvent() ? At the end of each iteration you could...
  7. Re: Events usage

    Thread: Events usage; by dominik; Replies: 4; Views: 1,516

    You can use events to get profiling information (see clGetEventProfilingInfo), e.g. you can use it to compute the execution time of a command after it has finished.
  8. Re: Weird result with unsigned char type

    Replies: 3; Views: 1,583

    What datatype are you actually working with? Are you working on an array of unsigned char or an array of float?

    When you use char, each work-item only copies one byte at a time...
  9. Re: Implementation of OpenCL in clang

    Replies: 3; Views: 2,507

    As far as I know, clang currently does not support OpenCL. I've heard of some efforts to patch clang to support OpenCL, but it's not made it into mainstream clang yet and it seems to...
  10. Re: OpenCL with Python(interpreted) or C/C++(compiled) ?

    PyOpenCL seems to be a wrapper around the OpenCL API. You will still call the "original" OpenCL library functions and your kernels will get executed just like in C/C++.
    It therefore shouldn't make...
  11. Re: GPU compatibility question. HIS H695F2G2M Radeon HD 6950

    Hi James,

    the GPU you're looking at is an AMD Radeon HD6950 and as such supports OpenCL 1.1 using AMD's Stream SDK (see http://developer.amd.com/gpu/AMDAPPSDK/ ... ility.aspx).
    As far as I know,...
  12. Re: global work offset in OpenCL 1.1

    Replies: 2; Views: 2,823

    The number of work-items is also not affected by the offset, but the work-item IDs are...
    Say you want to compute a matrix-vector multiplication where one work-group operates on each row. Then...
  13. global work offset in OpenCL 1.1

    Replies: 2; Views: 2,823

    Hi,

    with OpenCL 1.1 it is possible to define an offset to your NDRange when launching a kernel. However, according to the spec (see 3.2) this offset only affects the global ID, but not the...
  14. Re: Global Memory - when is it changed?

    Replies: 2; Views: 1,088

    I strongly doubt that the compiler will optimize this. If the writes were to the same location in each iteration (with += for example) I could imagine that the compiler would optimize that. But since...
  15. Re: Compile error for float4 array

    Replies: 4; Views: 2,664

    Try this:

    __constant float4 splitter_cache[2] = { 0.0f, 0.0f };
    I think float gets promoted to float4 in this assignment, i.e. all components of float4 will be initialized to 0. Haven't tried it...
  16. Re: an opencl puzzle

    Replies: 4; Views: 1,555

    No, OpenCL doesn't help you here. It's your job to copy data between devices and host. All you can do is keep the memory transfers to a minimum when you write your programs.
  17. Re: an opencl puzzle

    Replies: 4; Views: 1,555

    He initializes pBuf to zero (see third line of code: zeromemory(pBuf, nCount);)

    Did you check the return values of the functions?
  18. Re: Calling a kernel from within a kernel?

    Replies: 3; Views: 1,855

    "I am sure I could. But that's mostly serial stuff."
    If you can run it on the GPU you wouldn't have the overhead of transferring your data all the time. So even if it's not optimal to run the...
  19. Re: Calling a kernel from within a kernel?

    Replies: 3; Views: 1,855

    As far as I understand it, it's all about implementing step 4 for the GPU. Once you've got that you can skip steps 3 and 5, because you've already got the data where you need it.

    I'm not sure if...
  20. Re: Open CL with OpenMP

    Replies: 2; Views: 2,215

    The only benefit I can think of is that you have one thread for each GPU, so you can have a dedicated CPU core for each GPU, which might be faster than managing all the GPUs with just one core.
    I...
  21. Re: Question of openCL use of gpu and cpu

    Replies: 2; Views: 1,408

    As opposed to CUDA kernels, you can run OpenCL kernels on GPUs and CPUs. So you don't have to run all your algorithms on the GPU. If you have several tasks for example, you can run some of them on...
  22. Re: OpenCL system, device type

    Replies: 2; Views: 1,794

    Have a look at "device fission" in AMD's OpenCL implementation, e.g. here
  23. Re: ATI and NVIDIA from OpenCL perspective

    Replies: 1; Views: 2,281

    I think in many respects the architectures are similar. Both ATI Stream and NVIDIA CUDA are SIMD architectures, i.e. divergent branches are expensive. Also, memory coalescing is important on both...
  24. Re: how to pass an arbitrary length constant array to a kernel?

    It looks like you're doing it the right way...

    How big is your buffer? Constant memory is usually limited to a few kilobytes (you can check it by querying CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE)
    ...
  25. Re: Limitation of local sizes on N-Dimensional NDRanged problems

    That's right. See section 5.6 of the spec.
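
A note on the global work offset discussed in results 12 and 13. The following is a minimal sketch in plain C (it is not OpenCL host code), assuming the OpenCL 1.1 rule that a work-item's global ID in one dimension is the offset plus the group ID times the local size plus the local ID. The struct `ndrange_1d` and the helpers `global_id` and `work_item_count` are hypothetical names made up for illustration; they are not part of the OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one NDRange dimension (not an OpenCL type). */
typedef struct {
    size_t global_offset; /* offset given when the kernel is launched */
    size_t local_size;    /* work-items per work-group */
    size_t num_groups;    /* number of work-groups */
} ndrange_1d;

/* Assumed OpenCL 1.1 mapping:
 * global ID = offset + group ID * local size + local ID.
 * The group ID and local ID themselves are passed in unchanged. */
static size_t global_id(const ndrange_1d *r, size_t group_id, size_t local_id)
{
    assert(group_id < r->num_groups);
    assert(local_id < r->local_size);
    return r->global_offset + group_id * r->local_size + local_id;
}

/* The number of work-items does not depend on the offset. */
static size_t work_item_count(const ndrange_1d *r)
{
    return r->num_groups * r->local_size;
}
```

With an offset of 100, a local size of 64 and 4 work-groups, the last work-item of group 3 would get global ID 355 under this model, while the range still contains 256 work-items.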
Results 1 to 25 of 74