
Thread: Reliable distributed computing w/ OpenCL, anyone doing this?

  1. #1
    Junior Member
    Join Date
    Sep 2009
    Posts
    2

    Reliable distributed computing w/ OpenCL, anyone doing this?

    I have been working on finding a topic for my master's thesis. My background is mostly on the hardware side, covering topics like computer architecture, fault tolerance, and reliability. The last two are topics of interest to me. For the past 2 years, I have been working as a software engineer for a company that uses Macs exclusively, and I was curious about OpenCL.
    I started looking into it, and it seems great for serious computation. Eventually, I imagine, people will want to use it as the "horsepower" for distributed computing systems.
    I don't want to bore anyone to death, but I would appreciate any info that could get me going. If anyone knows of any projects in the area of reliable distributed computing (preferably using OpenCL), please let me know. For example, the OpenCL platform model doesn't seem well suited to recovery/reliability. It would be great to find some more info about this.
    Thanks.

  2. #2
    Senior Member
    Join Date
    Jul 2009
    Location
    Northern Europe
    Posts
    311

    Re: Reliable distributed computing w/ OpenCL, anyone doing this?

    I don't think you are going to find much out there about this with OpenCL for two reasons. The first is that OpenCL is new and the second is that high-end GPGPU work is also relatively new.

    You could certainly build a framework around OpenCL where you checkpoint results by calling clFinish() periodically and reading back all cl_mem objects from the device to the host memory. This would allow you to recover from transient failures, but I don't think anyone has done this.
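
    A minimal sketch of that checkpoint loop, written as a host-side simulation in Python so it runs without a GPU. Everything here is an assumption for illustration: `FakeDevice`, `TransientFault`, `increment`, and `run_with_checkpoints` are hypothetical names, not part of OpenCL. In a real framework, `finish()` would correspond to clFinish(), `read_all()` to one clEnqueueReadBuffer() per cl_mem object, and `write_all()` to clEnqueueWriteBuffer() when restoring.

```python
import copy


class TransientFault(Exception):
    """Hypothetical signal that the device lost its state."""


class FakeDevice:
    """Hypothetical stand-in for an OpenCL device plus command queue."""

    def __init__(self, buffers):
        self.buffers = {name: list(data) for name, data in buffers.items()}
        self.pending = []  # enqueued work that has not executed yet

    def enqueue(self, kernel):
        self.pending.append(kernel)

    def finish(self):
        # Plays the role of clFinish(): run everything that was queued.
        for kernel in self.pending:
            kernel(self.buffers)
        self.pending.clear()

    def read_all(self):
        # Plays the role of reading every cl_mem object back to the host.
        return copy.deepcopy(self.buffers)

    def write_all(self, snapshot):
        # Plays the role of writing a saved snapshot back to the device.
        self.buffers = copy.deepcopy(snapshot)


def increment(buffers):
    """Toy kernel: add 1 to every element of buffer "x"."""
    buffers["x"] = [v + 1 for v in buffers["x"]]


def run_with_checkpoints(device, total_steps, interval, fail_at=None):
    """Run `total_steps` kernels, checkpointing every `interval` steps.

    If `fail_at` is given, one transient fault is injected at that step.
    """
    checkpoint = (0, device.read_all())
    step = 0
    failed_once = False
    while step < total_steps:
        try:
            if step == fail_at and not failed_once:
                failed_once = True
                # Simulate the fault: device memory becomes garbage.
                device.buffers["x"] = [-1] * len(device.buffers["x"])
                raise TransientFault
            device.enqueue(increment)
            step += 1
            if step % interval == 0:
                device.finish()
                checkpoint = (step, device.read_all())
        except TransientFault:
            # Drop queued work and roll back to the last checkpoint.
            device.pending.clear()
            step, snapshot = checkpoint
            device.write_all(snapshot)
    device.finish()
    return device.read_all()


if __name__ == "__main__":
    dev = FakeDevice({"x": [0, 0]})
    print(run_with_checkpoints(dev, total_steps=10, interval=4, fail_at=6))
```

    The trade-off is the checkpoint interval: a short interval means less work is redone after a fault, but more time is spent waiting on the queue and copying buffers back to the host.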

  3. #3
    Junior Member
    Join Date
    Sep 2009
    Posts
    2

    Re: Reliable distributed computing w/ OpenCL, anyone doing this?

    I imagined there wouldn't be much out there. That's actually great, since I can work on anything without worrying that it has already been done. How about any similar projects in CUDA or anything of that sort? It might give me a starting point.
    Thank you for the tip about clFinish(). I have been leaning toward that exact topic of creating a "reliability framework".

  4. #4
    Junior Member
    Join Date
    Apr 2010
    Posts
    1

    Re: Reliable distributed computing w/ OpenCL, anyone doing this?

    Indeed, it's hard to find anything about distributed OpenCL. But it exists!

    A year ago I began a project to create an open source distributed OpenCL framework. It's in its early stages, but I already have a lot of code done. Progress is slow since I'm working alone, but I'm trying to find collaborators.

    If you want to take a look: http://sourceforge.net/projects/openclgrid/

  5. #5

    Re: Reliable distributed computing w/ OpenCL, anyone doing this?

    There's also another project being talked about here: viewtopic.php?f=40&t=2536&start=0

    http://www.sourceforge.net/projects/clara/

  6. #6
    Junior Member
    Join Date
    Nov 2009
    Posts
    14

    Re: Reliable distributed computing w/ OpenCL, anyone doing t

    Now, two years later, is anyone working on a checkpoint/restart facility for OpenCL?

    How about adding a checkpoint/restart feature to the OpenCL standard?

    The idea is a function call that stores in memory everything about a given OpenCL context, so that the program can relinquish the GPU(s) and later continue from where it left off. This is a standard feature for CPU programs, but to my knowledge it currently has no GPU equivalent.
