Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] fault tolerance in open mpi
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2009-08-03 11:07:53

Task-farm or manager/worker recovery models typically depend on
intercommunicators (i.e., from MPI_Comm_spawn) and a resilient MPI
implementation. William Gropp and Ewing Lusk have a paper entitled
"Fault Tolerance in MPI Programs" that outlines how an application
might take advantage of these features in order to recover from
process failure.
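
In rough pseudocode, that recovery pattern looks like the following. This is a runnable sketch in plain Python rather than real MPI code, and all names in it are illustrative: in a real program the workers would be spawned with MPI_Comm_spawn, the communicator's error handler would be set to MPI_ERRORS_RETURN, and a failed worker would surface as a non-MPI_SUCCESS return code from a send/receive. Here failure is injected through the `fails_at` parameter so the control flow can run anywhere.

```python
# Sketch of the manager/worker recovery pattern: the manager tracks
# which task each worker holds, and when a worker fails, it requeues
# that worker's task for a surviving worker.
# (Assumes at least one worker survives.)

def run_farm(tasks, workers, fails_at=None):
    """Complete every task even if some workers die mid-run."""
    fails_at = fails_at or set()     # {(worker, task)} pairs that fail
    results = {}
    pending = list(tasks)            # tasks not yet handed out
    alive = set(workers)
    outstanding = {}                 # worker -> task currently assigned
    while pending or outstanding:
        # Hand out work to every idle, live worker.
        for w in sorted(alive - set(outstanding)):
            if pending:
                outstanding[w] = pending.pop(0)
        # "Receive" one result; a dead worker surfaces as an error.
        w, task = next(iter(outstanding.items()))
        del outstanding[w]
        if (w, task) in fails_at:
            alive.discard(w)         # stop assigning work to it
            pending.insert(0, task)  # requeue the lost task
        else:
            results[task] = task * task  # stand-in computation
    return results
```

With two workers and worker "A" dying on its first task, all three tasks still complete: run_farm([1, 2, 3], ["A", "B"], fails_at={("A", 1)}) returns the squares of all three tasks, with the lost task recomputed by "B".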

However, these techniques depend strongly on a resilient MPI
implementation, and on behaviors that, some may argue, are
non-standard. Unfortunately, there are not many MPI implementations
that are sufficiently resilient in the face of process failure to
support recovery in task-farm scenarios. Though Open MPI supports the
current MPI 2.1 standard, it is not as resilient to process failure as
it could be.

There are a number of people working on improving the resiliency of
Open MPI in the face of network and process failure (including
myself). We have started to move some of the resiliency work into the
Open MPI trunk. Resiliency in Open MPI has been improving over the
past few months, but I would not assess it as ready quite yet. Most of
the work has focused on the runtime level (ORTE), and there are still
some MPI level (OMPI) issues that need to be worked out.

With all of that being said, I would try some of the techniques
presented in the Gropp/Lusk paper in your application. Then test it
with Open MPI and let us know how it goes.


On Aug 3, 2009, at 10:30 AM, Durga Choudhury wrote:

> Is that kind of approach possible within an MPI framework? Perhaps a
> grid approach would be better. More experienced people, speak up,
> please?
> (The reason I ask is that I too am interested in solving that kind
> of problem, where an individual blade of a blade server fails, and
> correcting for that failure on the fly is better than taking
> checkpoints and restarting the whole process excluding the failed
> blade.)
> Durga
> On Mon, Aug 3, 2009 at 9:21 AM, jody<jody.xha_at_[hidden]> wrote:
>> Hi
>> I guess "task-farming" could give you a certain amount of the kind
>> of fault tolerance you want (i.e., a master process distributes
>> tasks to idle slave processes - however, this will only work if the
>> slave processes don't need to communicate with each other).
>> Jody
>> On Mon, Aug 3, 2009 at 1:24 PM, vipin kumar<vipinkumar41_at_[hidden]>
>> wrote:
>>> Hi all,
>>> Thanks Durga for your reply.
>>> Jeff, you once wrote code for the Mandelbrot set to demonstrate
>>> fault tolerance in LAM/MPI, i.e., killing any slave process doesn't
>>> affect the others. That is exactly the behaviour I am looking for
>>> in Open MPI. I attempted it, but had no luck. Can you please tell
>>> me how to write such programs in Open MPI?
>>> Thanks in advance.
>>> Regards,
>>> On Thu, Jul 9, 2009 at 8:30 PM, Durga Choudhury
>>> <dpchoudh_at_[hidden]> wrote:
>>>> Although I have perhaps the least experience on this topic on
>>>> this list, I will take a shot; more experienced people, please
>>>> correct me:
>>>> The MPI standards specify communication mechanisms, not fault
>>>> tolerance at any level. You may achieve network fault tolerance
>>>> at the IP level by implementing 'equal cost multipath' routes
>>>> (i.e., two equally capable NICs connecting to the same
>>>> destination, with the kernel routing table modified to use both
>>>> cards; the kernel will dynamically load balance). At the MAC
>>>> level, you can achieve the same effect by trunking multiple
>>>> network cards.
>>>> You can achieve process-level fault tolerance with a
>>>> checkpointing scheme such as BLCR, which has been tested to work
>>>> with Open MPI (and with other software as well).
>>>> Durga
>>>> On Thu, Jul 9, 2009 at 4:57 AM, vipin
>>>> kumar<vipinkumar41_at_[hidden]> wrote:
>>>>> Hi all,
>>>>> I want to know whether Open MPI supports network and process
>>>>> fault tolerance. An example demonstrating these features would
>>>>> be best.
>>>>> Regards,
>>>>> --
>>>>> Vipin K.
>>>>> Research Engineer,
>>>>> C-DOTB, India
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>> --
>>> Vipin K.
>>> Research Engineer,
>>> C-DOTB, India