Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] about MPI
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-06-30 08:27:56


On Jun 29, 2010, at 9:35 PM, Íõî£ wrote:

> Thanks for the feedback. More below:
>
> Are there any MPI implementations that meet the following requirements?
>
> 1. It doesn't terminate the whole job when a node dies.
>
> 2. It allows a spare node to replace the dead node and take over its work.
>
> As far as I know, FT-MPI meets both requirements, but it hasn't been updated since 2004. Open MPI is said to combine several projects, including FT-MPI, but so far it only provides checkpoint/restart as a means of fault tolerance.

I know that the UT people have been working on such things over the past few years, but I don't know the current status.

-- 
Jeff Squyres
jsquyres_at_[hidden]