Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] about MPI
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-06-29 07:10:45


On Jun 29, 2010, at 3:44 AM, 王睿 wrote:

> 1. Suppose an MPI program runs across several nodes. If one node dies, will the program terminate?

Open MPI will terminate the whole job, yes.

> 2. Is it possible to grow or shrink an MPI communicator? If so, could a spare node be used to replace the dead node?

Currently, no.

Fault tolerance and resiliency are active topics of research and discussion in the MPI-3 forum. For the moment, though, most MPI implementations -- including Open MPI -- respond to the loss of a process and/or node in a fairly draconian way: they kill the rest of the job.
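To make the "kill the rest of the job" policy concrete, here is a rough toy model in Python. It is not Open MPI's launcher code and uses no MPI at all: the names run_job and worker are hypothetical, and each "rank" is just a Python subprocess. The sketch only shows the policy itself, assuming a launcher that watches every process and tears the job down as soon as one exits abnormally.

```python
import subprocess
import sys
import time


def worker(seconds, exit_code=0):
    """Build a command for a stand-in 'rank': sleep, then exit with exit_code."""
    code = f"import sys, time; time.sleep({seconds}); sys.exit({exit_code})"
    return [sys.executable, "-c", code]


def run_job(worker_cmds, poll_interval=0.05):
    """Launch every command; if any exits non-zero, terminate the rest.

    Returns (ok, exit_codes), where ok is True only if all workers exited 0.
    """
    procs = [subprocess.Popen(cmd) for cmd in worker_cmds]
    failed = False
    while True:
        codes = [p.poll() for p in procs]  # None means still running
        if any(c is not None and c != 0 for c in codes):
            failed = True  # one process was lost: abort the whole job
            break
        if all(c == 0 for c in codes):
            break  # every process finished cleanly
        time.sleep(poll_interval)

    if failed:
        for p in procs:
            if p.poll() is None:
                p.terminate()  # the "draconian" step: kill the survivors
    for p in procs:
        p.wait()
    return (not failed, [p.returncode for p in procs])


if __name__ == "__main__":
    # The middle process fails after 0.2 s; the other two would have run 30 s.
    ok, codes = run_job([worker(30), worker(0.2, exit_code=1), worker(30)])
    print("job succeeded:", ok)
    print("exit codes:", codes)
```

In this model the two healthy processes never get to finish: they are terminated shortly after the middle one fails, so the job as a whole is reported as failed. Replacing the dead process with a spare, as asked in question 2, would mean restarting work and rebuilding the group of processes, and that is exactly the part that has no counterpart here.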

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/