Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mpiexec option for node failure
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-09-12 20:43:48

We don't have anything similar in OMPI. There are fault tolerance modes, but not like the one you describe.

On Sep 12, 2011, at 5:52 PM, Rob Stewart wrote:

> Hi,
> I have implemented a simple fault-tolerant ping-pong program in C with MPI, here:
> MPICH2 offers a parameter with mpiexec:
> $ mpiexec -disable-auto-cleanup
> .. as described here:
> It is fault-tolerant in the sense that, when I ssh to one of the nodes in the hosts file and kill the relevant process, the MPI job is not terminated. The ping simply gets no pong from the dead node, and the ping-pong continues indefinitely on the remaining live nodes.
> Is such a feature available in Open MPI, either via mpiexec or by some other means?
> --
> Rob Stewart
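
The behaviour described in the quoted message (a dead node stops answering, while the remaining nodes keep exchanging pings) can be modelled without any MPI calls. The sketch below is only an illustration under stated assumptions. It is not Rob's program, and it is not an Open MPI or MPICH2 API. The names ping_fn, ping_round and fake_ping are hypothetical, and marking a silent node as dead so that later rounds skip it is a design choice of this sketch, not something the thread specifies.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transport hook: send a ping to `node` and report whether a
 * pong came back. In a real MPI program this would wrap the actual
 * send/receive calls plus some timeout; here it is left abstract. */
typedef bool (*ping_fn)(int node, void *ctx);

/* Ping every node still marked alive. A node that does not answer is marked
 * dead so that later rounds skip it (an assumption of this sketch).
 * Returns the number of pongs received in this round. */
int ping_round(bool *alive, size_t n, ping_fn ping, void *ctx)
{
    int pongs = 0;
    for (size_t i = 0; i < n; i++) {
        if (!alive[i])
            continue;            /* already known dead: do not ping again */
        if (ping((int)i, ctx))
            pongs++;             /* live node answered */
        else
            alive[i] = false;    /* no pong: treat the node as failed */
    }
    return pongs;
}

/* Fake transport for demonstration: `ctx` points to a table of killed
 * nodes, and a node answers unless its entry is true. */
bool fake_ping(int node, void *ctx)
{
    const bool *killed = ctx;
    return !killed[node];
}
```

Calling ping_round in a loop keeps the exchange going on whichever nodes still answer. Whether the job itself survives the loss of a process is decided by the launcher, for example the MPICH2 mpiexec option quoted above, and not by this loop.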