Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] mpiexec option for node failure
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-09-12 20:43:48


We don't have anything similar in OMPI. There are fault tolerance modes, but not like the one you describe.

On Sep 12, 2011, at 5:52 PM, Rob Stewart wrote:

> Hi,
>
> I have implemented a simple fault-tolerant ping-pong program in C with MPI, available here: http://pastebin.com/7mtmQH2q
>
> MPICH2 offers a parameter with mpiexec:
> $ mpiexec -disable-auto-cleanup
>
> .. as described here: http://trac.mcs.anl.gov/projects/mpich2/ticket/1421
>
> It is fault tolerant in the sense that when I ssh into one of the nodes listed in the hosts file and kill the relevant process, the MPI job is not terminated. A ping simply gets no pong from the dead node, and the ping-pong continues indefinitely on the remaining live nodes.
>
> Is such a feature available for Open MPI, either via mpiexec or some other means?
>
>
> --
> Rob Stewart
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users