Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Detecting Node Failure
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-06-20 20:46:03


We will also support that in the developer's trunk fairly soon, and it
will then appear later in the 1.9 series.

On Thu, Jun 20, 2013 at 4:18 PM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:

> Not at present, no.
>
> But you might want to look at a fork of the OMPI code base that was
> exploring fault resilience issues:
>
> http://fault-tolerance.org/
>
>
> On Jun 20, 2013, at 5:57 PM, Andreas Schäfer <gentryx_at_[hidden]>
> wrote:
>
> > On 14:59 Thu 20 Jun, Ralph Castain wrote:
> >> It should detect and abort - what version are you using?
> >
> > Would it be possible to call MPI_Comm_disconnect() when the
> > communicator in question is an intercommunicator, without having
> > OMPI abort?
> >
> > I'm asking because if we had a possibility to dynamically
> > connect/disconnect nodes in a robust way, then we could build
> > fault-resilient apps on top of that.
> >
> > Best
> > -Andreas
> >
> >
> > --
> > ==========================================================
> > Andreas Schäfer
> > HPC and Grid Computing
> > Chair of Computer Science 3
> > Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
> > +49 9131 85-27910
> > PGP/GPG key via keyserver
> > http://www.libgeodecomp.org
> > ==========================================================
> >
> > (\___/)
> > (+'.'+)
> > (")_(")
> > This is Bunny. Copy and paste Bunny into your
> > signature to help him gain world domination!
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>