
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] How can I achieve node fail over
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2010-01-11 15:43:36

On Jan 6, 2010, at 9:04 AM, Sai Sudheesh wrote:

> Hi,
> Just about two months ago I started experimenting with Open MPI.
> I found this piece of software very interesting.
> How can I make this software fault tolerant?

Depends on what you mean by fault tolerant. :)

> As of now I am running this software on two machines
> with quad-core processors and Fedora 10.
> I am using Open MPI 1.3.2.
> If a remote machine fails while a parallel task is running on both
> machines,
> is it possible to reassign the task assigned to it to some
> other available node and
> continue the computation instead of aborting the entire
> computation?

This scenario is currently not supported by Open MPI. If an MPI
process fails, Open MPI will clean up the job.

A few of us have been working on this scenario off-trunk for a while
now. It is progressing nicely, but not available for public
consumption just yet.

> Can anybody tell me where I have to look for more information
> about this?
> I have tried FT-MPI but grew tired of it.

FT-MPI should be able to work in this scenario.

> I have also heard of CIFTS-FTB, can I use for solving this?

The CIFTS FTB is focused on a slightly different problem, that of
coordination amongst software components before/during/after a
failure. Currently, Open MPI is able to interact with the CIFTS FTB to
send fault information. Soon, Open MPI will be able to respond to such
fault information and take appropriate actions. The first generation
of this work is scheduled to be brought into the Open MPI trunk soon,
and will support catching of some basic events. Handling the scenario
you mentioned at the top of the message will come shortly thereafter.

> Is it necessary to make a source code change?

In some cases yes, in others no. It really depends on what the final
solution set looks like and how involved your application wants to be
in the recovery process. At the very least, the application will
likely have to specify the MPI_ERRORS_RETURN error handler for each
communicator to override the default MPI_ERRORS_ARE_FATAL.
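As a minimal sketch of that error-handler change, the C program below overrides the default handler on MPI_COMM_WORLD so that MPI calls return error codes instead of aborting the job; the broadcast and the recovery comment are illustrative assumptions, not part of the question above:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Return error codes to the caller instead of aborting the
     * whole job (the default MPI_ERRORS_ARE_FATAL behavior). */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Any MPI call may now report a failure instead of terminating. */
    int token = rank;
    int rc = MPI_Bcast(&token, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (rc != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(rc, msg, &len);
        fprintf(stderr, "rank %d: broadcast failed: %s\n", rank, msg);
        /* Application-specific recovery logic would go here. */
    }

    MPI_Finalize();
    return 0;
}
```

Returning from a failed call is only the first step; what the application can safely do with the surviving processes afterwards depends on the fault tolerance support in the MPI implementation.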

> Does anybody already have a solution?

There are a couple of transparent fault tolerance solutions in the
current trunk:
  - Checkpoint/Restart of the entire MPI job (requires a full job
restart on failure)
  - Message Logging
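For the checkpoint/restart option, the typical workflow (assuming Open MPI was built with checkpoint/restart support and a checkpointer such as BLCR; the application name and `<mpirun_pid>` below are placeholders) looks roughly like this:

```shell
# Launch the job with checkpoint/restart support enabled
mpirun -np 8 -am ft-enable-cr ./my_mpi_app

# From another terminal, checkpoint the running job;
# <mpirun_pid> is the PID of the mpirun process
ompi-checkpoint <mpirun_pid>

# After a failure, restart the entire job from the saved snapshot
ompi-restart ompi_global_snapshot_<mpirun_pid>.ckpt
```

Note that this restarts the whole job from the snapshot; it does not migrate only the failed process.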

For non-MPI jobs you could also check out the Open Resilient Cluster
Manager (ORCM) project.

> If an application is killed by the OS at the remote node,
> mpirun aborts and reports an error.
> What kind of signal does the remote orted send to mpirun?
> How can I handle it?

I'm not sure what you're asking here. The orted detects the local
process failure and notifies the mpirun process over the OOB (out-of-
band) communication channel. The mpirun process then initiates the
shutdown procedure.

-- Josh

> I know that I have asked a lot of questions.
> I would be thankful if anybody could respond with
> at least some suggestions.
> with love
> sudheesh.