Open MPI User's Mailing List Archives

From: Edgar Gabriel (gabriel_at_[hidden])
Date: 2006-03-02 12:19:05


Open MPI currently does not fully support proper disconnection of
parent and child processes. Thus, if a child dies or aborts, the parents
will abort as well, despite calling MPI_Comm_disconnect. (The new RTE
will have better support for these operations; Ralph or Jeff can probably
give a better estimate of when this will be available.)

However, what should not happen is that the parent goes down at the
same time when the child calls MPI_Finalize (that is, a proper shutdown
rather than a violent death). Let me check that as well...

Brignone, Sergio wrote:

> Hi everybody,
>
> I am trying to run a master/slave set.
>
> Because of the nature of the problem I need to start and stop (kill)
> some slaves.
>
> The problem is that as soon as one of the slaves dies, the master dies
> as well.
>
> This is what I am doing:
>
> MASTER:
>
> MPI_Init(...);
> MPI_Comm_spawn(slave1,...,nslave1,...,&intercomm1);
> MPI_Barrier(intercomm1);
> MPI_Comm_disconnect(&intercomm1);
> MPI_Comm_spawn(slave2,...,nslave2,...,&intercomm2);
> MPI_Barrier(intercomm2);
> MPI_Comm_disconnect(&intercomm2);
> MPI_Finalize();
>
> SLAVE:
>
> MPI_Init(...);
> MPI_Comm_get_parent(&intercomm);
> (does something)
> MPI_Barrier(intercomm);
> MPI_Comm_disconnect(&intercomm);
> MPI_Finalize();
>
> The issue is that as soon as the first set of slaves calls MPI_Finalize,
> the master dies as well (right after MPI_Comm_disconnect(&intercomm1)).
>
> What am I doing wrong?
>
> Thanks
>
> Sergio
>