Open MPI User's Mailing List Archives

From: Ralph Castain (rhc_at_[hidden])
Date: 2006-03-02 13:55:03


We expect to have much better support for the entire comm_spawn process in the next incarnation of the RTE. I don't expect that to be included in a release, however, until 1.1 (Jeff may be able to give you an estimate for when that will happen).

Jeff et al. may be able to give you access to an early non-release version sooner, if better comm_spawn support is a critical issue for you and you don't mind being patient with the inevitable bugs in such versions.

Ralph


Edgar Gabriel wrote:
Open MPI currently does not fully support a proper disconnection of 
parent and child processes. Thus, if a child dies or aborts, the parents 
will abort as well, despite calling MPI_Comm_disconnect. (The new RTE 
will have better support for these operations; Ralph/Jeff can probably 
give a better estimate of when this will be available.)

However, what should not happen is that, if the child calls MPI_Finalize 
(so not a violent death but a proper shutdown), the parent goes down at 
the same time. Let me check that as well...

Brignone, Sergio wrote:

  
Hi everybody,

I am trying to run a master/slave setup.

Because of the nature of the problem, I need to start and stop (kill) 
some slaves.

The problem is that as soon as one of the slaves dies, the master dies as well.

This is what I am doing:

MASTER:

MPI_Init(...);
MPI_Comm_spawn(slave1,...,nslave1,...,&intercomm1);
MPI_Barrier(intercomm1);
MPI_Comm_disconnect(&intercomm1);
MPI_Comm_spawn(slave2,...,nslave2,...,&intercomm2);
MPI_Barrier(intercomm2);
MPI_Comm_disconnect(&intercomm2);
MPI_Finalize();

SLAVE:

MPI_Init(...);
MPI_Comm_get_parent(&intercomm);
(does something)
MPI_Barrier(intercomm);
MPI_Comm_disconnect(&intercomm);
MPI_Finalize();

The issue is that as soon as the first set of slaves calls MPI_Finalize, 
the master dies as well (it dies right after MPI_Comm_disconnect(&intercomm1)).

What am I doing wrong?

Thanks,

Sergio

------------------------------------------------------------------------

_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
    

