Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

From: Ralph Castain (rhc_at_[hidden])
Date: 2006-03-02 13:55:03


We expect to have much better support for the entire comm_spawn process in the next incarnation of the RTE. I don't expect that to be included in a release, however, until 1.1 (Jeff may be able to give you an estimate for when that will happen).

Jeff et al may be able to give you access to an early non-release version sooner, if better comm_spawn support is a critical issue and you don't mind being patient with the inevitable bugs in such versions.

Ralph


Edgar Gabriel wrote:
Open MPI currently does not fully support a proper disconnection of 
parent and child processes. Thus, if a child dies or aborts, the parent 
will abort as well, despite having called MPI_Comm_disconnect. (The new RTE 
will have better support for these operations; Ralph or Jeff can probably 
give a better estimate of when this will be available.)

However, what should not happen is that the parent goes down when the 
child calls MPI_Finalize (a proper shutdown rather than a violent 
death). Let me check that as well...

Brignone, Sergio wrote:

  
Hi everybody,

I am trying to run a master/slave set.

Because of the nature of the problem I need to start and stop (kill) 
some slaves.

The problem is that as soon as one of the slaves dies, the master dies too.

This is what I am doing:

MASTER:

MPI_Init(...)
MPI_Comm_spawn(slave1,...,nslave1,...,intercomm1);
MPI_Barrier(intercomm1);
MPI_Comm_disconnect(&intercomm1);
MPI_Comm_spawn(slave2,...,nslave2,...,intercomm2);
MPI_Barrier(intercomm2);
MPI_Comm_disconnect(&intercomm2);
MPI_Finalize();

SLAVE:

MPI_Init(...)
MPI_Comm_get_parent(&intercomm);
(does something)
MPI_Barrier(intercomm);
MPI_Comm_disconnect(&intercomm);
MPI_Finalize();

The issue is that as soon as the first set of slaves calls MPI_Finalize, 
the master dies as well (right after MPI_Comm_disconnect(&intercomm1)).

What am I doing wrong?

Thanks

Sergio

------------------------------------------------------------------------

_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
    

