I'm afraid there is no option to keep the job alive if the parent exits. I could give you several reasons for that behavior, but the bottom line is that it can't be changed.

Why don't you have the parent loop on sleep(), waking up periodically to check for a "we are done" message from a child? That would take essentially no CPU time and would meet your need.

On Jun 16, 2012, at 6:03 AM, Roland Schulz wrote:


I would like to start a single process without mpirun and then use MPI_Comm_spawn to start as many processes as required. I don't want the parent process to take up any resources, so I tried to disconnect the inter-communicator, finalize MPI, and exit the parent. But as soon as I do that, the children exit too. Why is that? Can I somehow change that behavior? Or can I wait for the children to exit without the waiting taking up CPU time?

The reason I don't need the parent once the children are spawned is that I need one intra-communicator over all processes, and as far as I know I cannot join the parent and children into one intra-communicator.

The purpose of the whole exercise is that I want my program to use all cores of a node by default when executed without mpirun.

I have tested this with Open MPI 1.4.5. A sample program is here: http://pastebin.com/g2XSZwvY . "Child finalized" is printed only if the sleep(2) in the parent is not commented out.


ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309