One further point that I missed in my earlier note: if you are starting the parent as a singleton, then you are fooling yourself about the "without mpirun" comment. A singleton immediately starts a local daemon to act as mpirun so that MPI_Comm_spawn will work; otherwise, there would be no way to launch the child processes.
So you might as well just launch the "child" job directly with mpirun - the result is exactly the same. If you truly want the job to use all the cores, one proc per core, and don't want to tell it the number of cores, then use the OMPI devel trunk where we have added support for such patterns. All you would have to do is:
mpirun -ppr 1:core --bind-to core ./my_app
and you are done.
On Jun 18, 2012, at 4:27 AM, TERRY DONTJE wrote:
> On 6/16/2012 8:03 AM, Roland Schulz wrote:
>> I would like to start a single process without mpirun and then use MPI_Comm_spawn to start up as many processes as required. I don't want the parent process to take up any resources, so I tried to disconnect the inter communicator and then finalize mpi and exit the parent. But as soon as I do that the children exit too. Why is that? Can I somehow change that behavior? Or can I wait on the children to exit without the waiting taking up CPU time?
>> The reason I don't need the parent once the children are spawned is that I need one intra-communicator over all processes. And as far as I know, I cannot join the parent and children into one intra-communicator.
> You could use MPI_Intercomm_merge to create an intra-communicator out of the two groups in an inter-communicator. Pass it the inter-communicator you get back from the MPI_Comm_spawn call.
>> The purpose of the whole exercise is that I want my program to use all cores of a node by default when executed without mpirun.
>> I have tested this with OpenMPI 1.4.5. A sample program is here: http://pastebin.com/g2XSZwvY . "Child finalized" is only printed with the sleep(2) in the parent not commented out.
>> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
>> 865-241-1537, ORNL PO BOX 2008 MS6309
>> users mailing list
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.dontje_at_[hidden]