
Open MPI User's Mailing List Archives


From: Edgar Gabriel (gabriel_at_[hidden])
Date: 2006-03-02 14:49:11

In my tests, Open MPI did follow the machinefile (see the output
further below). However, each spawn operation starts again from the very
beginning of the machinefile.

The following example spawns 5 child processes (with a single
MPI_Comm_spawn), and each child prints its rank and the hostname.

gabriel_at_linux12 ~/dyncomm $ mpirun -hostfile machinefile -np 3
  Checking for MPI_Comm_spawn.....................working
Hello world from child 0 on host linux12
Hello world from child 1 on host linux13
Hello world from child 3 on host linux15
Hello world from child 4 on host linux16
      Testing Send/Recv on the intercomm..........working
Hello world from child 2 on host linux14

with the machinefile being:
gabriel_at_linux12 ~/dyncomm $ cat machinefile

In your code, you always spawn one process at a time. Since every spawn
starts again at the beginning of the machinefile, they all end up on the
same node.

Hope this helps...

Edgar Gabriel wrote:

> as far as I know, Open MPI should follow the machinefile for spawn
> operations, starting however for every spawn at the beginning of the
> machinefile again. An info object such as 'lam_sched_round_robin' is
> currently not available/implemented. Let me look into this...
> Jean Latour wrote:
>>Testing the MPI_Comm_Spawn function of Open MPI version 1.0.1, I have an
>>example that works OK,
>>except that it shows that the spawned processes do not follow the
>>"machinefile" setting of processors.
>>In this example a master process first spawns 2 processes, then
>>disconnects from them and spawns 2 more
>>processes. Running on a Quad Opteron node, all processes are running on
>>the same node, although the
>>machinefile specifies that the slaves should run on different nodes.
>>With the current version of Open MPI, is it possible to direct the spawned
>>processes to
>>a specific node? (The node distribution could be given in the
>>"machinefile" file, as with LAM MPI.)
>>The code (Fortran 90) of this example and makefile is attached as a tar
>>Thank you very much
>>Jean Latour
>>users mailing list

Edgar Gabriel
Assistant Professor
Department of Computer Science          email:gabriel_at_[hidden]
University of Houston         
Philip G. Hoffman Hall, Room 524        Tel: +1 (713) 743-3857
Houston, TX-77204, USA                  Fax: +1 (713) 743-3335