Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with mpi_comm_spawn_multiple
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-05-05 14:47:20


I think OMPI is okay - here is a C sample program and the associated output:

$ mpirun -np 3 ./spawn_multiple
Parent [pid 98895] about to spawn!
Parent [pid 98896] about to spawn!
Parent [pid 98897] about to spawn!
Parent done with spawn
Parent sending message to children
Parent done with spawn
Parent done with spawn
Hello from the child 0 of 2 on host Ralph pid 98898: argv[1] = foo
Child 0 received msg: 38
Hello from the child 1 of 2 on host Ralph pid 98899: argv[1] = bar
Parent disconnected
Parent disconnected
Child 1 disconnected
Child 0 disconnected
Parent disconnected


On May 5, 2010, at 12:08 PM, Fred Marquis wrote:

> Hi,
>
> I am using mpi_comm_spawn_multiple to spawn multiple commands with argument lists. I am trying to do this in Fortran 77 using openmpi-1.4.1 and the ifort compiler v9.0. The operating system is SuSE Linux 10.1 (x86-64).
>
> I have put together a simple controlling example program (test_pbload.F) and an example slave program (spray.F) to try and explain my problem.
>
> In the controlling program mpi_comm_spawn_multiple is used to start 2 copies of the slave. The first is started with the argument list "1 2 3 4" and the second with "4 5 6 7".
>
> The slaves start OK, print out their argument lists and exit. In addition the slaves print out their rank numbers so I can see which argument list belongs to which slave.
>
> What I am finding is that the argument lists are not being sent to the slaves correctly: both slaves seem to be getting both argument lists!
>
> To compile and run the programs I follow the steps below.
>
> Controlling program "test_pbload.F"
>
> mpif77 -o test_pbload test_pbload.F
>
> Slave program "spray.F"
>
> mpif77 -o spray spray.F
>
> Run the controller
>
> mpirun -np 1 test_pbload
>
>
>
>
> The output from the first slave is:
>
> nsize, mytid: iargs 2 0 : 2
> spray: 0 1:1 2 3 4 < FIRST ARGUMENT
> spray: 0 2:4 5 6 7 < SECOND ARGUMENT
>
> and the second slave:
>
> nsize, mytid: iargs 2 1 : 2
> spray: 1 1:1 2 3 4 < FIRST ARGUMENT
> spray: 1 2:4 5 6 7 < SECOND ARGUMENT
>
> In each case the arguments (2 in both cases) are the same.
>
> I have written a C version of the controlling program and everything works as expected, so I presume that I have either got the specification of the argument list wrong or I have discovered an error/bug. At the moment I am working on the former, but am at a loss to see what is wrong!
>
> Any help, pointers etc really appreciated.
>
>
> Controlling program (that uses MPI_COMM_SPAWN_MULTIPLE) test_pbload.F
>
>       program main
> c
>       implicit none
> #include "mpif.h"
>
>       integer error
>       integer intercomm
>       CHARACTER*25 commands(2), argvs(2, 2)
>       integer nprocs(2),info(2),ncpus
> c
>       call mpi_init(error)
> c
>       ncpus = 2
> c
>       commands(1) = ' ./spray '
>       nprocs(1) = 1
>       info(1) = MPI_INFO_NULL
>       argvs(1, 1) = ' 1 2 3 4 '
>       argvs(1, 2) = ' '
> c
>       commands(2) = ' ./spray '
>       nprocs(2) = 1
>       info(2) = MPI_INFO_NULL
>       argvs(2, 1) = ' 4 5 6 7 '
>       argvs(2, 2) = ' '
> c
>       call mpi_comm_spawn_multiple( ncpus,
>      1     commands, argvs, nprocs, info,
>      2     0, MPI_COMM_WORLD, intercomm,
>      3     MPI_ERRCODES_IGNORE, error )
> c
>       call mpi_finalize(error)
> c
>       end
>
> Slave program (started by the controlling program) spray.F
>
>       program main
>       integer error
>       integer pid
>       character*20 line(100)
>       call mpi_init(error)
> c
>       CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NSIZE,error)
>       CALL MPI_COMM_RANK(MPI_COMM_WORLD,MYTID,error)
> c
>       iargs=iargc()
>       write(*,*) 'nsize, mytid: iargs', nsize, mytid, ":", iargs
> c
>       if( iargs.gt.0 ) then
>          do i = 1, iargs
>             call getarg(i,line(i))
>             write(*,'(1x,a,i3,20(i2,1h:,a))')
>      1           'spray: ',mytid,i,line(i)
>          enddo
>       endif
> c
>       call mpi_finalize(error)
> c
>       end