Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with mpi_comm_spawn_multiple
From: Andrew J Marquis (a.marquis_at_[hidden])
Date: 2010-05-05 15:18:16


Dear Ralph,

  Thanks for that. I have done much the same (as I indicated in my original post). In this case my C program correctly spawned the slaves, and the slaves printed the correctly passed argument lists. Running this with my Fortran slave I get:

 nsize, mytid: iargs 2 0 : 1
 spray: 0 1:1 2 3 4

 nsize, mytid: iargs 2 1 : 1
 spray: 1 1:5 6 7 8

which is what I expect.

I still think the error may well be mine rather than OMPI's, but I am at a loss to see what is going on!

Thanks for the help so far,

Fred Marquis.

c-program
=========
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char *argv[] )
{
    int np[2] = { 1, 1 };                       /* one process per command */
    int errcodes[2];
    char *cmds[2] = { "./spray", "./spray" };
    /* Each argv list is NULL-terminated and holds a single argument string. */
    char *argv0[] = { "1 2 3 4", (char *)0 };
    char *argv1[] = { "5 6 7 8", (char *)0 };
    char **array_of_argv[2] = { argv0, argv1 };
    MPI_Comm intercomm;
    MPI_Info infos[2] = { MPI_INFO_NULL, MPI_INFO_NULL };

    MPI_Init( &argc, &argv );
    /* Spawn both commands at once; root is rank 0 of MPI_COMM_WORLD. */
    MPI_Comm_spawn_multiple( 2, cmds, array_of_argv, np, infos,
                             0, MPI_COMM_WORLD, &intercomm, errcodes );
    MPI_Finalize();
    return 0;
}
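
For comparison of the two interfaces: in C each command gets its own NULL-terminated argv list, whereas the Fortran binding takes one two-dimensional CHARACTER array (argvs(2, 2) in the quoted program) in which, by the MPI standard's convention, each command's list ends at the first all-blank string. The following is only a minimal sketch of how such a column-major, blank-padded array maps onto per-command argv lists. It is not Open MPI's implementation; the helper names fstr_to_cstr and row_to_argv are hypothetical, and it assumes fixed-length fields that contain no NUL bytes.

```c
#include <stdlib.h>
#include <string.h>

/* Copy one blank-padded Fortran CHARACTER field of `len` bytes into a
 * newly allocated, NUL-terminated C string, dropping trailing blanks.
 * (Hypothetical helper, for illustration only.) */
static char *fstr_to_cstr(const char *field, size_t len)
{
    while (len > 0 && field[len - 1] == ' ')
        len--;
    char *s = malloc(len + 1);
    if (s == NULL)
        return NULL;
    memcpy(s, field, len);
    s[len] = '\0';
    return s;
}

/* Build the NULL-terminated argv for command `cmd` (0-based) from memory
 * laid out like the Fortran array CHARACTER*len argvs(count, maxargs).
 * Fortran arrays are column-major, so element (cmd, j) starts at byte
 * offset (j * count + cmd) * len. The list for a command ends at its
 * first all-blank element. (Hypothetical helper, for illustration only.) */
char **row_to_argv(const char *argvs, size_t len,
                   size_t count, size_t maxargs, size_t cmd)
{
    char **out = calloc(maxargs + 1, sizeof *out);
    if (out == NULL)
        return NULL;

    size_t n = 0;
    for (size_t j = 0; j < maxargs; j++) {
        char *s = fstr_to_cstr(argvs + (j * count + cmd) * len, len);
        if (s == NULL || s[0] == '\0') {   /* allocation failure or blank terminator */
            free(s);
            break;
        }
        out[n++] = s;
    }
    out[n] = NULL;
    return out;
}
```

With the four 25-character fields of argvs(2, 2) as they sit in memory, row_to_argv(buf, 25, 2, 2, 0) would return a list holding only ' 1 2 3 4', and row_to_argv(buf, 25, 2, 2, 1) a list holding only ' 4 5 6 7'. Reading the same memory as one contiguous list (count = 1) would instead return both strings for a single command, which resembles the output reported in the quoted message, although this sketch does not show that this is what happens inside the library.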

On Wed, May 05, 2010 at 07:47:20PM +0100, Ralph Castain wrote:
> I think OMPI is okay - here is a C sample program and the associated output:
>
> $ mpirun -np 3 ./spawn_multiple
> Parent [pid 98895] about to spawn!
> Parent [pid 98896] about to spawn!
> Parent [pid 98897] about to spawn!
> Parent done with spawn
> Parent sending message to children
> Parent done with spawn
> Parent done with spawn
> Hello from the child 0 of 2 on host Ralph pid 98898: argv[1] = foo
> Child 0 received msg: 38
> Hello from the child 1 of 2 on host Ralph pid 98899: argv[1] = bar
> Parent disconnected
> Parent disconnected
> Child 1 disconnected
> Child 0 disconnected
> Parent disconnected
>
> On May 5, 2010, at 12:08 PM, Fred Marquis wrote:
>
> > Hi,
> >
> > I am using mpi_comm_spawn_multiple to spawn multiple commands with argument lists. I am trying to do this in fortran (77) using version openmpi-1.4.1 and the ifort compiler v9.0. The operating system is SuSE Linux 10.1 (x86-64).
> >
> > I have put together a simple controlling example program (test_pbload.F) and an example slave program (spray.F) to try and explain my problem.
> >
> > In the controlling program mpi_comm_spawn_multiple is used to set 2 copies of the slave running. The first is started with the argument list "1 2 3 4" and the second with "5 6 7 8".
> >
> > The slaves are started OK and the slaves print out the argument lists and exit. In addition the slaves print out their rank numbers so I can see which argument list belongs to which slave.
> >
> > What I am finding is that the argument lists are not being sent to the slaves correctly; indeed, both slaves seem to be getting both argument lists!
> >
> > To compile and run the programs I follow the steps below.
> >
> > Controlling program "test_pbload.F"
> >
> > mpif77 -o test_pbload test_pbload.F
> >
> > Slave program "spray.F"
> >
> > mpif77 -o spray spray.F
> >
> > Run the controller
> >
> > mpirun -np 1 test_pbload
> >
> >
> >
> >
> > The output from the first slave is:
> >
> > nsize, mytid: iargs 2 0 : 2
> > spray: 0 1:1 2 3 4 < FIRST ARGUMENT
> > spray: 0 2:4 5 6 7 < SECOND ARGUMENT
> >
> > and the second slave:
> >
> > nsize, mytid: iargs 2 1 : 2
> > spray: 1 1:1 2 3 4 < FIRST ARGUMENT
> > spray: 1 2:4 5 6 7 < SECOND ARGUMENT
> >
> > In each case the arguments (2 in both cases) are the same.
> >
> > I have written a C version of the controlling program and everything works as expected, so I presume that I have either got the specification of the argument list wrong or I have discovered an error/bug. At the moment I am working on the former, but am at a loss to see what is wrong!
> >
> > Any help, pointers etc really appreciated.
> >
> >
> > Controlling program (that uses MPI_COMM_SPAWN_MULTIPLE) test_pbload.F
> >
> >       program main
> > c
> >       implicit none
> > #include "mpif.h"
> >
> >       integer error
> >       integer intercomm
> >       CHARACTER*25 commands(2), argvs(2, 2)
> >       integer nprocs(2),info(2),ncpus
> > c
> >       call mpi_init(error)
> > c
> >       ncpus = 2
> > c
> >       commands(1) = ' ./spray '
> >       nprocs(1) = 1
> >       info(1) = MPI_INFO_NULL
> >       argvs(1, 1) = ' 1 2 3 4 '
> >       argvs(1, 2) = ' '
> > c
> >       commands(2) = ' ./spray '
> >       nprocs(2) = 1
> >       info(2) = MPI_INFO_NULL
> >       argvs(2, 1) = ' 4 5 6 7 '
> >       argvs(2, 2) = ' '
> > c
> >       call mpi_comm_spawn_multiple( ncpus,
> >      1     commands, argvs, nprocs, info,
> >      2     0, MPI_COMM_WORLD, intercomm,
> >      3     MPI_ERRCODES_IGNORE, error )
> > c
> >       call mpi_finalize(error)
> > c
> >       end
> >
> > Slave program (started by the controlling program) spray.F
> >
> >       program main
> >       integer error
> >       integer pid
> >       character*20 line(100)
> >       call mpi_init(error)
> > c
> >       CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NSIZE,error)
> >       CALL MPI_COMM_RANK(MPI_COMM_WORLD,MYTID,error)
> > c
> >       iargs=iargc()
> >       write(*,*) 'nsize, mytid: iargs', nsize, mytid, ":", iargs
> > c
> >       if( iargs.gt.0 ) then
> >          do i = 1, iargs
> >             call getarg(i,line(i))
> >             write(*,'(1x,a,i3,20(i2,1h:,a))')
> >      1           'spray: ',mytid,i,line(i)
> >          enddo
> >       endif
> > c
> >       call mpi_finalize(error)
> > c
> >       end
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
   ----------------------------------------------------------
   Dr. A.J. Marquis       Tel:      +44 (0)20 7594 7040
   Dept. of Mech. Eng.    Fax:      +44 (0)20 7594 1472
   Imperial College
   Exhibition Road        E-Mail:   a.marquis_at_[hidden]
   London   SW7 2AZ
   BOFH: Maintence window broken
   All views expressed are my own !
   ----------------------------------------------------------