Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with mpi_comm_spawn_multiple
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-05-05 15:41:42


Ah, missed that - afraid I no speakee fortran any more (thankfully got to remove that module from my brain 20+ years ago).

On May 5, 2010, at 1:18 PM, Andrew J Marquis wrote:

> Dear Ralph,
>
> thanks for that. I have done much the same (as I indicated in my original post). In this case my C program correctly spawned the slaves and the slaves printed the correctly passed argument lists. Running this with my Fortran slave, I get:
>
> nsize, mytid: iargs 2 0 : 1
> spray: 0 1:1 2 3 4
>
> nsize, mytid: iargs 2 1 : 1
> spray: 1 1:5 6 7 8
>
>
> which is what I expect.
>
> I still think the error may well be mine rather than OMPI's, but I am at a loss to see what is going on!
>
> Thanks for the help so far,
>
> Fred Marquis.
>
>
> c-program
> =========
> #include "mpi.h"
> #include <stdio.h>
> #include <stdlib.h>
>
> int main( int argc, char *argv[] )
> {
>     int np[2] = { 1, 1 };
>     int errcodes[2];
>     char *cmds[2] = { "./spray", "./spray" };
>     char *args[2] = { "1 2 3 4", "5 6 7 8" };
>     char **array_of_argv[2];
>     char *argv0[] = {"1 2 3 4", (char *)0};
>     char *argv1[] = {"5 6 7 8", (char *)0};
>     array_of_argv[0] = argv0;
>     array_of_argv[1] = argv1;
>
>     MPI_Comm parentcomm, intercomm;
>     MPI_Info infos[2] = { MPI_INFO_NULL, MPI_INFO_NULL };
>
>     MPI_Init( &argc, &argv );
>     MPI_Comm_spawn_multiple( 2, cmds, array_of_argv, np, infos,
>                              0, MPI_COMM_WORLD, &intercomm, errcodes );
>     MPI_Finalize();
>     return 0;
> }
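
The C program above passes "1 2 3 4" as a single argv element, so each child sees exactly one argument (iargs = 1 in the output), which is what the post says is expected. If four separate arguments were wanted instead, each token would need its own element in the array. A minimal sketch of that splitting, assuming blank-separated tokens; split_args is a hypothetical helper name, not part of MPI or Open MPI:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split a blank-separated string into a
 * NULL-terminated, argv-style array with one token per element.
 * The caller frees each element and then the array itself.
 */
static char **split_args(const char *s)
{
    /* At most ceil(len/2) tokens fit in the string, plus the NULL. */
    size_t cap = strlen(s) / 2 + 2;
    char **argv = calloc(cap, sizeof *argv);
    assert(argv != NULL);

    size_t n = 0;
    while (*s != '\0') {
        while (*s == ' ')                 /* skip separating blanks */
            s++;
        if (*s == '\0')
            break;
        const char *start = s;
        while (*s != '\0' && *s != ' ')   /* scan to end of token */
            s++;
        size_t len = (size_t)(s - start);
        argv[n] = malloc(len + 1);
        assert(argv[n] != NULL);
        memcpy(argv[n], start, len);
        argv[n][len] = '\0';
        n++;
    }
    return argv;                          /* argv[n] is NULL via calloc */
}
```

With this, split_args("1 2 3 4") could stand in for argv0, giving the child four arguments rather than one.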
>
> On Wed, May 05, 2010 at 07:47:20PM +0100, Ralph Castain wrote:
>> I think OMPI is okay - here is a C sample program and the associated output:
>>
>> $ mpirun -np 3 ./spawn_multiple
>> Parent [pid 98895] about to spawn!
>> Parent [pid 98896] about to spawn!
>> Parent [pid 98897] about to spawn!
>> Parent done with spawn
>> Parent sending message to children
>> Parent done with spawn
>> Parent done with spawn
>> Hello from the child 0 of 2 on host Ralph pid 98898: argv[1] = foo
>> Child 0 received msg: 38
>> Hello from the child 1 of 2 on host Ralph pid 98899: argv[1] = bar
>> Parent disconnected
>> Parent disconnected
>> Child 1 disconnected
>> Child 0 disconnected
>> Parent disconnected
>>
>
>
>>
>>
>> On May 5, 2010, at 12:08 PM, Fred Marquis wrote:
>>
>>> Hi,
>>>
>>> I am using mpi_comm_spawn_multiple to spawn multiple commands with argument lists. I am trying to do this in fortran (77) using version openmpi-1.4.1 and the ifort compiler v9.0. The operating system is SuSE Linux 10.1 (x86-64).
>>>
>>> I have put together a simple controlling example program (test_pbload.F) and an example slave program (spray.F) to try and explain my problem.
>>>
>>> In the controlling program mpi_comm_spawn_multiple is used to set 2 copies of the slave running. The first is started with the argument list "1 2 3 4" and the second with "4 5 6 7".
>>>
>>> The slaves are started OK and the slaves print out the argument lists and exit. In addition the slaves print out their rank numbers so I can see which argument list belongs to which slave.
>>>
>>> What I am finding is that the argument lists are not being sent to the slaves correctly; indeed, both slaves seem to be getting both argument lists!
>>>
>>> To compile and run the programs I follow the steps below.
>>>
>>> Controlling program "test_pbload.F"
>>>
>>> mpif77 -o test_pbload test_pbload.F
>>>
>>> Slave program "spray.F"
>>>
>>> mpif77 -o spray spray.F
>>>
>>> Run the controller
>>>
>>> mpirun -np 1 test_pbload
>>>
>>>
>>>
>>>
>>> The output from the first slave is:
>>>
>>> nsize, mytid: iargs 2 0 : 2
>>> spray: 0 1:1 2 3 4 < FIRST ARGUMENT
>>> spray: 0 2:4 5 6 7 < SECOND ARGUMENT
>>>
>>> and the second slave:
>>>
>>> nsize, mytid: iargs 2 1 : 2
>>> spray: 1 1:1 2 3 4 < FIRST ARGUMENT
>>> spray: 1 2:4 5 6 7 < SECOND ARGUMENT
>>>
>>> In each case the arguments (2 in both cases) are the same.
>>>
>>> I have written a C version of the controlling program and everything works as expected, so I presume that I have either got the specification of the argument list wrong or I have discovered an error/bug. At the moment I am working on the former, but am at a loss to see what is wrong!
>>>
>>> Any help, pointers etc really appreciated.
>>>
>>>
>>> Controlling program (that uses MPI_COMM_SPAWN_MULTIPLE) test_pbload.F
>>>
>>>       program main
>>> c
>>>       implicit none
>>> #include "mpif.h"
>>>
>>>       integer error
>>>       integer intercomm
>>>       CHARACTER*25 commands(2), argvs(2, 2)
>>>       integer nprocs(2),info(2),ncpus
>>> c
>>>       call mpi_init(error)
>>> c
>>>       ncpus = 2
>>> c
>>>       commands(1) = ' ./spray '
>>>       nprocs(1) = 1
>>>       info(1) = MPI_INFO_NULL
>>>       argvs(1, 1) = ' 1 2 3 4 '
>>>       argvs(1, 2) = ' '
>>> c
>>>       commands(2) = ' ./spray '
>>>       nprocs(2) = 1
>>>       info(2) = MPI_INFO_NULL
>>>       argvs(2, 1) = ' 4 5 6 7 '
>>>       argvs(2, 2) = ' '
>>> c
>>>       call mpi_comm_spawn_multiple( ncpus,
>>>      1 commands, argvs, nprocs, info,
>>>      2 0, MPI_COMM_WORLD, intercomm,
>>>      3 MPI_ERRCODES_IGNORE, error )
>>> c
>>>       call mpi_finalize(error)
>>> c
>>>       end
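
For reference, the MPI standard describes the Fortran array_of_argv argument as a two-dimensional array of strings in which element (i, j) is the j-th argument to command i, with each command's list ended by an all-blank string. Because Fortran stores arrays column-major, argvs(1, 1) and argvs(2, 1) sit next to each other in memory. The C sketch below makes that layout concrete. It is not Open MPI's code: fstr_to_cstr and fortran_argv_for are hypothetical helper names, and stripping only trailing blanks is a choice made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not Open MPI code): copy one blank-padded Fortran
 * CHARACTER field of length `len` into a freshly allocated, NUL-terminated
 * C string. Only trailing blanks are dropped (an assumption of this sketch).
 */
static char *fstr_to_cstr(const char *field, size_t len)
{
    size_t end = len;
    while (end > 0 && field[end - 1] == ' ')
        end--;
    char *out = malloc(end + 1);
    assert(out != NULL);
    memcpy(out, field, end);
    out[end] = '\0';
    return out;
}

/*
 * Hypothetical helper: build the NULL-terminated argv for command `cmd`
 * (0-based) from a Fortran array declared as
 *     CHARACTER*len argvs(count, maxargs)
 * Column-major storage puts 1-based element (i, j) at byte offset
 *     ((j-1)*count + (i-1)) * len
 * A command's list ends at its first all-blank element.
 */
static char **fortran_argv_for(const char *argvs, int count, int maxargs,
                               size_t len, int cmd)
{
    char **argv = calloc((size_t)maxargs + 1, sizeof *argv);
    assert(argv != NULL);

    int n = 0;
    for (int j = 0; j < maxargs; j++) {
        const char *field = argvs + ((size_t)j * (size_t)count + (size_t)cmd) * len;
        char *arg = fstr_to_cstr(field, len);
        if (arg[0] == '\0') {      /* all-blank string terminates the list */
            free(arg);
            break;
        }
        argv[n++] = arg;
    }
    return argv;                   /* argv[n] is NULL thanks to calloc */
}
```

With the values from test_pbload.F (count = 2, len = 25, leading blanks omitted), command 0 maps to the single argument "1 2 3 4" and command 1 to "4 5 6 7".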
>>>
>>> Slave program (started by the controlling program) spray.F
>>>
>>>       program main
>>> c     mpif.h defines MPI_COMM_WORLD; without it the name is an
>>> c     implicitly typed, uninitialized variable.
>>> #include "mpif.h"
>>>       integer error
>>>       integer pid
>>>       character*20 line(100)
>>>       call mpi_init(error)
>>> c
>>>       CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NSIZE,error)
>>>       CALL MPI_COMM_RANK(MPI_COMM_WORLD,MYTID,error)
>>> c
>>>       iargs=iargc()
>>>       write(*,*) 'nsize, mytid: iargs', nsize, mytid, ":", iargs
>>> c
>>>       if( iargs.gt.0 ) then
>>>         do i = 1, iargs
>>>           call getarg(i,line(i))
>>>           write(*,'(1x,a,i3,20(i2,1h:,a))')
>>>      1      'spray: ',mytid,i,line(i)
>>>         enddo
>>>       endif
>>> c
>>>       call mpi_finalize(error)
>>> c
>>>       end
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
> --
> ----------------------------------------------------------
> Dr. A.J. Marquis Tel: +44 (0)20 7594 7040
> Dept. of Mech. Eng. Fax: +44 (0)20 7594 1472
> Imperial College
> Exhibition Road E-Mail: a.marquis_at_[hidden]
> London SW7 2AZ
>
> BOFH: Maintenance window broken
>
> All views expressed are my own !
> ----------------------------------------------------------