Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Simple MPI_Comm_spawn program hangs
From: Prakash Velayutham (prakash.velayutham_at_[hidden])
Date: 2007-12-06 00:08:56


Hi Edgar,

I changed the spawned program from /bin/hostname to a very simple MPI
program, shown below. Now the slave hangs right at the MPI_Init line.
What could the issue be?

slave.c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>     /* sleep, gethostname */
#include "mpi.h"
#include <sys/types.h>  /* standard system types */
#include <netinet/in.h> /* Internet address structures */
#include <sys/socket.h> /* socket interface functions */
#include <netdb.h>      /* host to IP resolution */

/* Set to nonzero from an attached debugger to let the process continue. */
volatile int gdb_var;

int
main(int argc, char **argv)
{
        int tag = 0;
        int my_rank;
        int num_proc;
        MPI_Status status;
        MPI_Comm inter_comm;
        char hostname[64];
        FILE *f;

        gdb_var = 0;

        /* Wait here until a debugger attaches and sets gdb_var. */
        while (0 == gdb_var) sleep(5);
        gethostname(hostname, 64);

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &num_proc);

        /* Intercommunicator to the parent that called MPI_Comm_spawn. */
        MPI_Comm_get_parent(&inter_comm);

        MPI_Finalize();
        return 0;
}
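
The wait loop on gdb_var in slave.c never returns unless a debugger
attaches and changes the variable, so a slave left alone will sit there
before it ever reaches MPI_Init. Below is a minimal sketch of a bounded
variant, assuming plain POSIX sleep() and getpid(). The helper name
wait_for_debugger and its two parameters are hypothetical and not part
of Open MPI or of the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Set to nonzero from an attached debugger to release the process.
 * volatile keeps the compiler from caching the value in a register. */
volatile int gdb_var = 0;

/* Hypothetical helper: poll gdb_var every poll_seconds seconds and give
 * up after max_polls polls. Returns 1 if gdb_var was set, 0 on timeout. */
int wait_for_debugger(unsigned poll_seconds, int max_polls)
{
    assert(max_polls >= 0);

    /* Print the pid so it is clear which process to attach to. */
    fprintf(stderr, "pid %ld waiting for debugger\n", (long)getpid());

    for (int polls = 0; polls < max_polls && gdb_var == 0; polls++)
        sleep(poll_seconds);

    return gdb_var != 0;
}
```

In the slave this could be called as wait_for_debugger(5, 12) just
before MPI_Init. To release it early, attach with "gdb -p <pid>", then
run "set var gdb_var = 1" and "continue".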

Thanks,
Prakash

On Dec 2, 2007, at 8:36 PM, Edgar Gabriel wrote:

> MPI_Comm_spawn is tested nightly by our test suites, so it should
> definitely work...
>
> Thanks
> Edgar
>
> Prakash Velayutham wrote:
>> Thanks Edgar. I did not know that. Really?
>>
>> Anyway, are you sure an MPI job will work as a spawned process
>> instead of "hostname"?
>>
>> Thanks,
>> Prakash
>>
>>
>> On Dec 1, 2007, at 5:56 PM, Edgar Gabriel wrote:
>>
>>> MPI_Comm_spawn has to build an intercommunicator with the child process
>>> that it spawns. Thus, you cannot spawn a non-MPI job such as
>>> /bin/hostname, since the parent process waits for some messages from the
>>> child process(es) in order to set up the intercommunicator.
>>>
>>> Thanks
>>> Edgar
>>>
>>> Prakash Velayutham wrote:
>>>> Hello,
>>>>
>>>> Open MPI 1.2.4
>>>>
>>>> I am trying to run a simple C program.
>>>>
>>>> ######################################################################################
>>>>
>>>> #include <string.h>
>>>> #include <stdlib.h>
>>>> #include <stdio.h>
>>>> #include "mpi.h"
>>>>
>>>> int
>>>> main(int argc, char **argv)
>>>> {
>>>>         int tag = 0;
>>>>         int my_rank;
>>>>         int num_proc;
>>>>         char message_0[] = "hello slave, i'm your master";
>>>>         char message_1[50];
>>>>         char master_data[] = "slaves to work";
>>>>         int array_of_errcodes[10];
>>>>         int num;
>>>>         MPI_Status status;
>>>>         MPI_Comm inter_comm;
>>>>         MPI_Info info;
>>>>         int arr[1];
>>>>         int rc1;
>>>>
>>>>         MPI_Init(&argc, &argv);
>>>>         MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
>>>>         MPI_Comm_size(MPI_COMM_WORLD, &num_proc);
>>>>
>>>>         printf("MASTER : spawning a slave ... \n");
>>>>         /* Spawn one child; arr receives its per-process error code. */
>>>>         rc1 = MPI_Comm_spawn("/bin/hostname", MPI_ARGV_NULL, 1,
>>>>                 MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter_comm, arr);
>>>>
>>>>         MPI_Finalize();
>>>>         return 0;
>>>> }
>>>>
>>>> ######################################################################################
>>>>
>>>>
>>>> This program hangs as below:
>>>>
>>>> prakash_at_bmi-xeon1-01:~/thesis/CS/Samples> ./master1
>>>> MASTER : spawning a slave ...
>>>> bmi-xeon1-01
>>>>
>>>> Any ideas why?
>>>>
>>>> Thanks,
>>>> Prakash
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> --
>>> Edgar Gabriel
>>> Assistant Professor
>>> Parallel Software Technologies Lab http://pstl.cs.uh.edu
>>> Department of Computer Science University of Houston
>>> Philip G. Hoffman Hall, Room 524 Houston, TX-77204, USA
>>> Tel: +1 (713) 743-3857 Fax: +1 (713) 743-3335