
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Fails to run "MPI_Comm_spawn" on remote host
From: Jaison Paul (jmulerik_at_[hidden])
Date: 2009-09-15 23:48:52


Hi Ralph,

Thank you so much for your reply. Your tips worked! The idea is to
declare the hosts first and then pick one of them with the 'host'
reserved key in MPI_Info. Great! Thanks a ton. I used the "-host"
option in mpirun like:

  "mpirun --prefix /opt/mpi/ompi-1.3.2/ -np 1 -host myhost1,myhost2 spawner"

and set the "MPI_Info" reserved key 'host' to select the remote host like:

   MPI_Info hostinfo;
   MPI_Info_create(&hostinfo);
   MPI_Info_set(hostinfo, "host", "myhost2");
   MPI_Info_set(hostinfo, "wdir", "/home/jaison/mpi/advanced_MPI/spawn/lib");

Now I can run child processes on the remote host, myhost2. I shall
also try the "add-hostfile" option.
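For anyone following the thread later, the working recipe can be collected
in one place. This is only a sketch built from the commands and hostnames
used in this thread (myhost1, myhost2, and the "spawner" binary are
specific to my setup):

```shell
# Declare every host the job may ever use in a hostfile up front
# (Ralph's solution 1; hostnames are the ones from this thread).
cat > my_hostfile <<'EOF'
myhost1
myhost2
EOF

# Launch the parent on one slot. Children spawned via MPI_Comm_spawn
# with the MPI_Info key host=myhost2 will then land on myhost2.
# (The mpirun command is shown as a comment only, since it needs a
# real Open MPI installation and the paths from this thread.)
#   mpirun --prefix /opt/mpi/ompi-1.3.2/ -np 1 --hostfile my_hostfile ./spawner
```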

By the way, the man page for MPI_Comm_spawn does not give information
as detailed as what you have just provided.

Jaison
http://cs.anu.edu.au/~Jaison.Mulerikkal/Home.html

On 16/09/2009, at 12:39 PM, Ralph Castain wrote:

> We don't support the ability to add a new host during a comm_spawn
> call in the 1.3 series. This is a feature that is being added for
> the upcoming new feature series release (tagged 1.5).
>
> There are two solutions to this problem in 1.3:
>
> 1. declare all hosts at the beginning of the job. You can then
> specify which one to use with the "host" key.
>
> 2. you -can- add a hostfile to the job during a comm_spawn. This is
> done with the "add-hostfile" key. All the hosts in the hostfile
> will be added to the job. You can then specify which host(s) to use
> for this particular comm_spawn with the "host" key.
>
> All of this is documented - you should see it with a "man
> MPI_Comm_spawn" command.
>
> If you need to dynamically add a host via "host" before then, you
> could try downloading a copy of the developer's trunk from the OMPI
> web site. It is implemented there at this time - and also
> documented via the man page.
>
> Ralph
>
>
> On Tue, Sep 15, 2009 at 5:14 PM, Jaison Paul
> <jmulerik_at_[hidden]> wrote:
> Hi All,
>
> I am still waiting on input for my query. I just wanted to know
> whether I can run dynamic child processes using 'MPI_Comm_spawn' on
> remote hosts (in Open MPI 1.3.2). Has anyone done that
> successfully? Or has Open MPI not implemented it yet?
>
> Please help.
>
> Jaison
> http://cs.anu.edu.au/~Jaison.Mulerikkal/Home.html
>
>
>
>
> On 14/09/2009, at 8:45 AM, Jaison Paul wrote:
>
>> Hi,
>>
>> I am trying to create a library using Open MPI for an SOA
>> middleware for my PhD research. "MPI_Comm_spawn" is the routine I
>> need. I got a sample example working, but only on the local host.
>> Whenever I try to run the spawned children on a remote host, the
>> parent cannot launch them there and I get the following error
>> message:
>>
>> ------------------BEGIN MPIRUN AND ERROR MSG------------------------
>> mpirun --prefix /opt/mpi/ompi-1.3.2/ --mca btl_tcp_if_include eth0
>> -np 1 /home/jaison/mpi/advanced_MPI/spawn/manager
>> Manager code started - host headnode -- myid & world_size 0 1
>> Host is: myhost
>> WorkDir is: /home/jaison/mpi/advanced_MPI/spawn/lib
>> ---------------------------------------------------------------------
>> -----
>> There are no allocated resources for the application
>> /home/jaison/mpi/advanced_MPI/spawn//lib
>> that match the requested mapping:
>>
>>
>> Verify that you have mapped the allocated resources properly using
>> the
>> --host or --hostfile specification.
>> ---------------------------------------------------------------------
>> -----
>> ---------------------------------------------------------------------
>> -----
>> A daemon (pid unknown) died unexpectedly on signal 1 while
>> attempting to
>> launch so we are aborting.
>>
>> There may be more information reported by the environment (see
>> above).
>>
>> This may be because the daemon was unable to find all the needed
>> shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to
>> have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> ---------------------------------------------------------------------
>> -----
>> mpirun: clean termination accomplished
>> --------------------------END OF ERROR MSG-----------------------------------
>>
>> I use the reserved keys - 'host' & 'wdir' - to set the remote host
>> and work directory using MPI_Info. Here is the code snippet:
>>
>> --------------------------BEGIN Code Snippet-----------------------------------
>> MPI_Info hostinfo;
>> MPI_Info_create(&hostinfo);
>> MPI_Info_set(hostinfo, "host", "myhost");
>> MPI_Info_set(hostinfo, "wdir", "/home/jaison/mpi/advanced_MPI/spawn/lib");
>>
>> // Checking for 'hostinfo'. The results are okay (see above)
>> int test0 = MPI_Info_get(hostinfo, "host", valuelen, value, &flag);
>> int test = MPI_Info_get(hostinfo, "wdir", valuelen, value1, &flag);
>> printf("Host is: %s\n", value);
>> printf("WorkDir is: %s\n", value1);
>>
>> sprintf( launched_program, "launched_program" );
>>
>> MPI_Comm_spawn( launched_program, MPI_ARGV_NULL , number_to_spawn,
>> hostinfo, 0, MPI_COMM_SELF, &everyone,
>> MPI_ERRCODES_IGNORE );
>>
>> --------------------------END OF Code Snippet-----------------------------------
>>
>> I've set the LD_LIBRARY_PATH correctly. Is "MPI_Comm_spawn"
>> implemented in Open MPI (I am using version 1.3.2)? If so, where
>> am I going wrong? Any input will be very much appreciated.
>>
>> Thanking you in advance.
>>
>> Jaison
>> jmulerik_at_[hidden]
>> http://cs.anu.edu.au/~Jaison.Mulerikkal/Home.html
>>
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>