Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Fails to run "MPI_Comm_spawn" on remote host
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-09-16 08:47:25


Good to hear! I'll update the man page as it should have included that info.

On Tue, Sep 15, 2009 at 9:48 PM, Jaison Paul <jmulerik_at_[hidden]> wrote:

> Hi Ralph,
>
> Thank you so much for your reply. Your tips worked! The idea is to declare
> the hosts first and then pick one using the reserved 'host' key in MPI_Info.
> Great! Thanks a ton. I used the "-host" option of mpirun like this:
>
> "mpirun --prefix /opt/mpi/ompi-1.3.2/ -np 1 -host myhost1,myhost2 spawner"
>
> and set the reserved 'host' key in MPI_Info to select the remote host:
>
> MPI_Info hostinfo;
> MPI_Info_create(&hostinfo);
> MPI_Info_set(hostinfo, "host", "myhost2");
> MPI_Info_set(hostinfo, "wdir", "/home/jaison/mpi/advanced_MPI/spawn/lib");
>
> Now I can run child processes on the remote host, myhost2. I will also try
> the "add-hostfile" option.
>
> Btw, the man page of MPI_Comm_spawn does not give detailed information as
> you have just done.
>
> Jaison
> <http://cs.anu.edu.au/~Jaison.Mulerikkal/Home.html>
>
>
>
>
> On 16/09/2009, at 12:39 PM, Ralph Castain wrote:
>
> We don't support the ability to add a new host during a comm_spawn call in
> the 1.3 series. This is a feature that is being added for the upcoming new
> feature series release (tagged 1.5).
>
> There are two solutions to this problem in 1.3:
>
> 1. Declare all hosts at the beginning of the job. You can then specify
> which one to use with the "host" key.
>
> 2. You -can- add a hostfile to the job during a comm_spawn. This is done
> with the "add-hostfile" key. All the hosts in the hostfile will be added to
> the job. You can then specify which host(s) to use for this particular
> comm_spawn with the "host" key.
>
> All of this is documented - you should see it with a "man MPI_Comm_spawn"
> command.
>
> If you need to dynamically add a host via "host" before then, you could try
> downloading a copy of the developer's trunk from the OMPI web site. It is
> implemented there at this time - and also documented via the man page.
>
> Ralph
>
>
> On Tue, Sep 15, 2009 at 5:14 PM, Jaison Paul <jmulerik_at_[hidden]> wrote:
>
>> Hi All,
>> I am waiting on some input on my query. I just wanted to know whether I
>> can run dynamic child processes using 'MPI_Comm_spawn' on remote hosts (in
>> Open MPI 1.3.2). Has anyone done that successfully? Or has Open MPI not
>> implemented it yet?
>>
>> Please help.
>>
>> Jaison
>>
>> <http://cs.anu.edu.au/~Jaison.Mulerikkal/Home.html>
>>
>>
>>
>>
>> On 14/09/2009, at 8:45 AM, Jaison Paul wrote:
>>
>> Hi,
>>
>> I am trying to create a library using Open MPI for an SOA middleware for my
>> PhD research. "MPI_Comm_spawn" is the one I need to go for. I got a
>> sample example working, but only on the local host. Whenever I try to run
>> the spawned children on a remote host, the parent cannot launch them and I
>> get the following error message:
>>
>> ------------------BEGIN MPIRUN AND ERROR MSG------------------------
>> mpirun --prefix /opt/mpi/ompi-1.3.2/ --mca btl_tcp_if_include eth0 -np 1
>> /home/jaison/mpi/advanced_MPI/spawn/manager
>> Manager code started - host headnode -- myid & world_size 0 1
>> Host is: myhost
>> WorkDir is: /home/jaison/mpi/advanced_MPI/spawn/lib
>> --------------------------------------------------------------------------
>> There are no allocated resources for the application
>> /home/jaison/mpi/advanced_MPI/spawn//lib
>> that match the requested mapping:
>>
>>
>> Verify that you have mapped the allocated resources properly using the
>> --host or --hostfile specification.
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
>> launch so we are aborting.
>>
>> There may be more information reported by the environment (see above).
>>
>> This may be because the daemon was unable to find all the needed shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> --------------------------------------------------------------------------
>> mpirun: clean termination accomplished
>> --------------------------END OF ERROR MSG-----------------------------------
>>
>> I use the reserved keys - 'host' & 'wdir' - to set the remote host and
>> work directory using MPI_Info. Here is the code snippet:
>>
>> --------------------------BEGIN Code Snippet-----------------------------------
>> MPI_Info hostinfo;
>> MPI_Info_create(&hostinfo);
>> MPI_Info_set(hostinfo, "host", "myhost");
>> MPI_Info_set(hostinfo, "wdir", "/home/jaison/mpi/advanced_MPI/spawn/lib");
>>
>> // Checking for 'hostinfo'. The results are okay (see above)
>> int test0 = MPI_Info_get(hostinfo, "host", valuelen, value, &flag);
>> int test = MPI_Info_get(hostinfo, "wdir", valuelen, value1, &flag);
>> printf("Host is: %s\n", value);
>> printf("WorkDir is: %s\n", value1);
>>
>> sprintf( launched_program, "launched_program" );
>>
>> MPI_Comm_spawn( launched_program, MPI_ARGV_NULL , number_to_spawn,
>> hostinfo, 0, MPI_COMM_SELF, &everyone,
>> MPI_ERRCODES_IGNORE );
>>
>> --------------------------END OF Code Snippet-----------------------------------
>>
>> I've set the LD_LIBRARY_PATH correctly. Is "MPI_Comm_spawn" implemented in
>> Open MPI (I am using version 1.3.2)? If so, where am I going wrong? Any
>> input will be very much appreciated.
>>
>> Thanking you in advance.
>>
>> Jaison
>> jmulerik_at_[hidden]
>>
>> <http://cs.anu.edu.au/~Jaison.Mulerikkal/Home.html>
>>
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>>
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>