Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] avoid usage of ssh on local machine
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-11-14 12:07:04


Hello

By default, OMPI doesn't use ssh to launch a daemon on the same machine
as mpirun; instead, we fork/exec the orted locally.

The problem here is that OMPI doesn't realize that you are launching
on the local machine. This is usually caused by a mismatch between the
IP address that the hostname returned by gethostname resolves to and
the IP addresses actually configured on your machine.

Take a look at the output of ifconfig and see which addresses are
configured on your machine. Do any of them match the IP address OMPI is
trying to launch to (192.168.160.1 in your output)?
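
The same check can be scripted. The following is only a minimal sketch,
not something OMPI itself runs: it resolves the name returned by
gethostname and then tests whether each resulting address can be bound
on this machine, which is used here as a stand-in for reading ifconfig
by hand. The function names resolve_hostname, is_local_address and
report are made up for this illustration, and the sketch only looks at
IPv4.

```python
import socket


def resolve_hostname(name):
    """Return the sorted IPv4 addresses that `name` resolves to ([] if none)."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def is_local_address(addr):
    """Return True if a UDP socket can be bound to `addr`.

    A successful bind is taken to mean the address is assigned to an
    interface on this machine (assumption: the kernel is not configured
    to allow binding to non-local addresses).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        try:
            s.bind((addr, 0))  # port 0 lets the OS pick any free port
        except OSError:
            return False
    return True


def report():
    """Print the hostname, what it resolves to, and whether each address is local."""
    host = socket.gethostname()
    addrs = resolve_hostname(host)
    print("hostname:", host)
    if not addrs:
        print("  does not resolve to any IPv4 address")
    for addr in addrs:
        status = "local" if is_local_address(addr) else "NOT on this machine"
        print("  resolves to", addr, "->", status)


if __name__ == "__main__":
    report()
```

If an address is reported as not being on this machine, the hostname
is resolving to something other than your own interfaces, which would
be consistent with the launch being treated as remote.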

Ralph

On Nov 14, 2008, at 5:27 AM, Sun, Yongqi (E F ES EN 72) wrote:

> Hello,
>
> I have two questions about ssh and details follow.
>
> Questions:
>
> Is there any way to prevent the usage of ssh on my local desktop and
> launch locally by default? (The FAQ page says "Also note that if using
> a launcher that uses a hostfile and no hostfile is specified, all
> processes are launched on the local host." Unfortunately, this is not
> the case for me.)
>
> If ssh/rsh has to be used, can I redirect the host to the local
> machine? (I have tried to add "192.168.160.1" to /etc/hosts, but
> nothing changed.) I want to use OpenMPI in Eclipse, where the
> "--hostfile" option cannot be added to mpirun.
>
> Details:
>
> I'm using OpenMPI 1.2.8 on my Linux desktop (two quad-core AMD Opteron
> 2354). Although I always launch mpirun only on the local machine, ssh
> is used by default. For example:
> shell% cd [openmpi-1.2.8]/examples
>
> The code can be compiled (so IMHO the PATH and LD_LIBRARY_PATH are correct):
> shell% mpicc -o hello_c hello_c.c
>
> But when launched:
> shell% mpirun -np 2 hello_c
>
> There are runtime errors:
>
> ssh: connect to host 192.168.160.1 port 22: No route to host
> [W71c-140644:14261] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
> [W71c-140644:14261] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1158
> [W71c-140644:14261] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
> [W71c-140644:14261] ERROR: A daemon on node 192.168.160.1 failed to start as expected.
> [W71c-140644:14261] ERROR: There may be more information available from
> [W71c-140644:14261] ERROR: the remote shell (see above).
> [W71c-140644:14261] ERROR: The daemon exited unexpectedly with status 255.
> [W71c-140644:14261] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
> [W71c-140644:14261] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1190
> --------------------------------------------------------------------------
> mpirun was unable to cleanly terminate the daemons for this job.
> Returned value Timeout instead of ORTE_SUCCESS.
> --------------------------------------------------------------------------
> <<ompi-output.tar.gz>>
>
> However, I'm launching on my local desktop, where no "192.168.160.1"
> exists. I have to specify a hostfile to make it work as expected:
> shell% mpirun -np 2 --hostfile myhostfile hello_c
>
> where "myhostfile" contains the name of my local machine, "W71C-140644".
>
> Best wishes
>
> Sun, Yongqi