Open MPI User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-03-02 00:24:27


On Mar 1, 2006, at 5:26 PM, Xiaoning (David) Yang wrote:

> I installed Open MPI 1.0.1 on two Mac G5s (one with two cpus and the
> other with 4 cpus.). I set up ssh on both machines according to the
> FAQ. My mpi jobs work fine if I run the jobs on only one computer. But
> when I ran a job across the two Macs from the first Mac mac1, I got:
>
> mac1: mpirun -np 6 --hostfiles /Users/me/my_hosts hello_world
> tcsh: orted: Command not found.
> [mac1:01019] ERROR: A daemon on node mac2 failed to start as expected.
> [mac1:01019] ERROR: There may be more information available from
> [mac1:01019] ERROR: the remote shell (see above).
> [mac1:01019] ERROR: The daemon exited unexpectedly with status 1.
> 2 processes killed (possibly by Open MPI)
>
> File my_hosts looks like
>
> mac1 slots=2
> mac2 slots=4
>
> The orted is definitely on my path on both machines. Any idea? Help is
> greatly appreciated!

I'm guessing that the issue is with your shell configuration. mpirun
starts the orted on the remote node through rsh/ssh, which runs the
command in a non-login shell on that node. Unfortunately, the set of
dotfiles evaluated when starting a non-login shell is different from
the set evaluated when starting a login shell, so a PATH setting that
works when you log in interactively may never be applied. The easiest
way to tell if this is the issue is to check whether orted is in your
path when started in a non-login shell with a command like:

   ssh remote_host which orted
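
If you want to repeat that check for every host in your hostfile, a
small wrapper along these lines may help. This is only a rough sketch:
the function name check_remote_cmd and the REMOTE_SH variable are made
up for illustration and are not part of Open MPI. It assumes that
"which" on the remote side prints an absolute path when the command is
found.

```shell
#!/bin/sh
# Sketch only: check_remote_cmd and REMOTE_SH are hypothetical names,
# not anything shipped with Open MPI.

# Assumption: ssh is the remote shell; set REMOTE_SH=rsh to use rsh.
REMOTE_SH=${REMOTE_SH:-ssh}

check_remote_cmd() {
    # $1 = host name, $2 = command to look up in the remote PATH
    found=$($REMOTE_SH "$1" which "$2" 2>/dev/null)
    case "$found" in
        # An absolute path means the non-login shell can see the command.
        /*) echo "$1: $2 found at $found" ;;
        # Anything else (empty output, "Command not found.") is a miss.
        *)  echo "$1: $2 not found in the non-login shell's PATH"
            return 1 ;;
    esac
}

# Example use, with the host names from the original report:
#   check_remote_cmd mac1 orted
#   check_remote_cmd mac2 orted
```

Since the error above came from tcsh, it is worth knowing that tcsh
reads ~/.login only for login shells, while ~/.cshrc (or ~/.tcshrc) is
read by every tcsh instance. If the check fails on mac2, a PATH
addition that lives only in ~/.login is a likely cause.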

More information on how to configure your particular shell for use
with Open MPI can be found in our FAQ at:

   http://www.open-mpi.org/faq/?category=running

Hope this helps,

Brian

-- 
   Brian Barrett
   Open MPI developer
   http://www.open-mpi.org/