On Mar 1, 2006, at 5:26 PM, Xiaoning (David) Yang wrote:
> I installed Open MPI 1.0.1 on two Mac G5s (one with two cpus and the
> other with 4 cpus). I set up ssh on both machines according to the
> FAQ. My mpi jobs work fine if I run the jobs on only one computer.
> But when I ran a job across the two Macs from the first Mac mac1, I
> got:
> mac1: mpirun -np 6 --hostfiles /Users/me/my_hosts hello_world
> tcsh: orted: Command not found.
> [mac1:01019] ERROR: A daemon on node mac2 failed to start as expected.
> [mac1:01019] ERROR: There may be more information available from
> [mac1:01019] ERROR: the remote shell (see above).
> [mac1:01019] ERROR: The daemon exited unexpectedly with status 1.
> 2 processes killed (possibly by Open MPI)
> File my_hosts looks like
> mac1 slots=2
> mac2 slots=4
> The orted is definitely on my path on both machines. Any idea? Help is
> greatly appreciated!
I'm guessing that the issue is with your shell configuration. mpirun
starts the orted on the remote node through rsh/ssh, which will start
a non-login shell on the remote node. Unfortunately, the set of
dotfiles evaluated when starting a non-login shell is different from
the set evaluated when starting a login shell, so a path that is set
only in a login dotfile will not include orted there. The easiest way
to tell if this is the issue is to check whether orted is in your path
when started in a non-login shell with a command like:
ssh remote_host which orted
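
If it helps to see the effect in isolation, here is a minimal local
sketch in POSIX sh. It does not use ssh or Open MPI at all: the
temporary directories and the stand-in "orted" script are hypothetical
and exist only to show that a command resolves or not depending on
which PATH the shell was given.

```shell
#!/bin/sh
# Local simulation only: no ssh, no Open MPI. The directories and the
# stand-in "orted" script below are hypothetical.
demo=$(mktemp -d)
mkdir "$demo/system_bin" "$demo/ompi_bin"
printf '#!/bin/sh\necho "stand-in orted"\n' > "$demo/ompi_bin/orted"
chmod +x "$demo/ompi_bin/orted"

# lookup PATH_VALUE NAME: report whether NAME resolves on the given PATH.
# The subshell keeps the PATH change from leaking into the caller.
lookup() {
    if ( PATH="$1"; command -v "$2" >/dev/null 2>&1 ); then
        echo found
    else
        echo missing
    fi
}

# PATH as a non-login shell might see it (Open MPI directory not added).
nonlogin_result=$(lookup "$demo/system_bin" orted)
# PATH after a login-only dotfile has prepended the Open MPI directory.
login_result=$(lookup "$demo/ompi_bin:$demo/system_bin" orted)

echo "non-login PATH: orted $nonlogin_result"
echo "login PATH:     orted $login_result"
rm -rf "$demo"
```

With tcsh, ~/.tcshrc (or ~/.cshrc) is read by non-login shells as
well, while ~/.login is read only by login shells, so a path setting
that lives only in ~/.login would be one way to end up with this
symptom.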
More information on how to configure your particular shell for use
with Open MPI can be found in our FAQ at:
Hope this helps,
Open MPI developer