Hi all,
I am a first-time user of Open MPI; I have used MPICH before. I found there
are many differences between them, so I am confused.
I built Open MPI on a PS3 using the default options, that is:
$ ./configure --prefix=
$ make all install
I modified my .bash_profile file to add the Open MPI lib and executable
directories to LD_LIBRARY_PATH and PATH.
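A minimal sketch of the kind of lines this means, assuming a hypothetical install prefix of $HOME/openmpi (the actual --prefix value is not shown in this post, so OMPI_PREFIX here is only a placeholder):

```shell
# Hypothetical install prefix; replace with the real value given to --prefix.
OMPI_PREFIX="${OMPI_PREFIX:-$HOME/openmpi}"

# Put the Open MPI executables (mpiexec, orted, ...) first on PATH.
export PATH="$OMPI_PREFIX/bin:$PATH"

# Add the Open MPI libraries, avoiding a trailing ':' if the variable was empty.
export LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```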
I use an NFS file system between the server and the nodes, so I installed
Open MPI only on the server.
I checked the mailing list and the FAQ and know that the default launcher is
ssh, but I still added "pls_rsh_agent = ssh" to openmpi-mca-params.conf.
I tested the hello_c.c example. When I run:
$ mpiexec -host ps3-2 -n 4 ./hello
it runs correctly (ps3-2 is the hostname of the server). I tried this on
each node.
But when I run:
$ mpiexec -hostfile host.txt -n 4 ./hello
with host.txt containing:
ps3-1
ps3-2
I get this error message:
bash: orted: command not found
[ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1164
[ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[ps3-1:25154] ERROR: A daemon on node ps3-2 failed to start as expected.
[ps3-1:25154] ERROR: There may be more information available from
[ps3-1:25154] ERROR: the remote shell (see above).
[ps3-1:25154] ERROR: The daemon exited unexpectedly with status 127.
[ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1196
--------------------------------------------------------------------------
mpiexec was unable to cleanly terminate the daemons for this job.
Returned value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
I searched the mailing list and the FAQ for the same problem. They say that
PATH and LD_LIBRARY_PATH are not set correctly, but I have made sure both
are set in my environment.
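One way to double-check this might be to ask a shell started over ssh whether it can find orted, since exit status 127 is what a shell returns when a command is not found. The helper below is only a sketch: check_cmd is a hypothetical name (not part of Open MPI), and the ssh example assumes passwordless login to the node.

```shell
# Report whether a command is visible to a shell started through a launcher.
# Usage: check_cmd <command> <launcher and its arguments...>
check_cmd() {
    cmd="$1"
    shift
    # The launcher receives one string to execute, e.g. ssh HOST "command -v orted".
    if "$@" "command -v $cmd" >/dev/null 2>&1; then
        echo "$cmd: found"
    else
        echo "$cmd: NOT found"
        return 1
    fi
}

# Over ssh, using a host name from host.txt:
#   check_cmd orted ssh ps3-2
# Local dry run that needs no ssh:
#   check_cmd sh sh -c
```

Note that bash reads .bash_profile only for login shells, so a PATH set there is not necessarily what a command launched through ssh will see.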
This is my first time using Open MPI, so I hope somebody can help me. Thanks
a lot!
--
Li, Chanjuan Lanzhou University
Distributed & Embedded System Lab http://dslab.lzu.edu.cn
School of Information Science and Engineering lichanjuan04_at_[hidden]
Tianshui South Road 222. Lanzhou 730000 .P.R.China
Tel:+86-931-8912025 Fax:+86-931-8912022