Open MPI User's Mailing List Archives

From: Jorge Parra (jeparra_at_[hidden])
Date: 2007-10-29 13:27:43


When running Open MPI, my system freezes while initializing MPI (in the
MPI_Init function). This happens only when I try to run the process on
multiple nodes in my cluster. Running multiple instances of the test code
locally (e.g. ./mpirun -np 2 greetings) is successful.

- rsh runs well and is configured for full access (e.g. rsh
" date" is successful, as are "rsh AFRLMPPBM2 date" and
"rsh"). Security is not an issue on this system.

- uname -n and hostname return a valid hostname

- The test code (attached to this email) is run (and fails) as:
./mpirun --hostfile /root/hostfile -np 2 greetings. The hostfile lists the
names of the local node (first entry: AFRLMPPBM1) and the remote node
(second entry: AFRLMPPBM2). This file is also attached to this email.

- The environment variables seem to be properly set (see the attached
env.log file). Local MPI programs (e.g. ./mpirun -np 2 greetings) run well.

- .profile has the path information for both the executables and the

- orted runs on the remote node; however, it does not print anything to the
console. The only output on the remote node is:

pam_rhosts_auth[235]: user root has a `+' user entry
pam_rhosts_auth[235]: allowed to root_at_[hidden] as root
PAM_unix[235]: (rsh) session opened for user root by (uid=0)
in.rshd[236]: root_at_[hidden] as root: cmd='( ! [ -e ./.profile ]
|| . ./.profile; orted --bootproxy 1 --name 0.0.1 --num_procs 3
--vpid_start 0 --nodename --universe
04 --nsreplica "0.0.0;tcp://" --gprreplica
68.1.102:32824" --mpi-call-yield 0 )'
PAM_unix[235]: (rsh) session closed for user root

The remote session then returns to the command prompt, but orted keeps
running in the background. The local process is frozen and prints only
"Calling init", which is output just before the call to MPI_Init (see
greetings.c).

I believe MPI_COMM_WORLD cannot be correctly initialized. However, I can't
see which part of my configuration is wrong.
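
For reference, the per-node checks described above (rsh access and a valid
"uname -n" on each host) can be sketched as a small shell function. This is
only an illustration: the function name check_hosts and its optional second
argument are hypothetical and not part of Open MPI, and it assumes a hostfile
with one host name at the start of each line.

```shell
# check_hosts HOSTFILE [REMOTE_CMD]
# Hypothetical helper (not part of Open MPI): for every host listed in
# HOSTFILE, run "uname -n" through REMOTE_CMD (default: rsh) and report
# whether the command succeeded and which name the node reported.
check_hosts() {
    hostfile=$1
    remote=${2:-rsh}
    # Only the first field of each line is used; anything after it is ignored.
    while read -r host _; do
        # Skip blank lines and comment lines.
        case $host in ''|'#'*) continue ;; esac
        # stdin is redirected so the remote command cannot swallow the
        # remaining hostfile lines that the loop still has to read.
        if name=$("$remote" "$host" uname -n </dev/null 2>/dev/null); then
            echo "$host: ok (reports $name)"
        else
            echo "$host: FAILED"
        fi
    done < "$hostfile"
}
```

With the hostfile used here, it would be invoked as "check_hosts /root/hostfile".
It only confirms that commands can be launched on each node; it says nothing
about whether the remote orted can connect back to the node running mpirun.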

Any help is greatly appreciated.

Thank you,