Open MPI User's Mailing List Archives

Subject: [OMPI users] orted runs on host, but doesn't seem to communicate
From: jody (jody.xha_at_[hidden])
Date: 2008-06-11 10:13:14


Hi,
Since I upgraded from open-mpi .2.2 to open-mpi 1.2.5 and
had to reinstall my machine aim-plankton (Fedora 8 instead of Fedora 6),
open-mpi doesn't work correctly anymore:

When I start an application from aim-plankton to run on another machine,
it seems to hang (no output, not even from --debug-daemons):
[jody@aim-plankton ~] $ mpirun -np 1 --debug-daemons --host aim-nano1.uzh.ch MPITest

However, this action causes the creation of an orted process on the
other machine:
[jody@aim-nano1 ~] $ ps ax | grep orted
 7680 ? Ss 0:00 /opt/openmpi/bin/orted --bootproxy 1 --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename aim-nano1.uzh.ch --universe jody_at_[hidden]:default-universe-9422 --nsreplica 0.0.0;tcp://130.60.126.111:60229 --gprreplica 0.0.0;tcp://130.60.126.111:60229
 7772 pts/0 S+ 0:00 grep --colour=auto orted
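
The orted command line above carries a tcp:// address in its --nsreplica and --gprreplica arguments, which I assume is the address on aim-plankton that the remote daemon is told to contact. One check that could be run on aim-nano1 while mpirun is hanging is whether that address is reachable at all. The following is only a minimal sketch of such a check; the function names are my own, and I am assuming the port is chosen anew on each run, so it has to be read from the current orted command line:

```python
import re
import socket


def parse_replica_uri(cmdline):
    """Return (host, port) from the first --nsreplica tcp:// URI in an orted command line."""
    # Example argument: --nsreplica 0.0.0;tcp://130.60.126.111:60229
    match = re.search(r"--nsreplica\s+\S*?tcp://([0-9.]+):(\d+)", cmdline)
    if match is None:
        raise ValueError("no --nsreplica tcp:// URI found")
    return match.group(1), int(match.group(2))


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling can_connect(*parse_replica_uri(line)) with the orted line from ps as line would report whether a TCP connection back to aim-plankton can be opened. A False result would point at something between the two machines (for instance packet filtering on the reinstalled host) rather than at open-mpi itself, but that is a guess and not something the output above proves.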

The other way round, it works without problems:
[jody@aim-nano1 ~] $ mpirun -np 1 --debug-daemons --host aim-plankton MPITest
Daemon [0,0,1] checking in as pid 9759 on host aim-plankton
[aim-plankton.uzh.ch:09759] [0,0,1] orted: received launch callback
[aim-plankton.uzh.ch]I am #0/1 global
[aim-plankton.uzh.ch:09759] [0,0,1] orted_recv_pls: received message from [0,0,0]
[aim-plankton.uzh.ch:09759] [0,0,1] orted_recv_pls: received exit

ssh works perfectly in both directions, with or without login.

My open-mpi setup worked well before. The changes since then were
an upgrade to open-mpi 1.2.5 on all machines and a change from
Fedora 6 to Fedora 8 on aim-plankton, the badly behaving machine.

Does anybody have a suggestion for where I should start looking?

Thank you
  Jody