Open MPI User's Mailing List Archives

Subject: [OMPI users] orted runs on host, but doesn't seem to communicate
From: jody (jody.xha_at_[hidden])
Date: 2008-06-17 10:44:07


Hi,
Since I upgraded from open-mpi .2.2 to open-mpi 1.2.5 and
had to reinstall my machine aim-plankton (Fedora 8 instead of Fedora 6),
open-mpi no longer works correctly:

When I start an application from aim-plankton to run on another machine,
it seems to hang (no output, not even from --debug-daemons):
[jody_at_aim-plankton ~] $ mpirun -np 1 --debug-daemons --host aim-nano1.uzh.ch MPITest

However, this action causes the creation of an orted process on the
other machine:
[jody_at_aim-nano1 ~] $ ps ax | grep orted
 7680 ? Ss 0:00 /opt/openmpi/bin/orted --bootproxy 1 --name
0.0.1 --num_procs 2 --vpid_start 0 --nodename aim-nano1.uzh.ch
--universe jody_at_[hidden]:default-universe-9422 --nsreplica
0.0.0;tcp://130.60.126.111:60229 --gprreplica
0.0.0;tcp://130.60.126.111:60229
 7772 pts/0 S+ 0:00 grep --colour=auto orted
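
The orted command line above carries a tcp:// URI (after --nsreplica and
--gprreplica), which appears to be the address orted is expected to contact
on aim-plankton. One check that might narrow things down, assuming the hang
is a connectivity problem (an assumption; nothing above confirms it), is to
try a plain TCP connection from aim-nano1 to that address while mpirun is
still hanging. A minimal Python sketch; the function names are hypothetical
helpers, not part of Open MPI:

```python
import re
import socket


def replica_endpoints(orted_cmdline):
    """Return the unique (host, port) pairs found in tcp://host:port URIs.

    Only dotted IPv4 addresses are matched, as in the ps output above.
    """
    seen = []
    for host, port in re.findall(r"tcp://([0-9.]+):(\d+)", orted_cmdline):
        endpoint = (host, int(port))
        if endpoint not in seen:
            seen.append(endpoint)
    return seen


def can_connect(host, port, timeout=3.0):
    """Attempt a plain TCP connect; True only if the connection is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

On aim-nano1, one would paste the orted line from ps into
replica_endpoints() and call can_connect() on each result. A False result
would only suggest that something (a firewall on the freshly installed
machine, for example) is blocking the port; it would not prove it.
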

The other way round, it works without problems:
[jody_at_aim-nano1 ~] $ mpirun -np 1 --debug-daemons --host aim-plankton MPITest
Daemon [0,0,1] checking in as pid 9759 on host aim-plankton
[aim-plankton.uzh.ch:09759] [0,0,1] orted: received launch callback
[aim-plankton.uzh.ch]I am #0/1 global
[aim-plankton.uzh.ch:09759] [0,0,1] orted_recv_pls: received message from [0,0,0]
[aim-plankton.uzh.ch:09759] [0,0,1] orted_recv_pls: received exit

ssh works perfectly in both directions, with or without login.

My open-mpi setup worked well before. The changes since then were
an upgrade to open-mpi 1.2.5 on all machines and
a move from Fedora 6 to Fedora 8 on aim-plankton, the badly behaving machine.

Does anybody have a suggestion where I should start looking?

Thank you
 Jody