
Open MPI User's Mailing List Archives


Subject: [OMPI users] Two remote machines - asymmetric behavior
From: David Doria (daviddoria+openmpi_at_[hidden])
Date: 2009-08-03 08:59:15


I have three machines: mine (daviddoria) and two identical remote machines
(cloud3 and cloud6). Password-less ssh works between any pair. All three
machines are 32-bit and run Fedora 11. Open MPI was installed identically on
each, and both .bashrc and /etc/hosts are identical on each.

I wrote a test "hello world" program to ensure OpenMPI is behaving
correctly.

The output is exactly as expected; each node appears to be alive:

[doriad_at_daviddoria MPITest]$ mpirun -H cloud6,daviddoria,cloud3 -np 3 hello-mpi
Process 1 on daviddoria out of 3
Process 2 on cloud3 out of 3
Process 0 on cloud6 out of 3

I am trying to get a parallel application called Paraview working with these
three machines. Paraview is installed identically on each. As a test, I
wanted to get it working with two at a time first.

With cloud3 alone, everything goes smoothly. I tell Paraview to start the
server with

ssh cloud3 mpirun -H cloud3 pvserver

and to connect to the server on cloud3, and I get the following (expected)
output:

Listen on port: 11111

 Waiting for client...

 Client connected.

The same test on cloud6 alone also goes smoothly. I tell Paraview to start the
server with

ssh cloud6 mpirun -H cloud6 pvserver

and connect to the server on cloud6.

Now for the real test...

I tell Paraview to start the server with

ssh cloud6 mpirun -H cloud6,cloud3 -np 2 pvserver

and connect to the server on cloud6.

This also connects successfully. However, if I do the reverse:

ssh cloud3 mpirun -H cloud3,cloud6 -np 2 pvserver

and connect to the server on cloud3, the client retries for 60 seconds but
never connects. I just see a series of "Failed to connect to server on cloud3"
errors.
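
One check that might help narrow this down is to probe port 11111 (the port
pvserver reports in the output above) on each remote host while the server is
running. The following is only a rough sketch: the helper names port_open and
probe are made up for illustration, and it assumes the script is run from the
client machine and that pvserver is using its default port.

```python
import socket


def port_open(host, port=11111, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe(hosts, port=11111, timeout=3.0):
    """Map each host name to whether the given port accepted a connection."""
    return {host: port_open(host, port, timeout) for host in hosts}


if __name__ == "__main__":
    # Host names are taken from the setup described in this post.
    for host, ok in probe(["cloud3", "cloud6"]).items():
        print(f"{host}:11111 {'open' if ok else 'unreachable'}")
```

If the port turns out to be open on one host and unreachable on the other, that
would at least indicate which machine the server is actually listening on in
the failing case.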

Does anyone have any idea what could cause this asymmetric behavior?

Thanks,

David