I have three machines: mine (daviddoria) and two identical remote machines
(cloud3 and cloud6). I can ssh between any pair without a password. The
machines are all 32-bit, running Fedora 11. OpenMPI was installed identically
on each.
The .bashrc is identical on each. /etc/hosts is identical on each.
I wrote a test "hello world" program to ensure OpenMPI is behaving correctly.
The output is exactly as expected; each node seems to be alive:
[doriad@daviddoria MPITest]$ mpirun -H cloud6,daviddoria,cloud3 -np 3
Process 1 on daviddoria out of 3
Process 2 on cloud3 out of 3
Process 0 on cloud6 out of 3
I am trying to get a parallel application called Paraview working with these
three machines. Paraview is installed identically on each. As a test, I
wanted to get it working with two at a time first.
With cloud3, everything goes smoothly. That is, I tell Paraview to start the
server with
ssh cloud3 mpirun -H cloud3 pvserver
and to connect to the server on cloud3, and I get the following (expected)
output:
Listen on port: 11111
Waiting for client...
When I try the same thing on cloud6, it again goes smoothly.
(I tell Paraview to start the server with
ssh cloud6 mpirun -H cloud6 pvserver
and connect to the server on cloud6.)
Now for the real test...
I tell Paraview to start the server with
ssh cloud6 mpirun -H cloud6,cloud3 -np 2 pvserver
and connect to the server on cloud6.
This again connects successfully. However, if I do the reverse:
ssh cloud3 mpirun -H cloud3,cloud6 -np 2 pvserver
and connect to the server on cloud3,
it keeps trying for 60 seconds but can't connect. I just see a bunch
of "Failed to connect to server on cloud3" errors.
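For anyone who wants to reproduce this, one check that might help narrow it down is whether the pvserver port is reachable from the client machine at all. The following is only a minimal sketch in Python, assuming pvserver listens on port 11111 as in the output shown earlier; the helper name can_connect is hypothetical and not part of Paraview or OpenMPI.

```python
import socket


def can_connect(host, port=11111, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 11111 is assumed here because it is the port pvserver reported
    in the "Listen on port" message; adjust it if the server uses another.
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Running can_connect("cloud3") and can_connect("cloud6") from the client while the two-node server is started on each host in turn would at least show whether the failure is at the TCP level or somewhere later in the connection handshake.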
Does anyone have any idea what could cause this asymmetric behavior?