Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Two remote machines - asymmetric behavior
From: Tomislav Maric (tomislav.maric_at_[hidden])
Date: 2009-08-03 09:21:51


David Doria wrote:
> I have three machines: mine (daviddoria) and two identical remote
> machines (cloud3 and cloud6). I can password-less ssh between any pair.
> The machines are all 32bit running Fedora 11. OpenMPI was installed
> identically on each. The .bashrc is identical on each. /etc/hosts is
> identical on each.
>
>
> I wrote a test "hello world" program to ensure OpenMPI is behaving
> correctly.
>
>
> The output is exactly as expected, each node seems to be alive.
>
>
> [doriad_at_daviddoria MPITest]$ mpirun -H cloud6,daviddoria,cloud3 -np 3
> hello-mpi
> Process 1 on daviddoria out of 3
> Process 2 on cloud3 out of 3
> Process 0 on cloud6 out of 3
>
>
> I am trying to get a parallel application called Paraview working with
> these three machines. Paraview is installed identically on each. As a
> test, I wanted to get it working with two at a time first.
>
>
> With cloud3, everything goes smoothly, that is, I tell Paraview to start
> the server with
>
> ssh cloud3 mpirun -H cloud3 pvserver
>
> and to connect to the server on cloud3, and I get the following
> (expected) output:
>
>
> Listen on port: 11111
>
> Waiting for client...
>
> Client connected.
>
>
> When I try the same thing on cloud6, it again goes smoothly
>
> (I tell Paraview to start the server with
>
> ssh cloud6 mpirun -H cloud6 pvserver
>
> and connect to the server on cloud6)
>
>
> Now for the real test...
>
> I tell Paraview to start the server with
>
> ssh cloud6 mpirun -H cloud6,cloud3 -np 2 pvserver
>
> and connect to the server on cloud6
>
>
> This again connects successfully. However, if I do the reverse:
>
>
> ssh cloud3 mpirun -H cloud3,cloud6 -np 2 pvserver
>
> and connect to the server on cloud3
>
>
> it tries and tries for 60 seconds but it can't connect. I just see a
> bunch of "Failed to connect to server on cloud3" errors.
>
>
> Does anyone have any idea what could cause this asymmetric behavior?
>
>
> Thanks,
>
> David
>

I'm a newbie, so forgive me if I ask something stupid:

why are you running the ssh command before the mpirun command? I'm
interested in setting up a ParaView server on a LAN to post-process
OpenFOAM simulation data.

Just a beginner's comment: doesn't mpirun call ssh itself anyway? And if
pvserver is to be run on multiple machines and is built with Open MPI,
shouldn't

mpirun -np procNumber -H host1,host2,host3 pvserver

be enough to get it going, as with any other parallel program? Again,
please excuse my inexperience.
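
To make the question concrete, here is a minimal sketch of the kind of
command I have in mind. The helper name build_mpirun_cmd and the host
names host1, host2, host3 are made up for illustration; -np and -H are the
same mpirun options used in the commands quoted above. The helper only
prints the command line, it does not start anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI or ParaView): prints the mpirun
# command line for a given process count, host list, and program.
build_mpirun_cmd() {
    nprocs=$1   # total number of processes, passed to -np
    hosts=$2    # comma-separated host list, passed to -H
    shift 2     # remaining arguments are the program and its options
    # Only print the command; run it by hand once it looks right.
    echo "mpirun -np $nprocs -H $hosts $*"
}

# Placeholder host names, assumed for this example.
build_mpirun_cmd 3 host1,host2,host3 pvserver
```

If this is right, the printed line could be run directly on the first
machine, without wrapping it in an ssh call.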

Best regards,

Tomislav