Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Can't start program across network
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-03-13 16:27:12


On Mar 13, 2009, at 6:17 AM, Raymond Wan wrote:

> What doesn't work is:
>
> [On Y] mpirun --host Y,Z --np 2 uname -a
> [On Y] mpirun --host X,Y,Z --np 3 uname -a
>
> ...and similarly for machine Z. I can confirm that from any of the
> 3 machines, I can ssh to the other without typing in a password. I
> set up the RSA keys correctly [I think]. When I run the above
> commands, it just hangs. Adding "--verbose" doesn't produce any
> information...I don't know what it's doing. I had a longer running
> program than "uname" and I didn't see it appear on any of the
> machines. In fact [since it hangs], I don't see uname on "top",
> either. I do, however, see "mpirun" and "orted" on top, though.
>
> I guess some setup is missing that X has that the other two do not
> have. Any suggestions on how to find out the cause of this
> problem? Thank you!
>

Do you see "rsh" or "ssh" in the output of "ps -eadf" while mpirun is
hanging, perchance? If you do, what happens if you copy-n-paste those
command lines and run them manually?
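
For example, something along these lines on the machine where mpirun is
stuck would narrow the "ps -eadf" output down to the launcher processes.
This is only a rough sketch: the function names filter_launchers and
show_launchers are made up for illustration, and the pattern assumes the
launcher shows up in the process list as a plain "ssh" or "rsh" command
(with or without a leading path).

```shell
# Keep only lines whose command is ssh or rsh (sshd and similar are skipped).
# Reads process-list text on stdin; hypothetical helper, not part of Open MPI.
filter_launchers() {
    grep -E '(^|[[:space:]/])(ssh|rsh)([[:space:]]|$)'
}

# Run the filter against the live process list while mpirun is hanging.
show_launchers() {
    ps -eadf | filter_launchers
}
```

Calling show_launchers in a second terminal while mpirun hangs should print
any ssh/rsh lines together with their full arguments; those are the command
lines to copy and run by hand. If nothing is printed, mpirun has probably
not launched (or has already finished with) its remote shell.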

-- 
Jeff Squyres
Cisco Systems