I'm having a problem running mpirun and would appreciate suggestions on how to find the cause. I have 3 machines that I can use: X, Y, and Z. The important thing is that X differs from Y and Z (installed software, version of Linux, etc.), while Y and Z have identical software installations.
All of this works:
[On X] mpirun --host Y,Z --np 2 uname -a
[On X] mpirun --host X,Y,Z --np 3 uname -a
[On Y] mpirun --host Y --np 2 uname -a
(and likewise, other combinations)
What doesn't work is:
[On Y] mpirun --host Y,Z --np 2 uname -a
[On Y] mpirun --host X,Y,Z --np 3 uname -a
...and similarly when launching from machine Z. I can confirm that from any of the 3 machines, I can ssh to the other two without typing in a password, so I think I set up the RSA keys correctly.
When I run the failing commands, mpirun just hangs. Adding "--verbose" doesn't produce any output, so I don't know what it's doing. I also tried a longer-running program than "uname" and never saw it appear on any of the machines. While it hangs, "top" shows no "uname" process either, but it does show "mpirun" and "orted".
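The passwordless ssh claim above could be re-checked for every ordered pair of machines with something like the following. This is only a minimal sketch: X, Y, and Z are placeholders for the real host names, and list_checks, HOSTS, OPTS, and RUN are names made up for this example. BatchMode=yes makes ssh fail instead of prompting for a password, and ConnectTimeout keeps a blocked connection from hanging. Note that each check first logs in to the source machine from wherever the script runs, so that hop has to work as well.

```shell
#!/bin/sh
# Placeholder host names; replace X Y Z with the real machines.
HOSTS="${HOSTS:-X Y Z}"
# Fail rather than prompt for a password, and give up after 5 seconds.
OPTS="-o BatchMode=yes -o ConnectTimeout=5"

# Print one nested ssh command per ordered (source, destination) pair.
# "ssh SRC ssh DST true" logs in to SRC and, from there, tries to reach DST.
list_checks() {
    for src in $HOSTS; do
        for dst in $HOSTS; do
            if [ "$src" != "$dst" ]; then
                echo "ssh -n $OPTS $src ssh $OPTS $dst true"
            fi
        done
    done
}

# With RUN=1 the commands are executed; otherwise they are only printed.
if [ "${RUN:-0}" = "1" ]; then
    list_checks | while read -r cmd; do
        if $cmd >/dev/null 2>&1; then
            echo "ok:   $cmd"
        else
            echo "FAIL: $cmd"
        fi
    done
else
    list_checks
fi
```

Run without arguments, it only prints the six commands so they can be inspected; with RUN=1 set in the environment it executes them and marks each pair ok or FAIL. A pair that passes here only shows that ssh itself works in that direction; it does not rule out other differences between the machines.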
My guess is that X has some piece of setup that the other two are missing. Any suggestions on how to find out the cause of this problem? Thank you!
PS: It has been a long time since I got X working, so I might have done something there that I no longer remember, but I don't recall seeing this problem before.