I am trying to get OpenMPI running on my home network, which has two
machines, t61 and quad, both running SuSE 11. I'm using the "hello_c"
program from the examples as a test. It runs fine on each machine
individually, with whatever number of processes I specify. However,
when I try to run across both machines, it hangs.
If I start from t61 with the command "mpiexec -host t61,quad -np 2 hello"
then I see that command when I do a ps -ax on t61. On quad I see
"orted --daemonize (long parameter string)". Both of them seem to be
silently waiting on some event, but I've no idea what.
Both machines are running OpenMPI 1.4.2 (compiled from the same tar file),
installed in /opt/openmpi. The executables are in the same user/path
on each machine (/home/me/src/openmpi/examples), and path,
LD_LIBRARY_PATH, and so on all seem to be the same on both.
PS: Also, may I suggest putting something in the FAQ pointing out
that the environment vars need to be set in .tcshrc, not .login?
It would have saved me several hours.
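
For illustration, here is a minimal sketch of the kind of ~/.tcshrc lines
meant here. It assumes the install has the usual bin and lib subdirectories
under /opt/openmpi; adjust if the layout differs.

```tcsh
# ~/.tcshrc is read by every tcsh, including non-login shells
# started remotely to launch a command on the other machine.
# Assumption: binaries in /opt/openmpi/bin, libraries in /opt/openmpi/lib.
set path = (/opt/openmpi/bin $path)
if ($?LD_LIBRARY_PATH) then
    setenv LD_LIBRARY_PATH /opt/openmpi/lib:${LD_LIBRARY_PATH}
else
    setenv LD_LIBRARY_PATH /opt/openmpi/lib
endif
```

.login is only read by login shells, so settings placed there are not
seen when a command is started on the other machine.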