I am getting an error (details follow) in the simplest possible test scenario:
Two identical regular Dell PCs are connected back-to-back via an Ethernet switch on 10/100 Ethernet. Both run Fedora Core 4. The same version (1.1) of Open MPI is compiled and installed on both of them *without* a --prefix option (i.e. installed in the default location of /usr/local).
The hostfile on both machines is the same:
I can run Open MPI on either of these two machines by forking two processes:
mpirun -np2 osu_acc_latency <------ This runs fine on either of the two machines.
However, when I try to launch the same program across the two machines, I get an error:
mpirun --hostfile ~/hostfile -np2 /home/durga/openmpi-1.1/osu_benchmarks/osu_acc_latency
/home/durga/openmpi-1.1/osu_benchmarks/osu_acc_latency: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory.
However, the file *does exist* in /usr/local/lib:
ls -l /usr/local/lib/libmpi.so.0
libmpi.so.0 -> libmpi.so.0.0.0
I have also tried adding /usr/local/lib to my LD_LIBRARY_PATH on *both* machines, to no avail.
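For reference, a minimal sketch of how the LD_LIBRARY_PATH setting could be double-checked. The helper name find_lib_in_path and the host name node2 are hypothetical placeholders and are not part of the setup described above. The remote check is left commented out; it is there because a non-interactive ssh shell does not necessarily read the same startup files as a login shell, so the variable may differ from what an interactive session shows.

```shell
# Hypothetical helper: print the first directory in a colon-separated
# search path that contains the named library file.
# Returns 0 if found, 1 otherwise. Assumes a POSIX sh or bash shell.
find_lib_in_path() {
    lib=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Local check against the current environment.
find_lib_in_path libmpi.so.0 "${LD_LIBRARY_PATH:-}" \
    || echo "libmpi.so.0 not found via LD_LIBRARY_PATH"

# Remote check (node2 is a placeholder host name). This shows the value
# seen by a non-interactive shell on the other machine:
# ssh node2 'echo "$LD_LIBRARY_PATH"'
```

If the local check prints /usr/local/lib but the remote echo prints an empty line, the variable is only being set for interactive sessions.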