On 25 October 2007 at 07:54, Jeff Squyres wrote:
| We will not dlopen libibverbs.so directly -- we will only dlopen the
| mca_btl_openib.so file. The dynamic linker will automatically open
| all of its dependencies. If those dependencies cannot be found /
| symbols cannot be resolved, the dynamic linker will fail the dlopen
| of libibverbs.
|
| Can you run "ldd mca_btl_openib.so" on your head node and your
| compute nodes? See if there's a difference in the output. I think
| this is the next step in this troubleshooting process...
Sure, good idea.
head and build machine:
$ ldd /usr/lib/openmpi/mca_btl_openib.so
linux-gate.so.1 => (0xffffe000)
libibverbs.so.1 => /usr/lib/libibverbs.so.1 (0xb7f42000)
libpthread.so.0 => /lib/libpthread.so.0 (0xb7f2b000)
libmpi.so.0 => /usr/lib/libmpi.so.0 (0xb7ea6000)
libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0xb7e52000)
libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0xb7dfb000)
libdl.so.2 => /lib/libdl.so.2 (0xb7df7000)
libnsl.so.1 => /lib/libnsl.so.1 (0xb7de1000)
libutil.so.1 => /lib/libutil.so.1 (0xb7ddd000)
libm.so.6 => /lib/libm.so.6 (0xb7db7000)
libc.so.6 => /lib/libc.so.6 (0xb7c8a000)
/lib/ld-linux.so.2 (0x80000000)
compute node:
$ ldd /usr/lib/openmpi/mca_btl_openib.so
/usr/lib/openmpi/mca_btl_openib.so: /usr/lib/libibverbs.so.1: version `IBVERBS_1.1' not found (required by /usr/lib/openmpi/mca_btl_openib.so)
linux-gate.so.1 => (0xffffe000)
libibverbs.so.1 => /usr/lib/libibverbs.so.1 (0xb7ee6000)
libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7ecf000)
libmpi.so.0 => /usr/lib/libmpi.so.0 (0xb7e4a000)
libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0xb7df6000)
libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0xb7d9f000)
libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7d9b000)
libnsl.so.1 => /lib/tls/i686/cmov/libnsl.so.1 (0xb7d84000)
libutil.so.1 => /lib/tls/i686/cmov/libutil.so.1 (0xb7d80000)
libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb7d58000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7c17000)
libsysfs.so.2 => /lib/libsysfs.so.2 (0xb7c0c000)
/lib/ld-linux.so.2 (0x80000000)
Bingo!! And I stand caught with an inconsistent package install. Tsk tsk.
I *think* this goes back to a point before "we" (as in the few folks looking
after the .deb for Open MPI) had learned about the 'btl ^openib' option: I
had become so disenchanted with the 'noisy' message that I hacked
libibverbs. That may explain the head node. Let me get that one back to the
pristine Ubuntu / Debian package, and then possibly rebuild the Open MPI
package there so that its Depends come out right.
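
For anyone wanting to repeat the comparison across more nodes without
eyeballing it, here is a minimal sketch in Python. It assumes the ldd output
from each machine has already been captured as text; the names parse_ldd and
compare_ldd are made up for illustration and are not part of Open MPI or any
packaged tool.

```python
import re

# Matches "libfoo.so.1 => /path/libfoo.so.1 (0x...)"; the path may be empty,
# as it is for linux-gate.so.1.
_DEP = re.compile(r"^\s*(\S+)\s+=>\s*(\S*?)\s*\(0x[0-9a-fA-F]+\)\s*$")
# Matches the dynamic linker's complaint about a missing symbol version.
_MISSING = re.compile(r"version `([^']+)' not found")


def parse_ldd(text):
    """Return (deps, missing_versions) parsed from raw ldd output."""
    deps, missing = {}, []
    for line in text.splitlines():
        m = _DEP.match(line)
        if m:
            deps[m.group(1)] = m.group(2)  # library name -> resolved path
            continue
        m = _MISSING.search(line)
        if m:
            missing.append(m.group(1))
    return deps, missing


def compare_ldd(head_text, node_text):
    """Summarise how the node's ldd output differs from the head's."""
    head, _ = parse_ldd(head_text)
    node, missing = parse_ldd(node_text)
    common = set(head) & set(node)
    return {
        "only_on_head": sorted(set(head) - set(node)),
        "only_on_node": sorted(set(node) - set(head)),
        "path_differs": sorted(n for n in common if head[n] != node[n]),
        "missing_versions": missing,
    }
```

Fed the two listings above, this should report libsysfs.so.2 as present only
on the compute node and IBVERBS_1.1 as a missing version there. The libc
family also shows up under differing paths (/lib versus /lib/tls/i686/cmov),
which is probably harmless here.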
Thanks so much for your help and patience on this.
Dirk
--
Three out of two people have difficulties with fractions.