Open MPI User's Mailing List Archives

Subject: [OMPI users] Can't use tcp instead of openib/infinipath
From: Bill Broadley (bill_at_[hidden])
Date: 2008-07-19 07:06:57


I built openmpi-1.2.6 on CentOS 5.2 with gcc-4.3.1.

I did a tar xvzf, cd openmpi-1.2.6, mkdir obj, cd obj
(with gcc-4.3.1/bin first in my path), then:
../configure --prefix=/opt/pkg/openmpi-1.2.6 --enable-shared --enable-debug

If I look in config.log I see:
MCA_btl_ALL_COMPONENTS=' self sm gm mvapi mx openib portals tcp udapl'
MCA_btl_DSO_COMPONENTS=' self sm openib tcp'

So both openib and tcp are available, and each lists many parameters under:
ompi_info --param btl tcp
ompi_info --param btl openib

Yet when I run an MPI program I can't get it to use TCP:
# which mpirun
/opt/pkg/openmpi-1.2.6/bin/mpirun
# mpirun -mca btl ^openib -np 2 -machinefile m ./relay 1
compute-0-1.local compute-0-0.local
size= 1, 131072 hops, 2 nodes in 0.304 sec ( 2.320 us/hop) 1683 KB/sec

Or if I try the inverse:
# mpirun -mca btl self,tcp -np 2 -machinefile m ./relay 1
compute-0-1.local compute-0-0.local
size= 1, 131072 hops, 2 nodes in 0.313 sec ( 2.386 us/hop) 1637 KB/sec

2.3 us per hop is definitely faster than GigE, so these runs apparently
aren't going over TCP on the Ethernet. I don't have IP over IB set up:
ifconfig -a shows ib0, but it has no IP address.
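
For reference, here is a small sketch of the arithmetic behind that claim, using the hop count and elapsed times from the two relay runs quoted above. The GigE figure in it is my own assumption (a deliberately generous lower bound for MPI-over-TCP latency on gigabit Ethernet), not something I measured on this cluster.

```python
# Per-hop latency implied by the two relay runs quoted above.
HOPS = 131072  # hop count printed by ./relay

# ASSUMPTION: a generous lower bound for TCP latency over GigE, in
# microseconds. Real GigE numbers are usually well above this.
ASSUMED_GIGE_FLOOR_US = 20.0


def us_per_hop(elapsed_s, hops=HOPS):
    """Average time per hop, in microseconds."""
    return elapsed_s / hops * 1e6


# Elapsed wall-clock times (seconds) as printed by the two runs.
runs = {"btl ^openib": 0.304, "btl self,tcp": 0.313}
latencies = {label: us_per_hop(t) for label, t in runs.items()}

for label, lat in latencies.items():
    verdict = "too fast for GigE" if lat < ASSUMED_GIGE_FLOOR_US else "plausible"
    print(f"{label}: {lat:.2f} us/hop ({verdict})")
```

The results differ from relay's own us/hop figures in the last digit or so, since the elapsed times above are already rounded to three decimals. Either way, both runs come in roughly an order of magnitude under the assumed floor.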

I removed all other openib implementations (infinipath came with one) before I
compiled, and the binary seems to be linked against the right libraries:
# ldd ./relay
        libmpi.so.0 => /opt/pkg/openmpi-1.2.6/lib/libmpi.so.0 (0x00002aaaaacc7000)
        libopen-rte.so.0 => /opt/pkg/openmpi-1.2.6/lib/libopen-rte.so.0 (0x00002aaaaafb5000)
        libopen-pal.so.0 => /opt/pkg/openmpi-1.2.6/lib/libopen-pal.so.0 (0x00002aaaab23d000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00002aaaab4b2000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x00002aaaab6b6000)
        libutil.so.1 => /lib64/libutil.so.1 (0x00002aaaab8ce000)
        libm.so.6 => /lib64/libm.so.6 (0x00002aaaabad2000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00002aaaabd55000)
        libc.so.6 => /lib64/libc.so.6 (0x00002aaaabf6f000)
        /lib64/ld-linux-x86-64.so.2 (0x00002aaaaaaab000)

Can anyone suggest what to look into?