Scott Shaw wrote:
> Hi, I hope this is the right forum for my questions. I am running into
> a problem when scaling >512 cores on an infiniband cluster which has
> 14,336 cores. I am new to openmpi and trying to figure out the right
> -mca options to pass to avoid the "mca_oob_tcp_peer_complete_connect:
> connection failed:" on a cluster which has infiniband HCAs and OFED
> v1.3GA release. Other MPI implementations like Intel MPI and mvapich
> work fine using uDAPL or VERBs IB layers for MPI communications.
Did you have a chance to see this FAQ -
> I find it difficult to understand which network interface or IB layer
> is being used. When I explicitly state not to use eth0,lo,ib1, or ib1:0
> interfaces with the cmdline option "-mca oob_tcp_exclude" openmpi will
> continue to probe these interfaces. For all MPI traffic openmpi should
> use IB0 which is the 10.148 network. But with debugging enabled I see
> references trying the 10.149 network which is IB1. Below is the
> ifconfig network device output for a compute node.
> 1. Is there a way to determine which network device is being used and not
> have openmpi fallback to another device? With Intel MPI or HP MPI you
> can state not to use a fallback device. I thought "-mca
> oob_tcp_exclude" would be the correct option to pass but I may be wrong.
If you want to use IB verbs, you may specify:
-mca btl sm,self,openib
sm - shared memory
self - self communication (loopback)
openib - IB communication (IB verbs)
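
As a rough sketch of how that option fits into a full command line, the small shell helper below prints (rather than runs) an mpiexec invocation restricted to the openib, sm and self BTLs. The helper name build_mpiexec_cmd and its arguments are made up for illustration, and adding btl_base_verbose (with the value 30) to log which BTL components get selected is my assumption about what would be useful here; check ompi_info on your install for the parameters your version actually supports.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): prints an mpiexec command
# line that limits MPI traffic to the openib, sm and self BTLs.
build_mpiexec_cmd() {
    np="$1"        # number of ranks
    hostfile="$2"  # machinefile listing the compute nodes
    app="$3"       # MPI application to launch

    # BTL components are comma-separated. tcp is deliberately left out,
    # so MPI traffic has no TCP BTL to fall back to.
    btls="openib,sm,self"

    # btl_base_verbose 30 is an assumed debugging aid: it asks the BTL
    # framework to report component selection at startup.
    echo "mpiexec -mca btl $btls -mca btl_base_verbose 30 -machinefile $hostfile -np $np $app"
}

# Print the command for inspection; run it by hand once it looks right.
build_mpiexec_cmd 1024 mpd.hosts ./test_ompi
```

Because tcp is not in the list, MPI point-to-point traffic should not fall back to TCP. Note that the oob_tcp_* parameters apply to the out-of-band channel the runtime uses for startup and control messages, not to MPI data, so some TCP traffic is expected even when openib carries the MPI communication.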
> 2. How can I determine infiniband openib device is actually being used?
> When running an MPI app I continue to see counters for in/out packets at
> a tcp level increasing when it should be using the IB RDMA device for
> all MPI comms over the IB0 or mthca0 device. OpenMPI was bundled with
> OFED v1.3 so I am assuming the openib interface should work. Running
> ompi_info shows btl_open_* references.
> /usr/mpi/openmpi-1.2-2/intel/bin/mpiexec -mca
> btl_openib_warn_default_gid_prefix 0 -mca oob_tcp_exclude
> eth0,lo,ib1,ib1:0 -mca btl openib,sm,self -machinefile mpd.hosts.$$ -np
> 1024 ~/bin/test_ompi < input1