Open MPI User's Mailing List Archives

Subject: [OMPI users] QLogic Infiniband : Run switch from ib0 to eth0
From: Thierry LAMOUREUX (thierry.lamoureux_at_[hidden])
Date: 2011-03-10 14:30:19


Hello,

We have recently enhanced our network with InfiniBand modules on a six-node
cluster.

We have installed all the OFED drivers related to our hardware.

We have set the network IPs as follows (a quick check is sketched after the list):
- eth: 192.168.1.0 / 255.255.255.0
- ib: 192.168.70.0 / 255.255.255.0
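As a quick sanity check that each interface sits on its planned subnet,
something like this can be run on every node (the interface names ib0/eth0
are assumed):

    # expect an inet line in 192.168.70.0/24 here
    ip addr show ib0
    # expect an inet line in 192.168.1.0/24 here
    ip addr show eth0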

After the first tests, everything seemed good: the IB interfaces ping each
other, and ssh and other kinds of exchanges over IB work well.

Then we started to run our jobs through Open MPI (built with the --with-openib
option) and our first results were very bad.
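For reference, the build was configured roughly as follows (the install
prefix is just an example):

    # build Open MPI with openib (verbs) support
    ./configure --prefix=/opt/openmpi --with-openib
    make all install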

After investigation, our system shows the following behaviour (a way to
observe this from the interface counters is sketched below):
- the job starts over the ib network (a few packets are sent)
- the job then switches to the eth network (all subsequent packets are sent to those interfaces)
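One way to watch where the traffic actually goes is to compare the kernel
byte counters before and after a run (interface names assumed; note that
native openib/verbs traffic bypasses the ib0 IPoIB counters, so growth on
eth0 is the telltale sign of a TCP fallback):

    # snapshot the counters, run the job, then read them again
    cat /sys/class/net/ib0/statistics/tx_bytes
    cat /sys/class/net/eth0/statistics/tx_bytes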

We never specified the IP addresses of our eth interfaces.

We tried to launch our jobs with the following options (a stricter variant is
sketched after this list):
- mpirun -hostfile hostfile.list -mca btl openib,self /common_gfs2/script-test.sh
- mpirun -hostfile hostfile.list -mca btl openib,sm,self /common_gfs2/script-test.sh
- mpirun -hostfile hostfile.list -mca btl openib,self -mca btl_tcp_if_exclude lo,eth0,eth1,eth2 /common_gfs2/script-test.sh
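A stricter variant we could also try, to rule out the TCP BTL entirely (the
^ prefix is standard Open MPI syntax for deselecting a component):

    # with tcp forbidden, any fallback to ethernet should fail loudly
    # instead of silently switching interfaces
    mpirun -hostfile hostfile.list -mca btl ^tcp /common_gfs2/script-test.sh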

The final behaviour remains the same: the job is initiated over ib and then
runs over eth.

We ran the OSU performance tests (osu_bw and osu_latency) and the results
were not bad (see attached files).
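For completeness, the benchmarks were launched roughly like this (binary
paths illustrative; both are two-process tests between a pair of nodes):

    mpirun -hostfile hostfile.list -np 2 /common_gfs2/osu_bw
    mpirun -hostfile hostfile.list -np 2 /common_gfs2/osu_latency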

We have tried plenty of different things, but we are stuck: we don't get any
error messages...

Thanks in advance for your help.

Thierry.