
Subject: Re: [OMPI users] Infiniband performance Problem and stalling
From: Yevgeny Kliteynik (kliteyn_at_[hidden])
Date: 2012-09-09 04:18:03


Randolph,

On 9/7/2012 7:43 AM, Randolph Pullen wrote:
> Yevgeny,
> The ibstat results:
> CA 'mthca0'
> CA type: MT25208 (MT23108 compat mode)

What you have is an InfiniHost III HCA, which is a 4x SDR card.
This card has a theoretical signaling rate of 10 Gb/s, which after IB's 8b/10b bit encoding comes to about 1 GB/s of data.
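As a quick sanity check (a worked sketch, not part of the original mail), the SDR arithmetic comes out like this:

```python
# Rough arithmetic for 4x SDR InfiniBand effective bandwidth.
# Assumption: 8b/10b line encoding, so 80% of the signaling rate carries data.
signal_rate_gbps = 10.0                       # 4x SDR: 4 lanes x 2.5 Gb/s
data_rate_gbps = signal_rate_gbps * 8 / 10    # after 8b/10b encoding
data_rate_gbytes = data_rate_gbps / 8         # bits -> bytes

print(data_rate_gbytes)  # -> 1.0, i.e. 1 GB/s theoretical peak
```

This is why the 958 MB/s result further down the thread is already close to the card's limit.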

> And more interestingly, ib_write_bw:
> Conflicting CPU frequency values detected: 1600.000000 != 3301.000000
>
> What does Conflicting CPU frequency values mean?
>
> Examining the /proc/cpuinfo file however shows:
> processor : 0
> cpu MHz : 3301.000
> processor : 1
> cpu MHz : 3301.000
> processor : 2
> cpu MHz : 1600.000
> processor : 3
> cpu MHz : 1600.000
>
> Which seems oddly weird to me...

You need all the cores running at their highest clock to get better numbers.
It may be that the CPU power governor is not set for maximum performance on these machines.
Search for "Linux CPU scaling governor" for more information on this subject, or
contact your system administrator and ask them to take care of the CPU frequencies.
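As an illustration (a hypothetical helper, not part of the original exchange), a few lines of Python can flag the kind of per-core frequency mismatch shown in the /proc/cpuinfo output quoted above:

```python
def cpu_frequencies(cpuinfo_text):
    """Collect the 'cpu MHz' values from /proc/cpuinfo-style text."""
    freqs = []
    for line in cpuinfo_text.splitlines():
        if line.strip().startswith("cpu MHz"):
            freqs.append(float(line.split(":")[1]))
    return freqs

# Sample matching the output quoted in the mail above.
sample = """\
processor : 0
cpu MHz : 3301.000
processor : 1
cpu MHz : 3301.000
processor : 2
cpu MHz : 1600.000
processor : 3
cpu MHz : 1600.000
"""

freqs = cpu_frequencies(sample)
if len(set(freqs)) > 1:
    # Two cores at 3301 MHz and two at 1600 MHz -> governor problem.
    print("Conflicting CPU frequencies:", sorted(set(freqs)))
```

On a real system you would read the text from /proc/cpuinfo instead of the inlined sample.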

Once this is done, check all pairs of your machines and make sure that you get
good numbers with ib_write_bw.
Note that if you have one slower machine in the cluster, overall application
performance will suffer from it.
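A quick way to enumerate the host pairs to test (a sketch; "vh3" is a placeholder added for illustration, only vh1 and vh2 appear in the original thread):

```python
from itertools import combinations

hosts = ["vh1", "vh2", "vh3"]  # vh3 is a hypothetical third node

# One ib_write_bw run per unordered pair:
# start the server side on one host, then point the client at it.
for server, client in combinations(hosts, 2):
    print(f"on {server}: ib_write_bw")
    print(f"on {client}: ib_write_bw {server}")
```

With N hosts this yields N*(N-1)/2 pairs, which is enough to spot a single slow node.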
 
> > On 8/31/2012 10:53 AM, Randolph Pullen wrote:
> > > (reposted with consolidated information)
> > > I have a test rig comprising 2 i7 systems 8GB RAM with Mellanox III HCA 10G cards
> > > running Centos 5.7 Kernel 2.6.18-274
> > > Open MPI 1.4.3
> > > MLNX_OFED_LINUX-1.5.3-1.0.0.2 (OFED-1.5.3-1.0.0.2):
> > > On a Cisco 24 pt switch
> > > Normal performance is:
> > > $ mpirun --mca btl openib,self -n 2 -hostfile mpi.hosts PingPong
> > > results in:
> > > Max rate = 958.388867 MB/sec Min latency = 4.529953 usec
> > > and:
> > > $ mpirun --mca btl tcp,self -n 2 -hostfile mpi.hosts PingPong
> > > Max rate = 653.547293 MB/sec Min latency = 19.550323 usec

These numbers look fine - 958 MB/s on IB is close to the theoretical limit.
654 MB/s for IPoIB looks fine too.

> > > My problem is I see better performance under IPoIB than I do on native IB (RDMA_CM).

I don't see this in your numbers. What am I missing?

> > > My understanding is that IPoIB is limited to about 1G/s so I am at a loss to know why it is faster.

Again, the IPoIB performance I see here is under 1 GB/s.

> > > And this one produces similar run times but seems to degrade with repeated cycles:
> > > mpirun --mca btl_openib_eager_limit 64 --mca mpi_leave_pinned 1 --mca btl openib,self -H vh2,vh1 -np 9 --bycore prog
>
> You're running 9 ranks on two machines, but you're using IB for intra-node communication.
> Is this intentional? If not, you can add the "sm" btl and improve performance.

Also, don't forget to include "sm" btl if you have more than 1 MPI rank per node.
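For instance (a sketch based on the command quoted above, with the other flags unchanged), adding "sm" to the btl list looks like:

```shell
# Shared-memory (sm) btl lets ranks on the same node bypass the HCA.
mpirun --mca btl openib,sm,self -H vh2,vh1 -np 9 --bycore prog
```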

-- YK