Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Very poor performance with btl sm on twin nehalem servers with Mellanox Technologies MT26428 (ConnectX)
From: Oskar Enoksson (enok_at_[hidden])
Date: 2010-05-11 15:50:53

Sorry, the kernel is, not And I forgot to mention
the system is CentOS 5.4.

And further ... 25MB/s is after tweaking btl_sm_num_fifos=8 and
btl_sm_eager_limit=65536. Without those the rate is 9MB/s for 1MB
packets and 1.5MB/s for 10kB packets :-(
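
For anyone trying to reproduce the comparison, here is a minimal sketch of how the three configurations could be written out as mpirun command lines. The "-mca" parameter names and values are the ones quoted in this thread; the "./mpi_stress" path and the helper name mpirun_command are assumptions for illustration only, and mpi_stress's own options (packet length etc.) are left out because they are not given here. The script only prints the commands, it does not run them.

```python
import shlex


def mpirun_command(btl, extra_mca=None, np=8, program="./mpi_stress"):
    """Build an mpirun argument list for one BTL selection.

    `program` is a placeholder path (assumption); the benchmark's own
    options are intentionally omitted.
    """
    args = ["mpirun", "-np", str(np), "-mca", "btl", btl]
    # Each extra MCA parameter becomes its own "-mca name value" triple.
    for name, value in (extra_mca or {}).items():
        args += ["-mca", name, str(value)]
    args.append(program)
    return args


# The shared-memory tweaks mentioned in the message above.
SM_TWEAKS = {"btl_sm_num_fifos": 8, "btl_sm_eager_limit": 65536}

runs = {
    "openib only": mpirun_command("self,openib"),
    "sm, defaults": mpirun_command("self,sm"),
    "sm, tweaked": mpirun_command("self,sm", SM_TWEAKS),
}

for label, args in runs.items():
    print(f"{label}: {shlex.join(args)}")
```

Keeping each run as an argument list makes it easy to hand the same list to a job script later, so the only thing that differs between runs is the BTL selection and the sm parameters.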

On 05/11/2010 08:19 PM, Oskar Enoksson wrote:
> I have a cluster with two Intel Xeon Nehalem E5520 CPUs per server
> (quad-core, 2.27GHz). The interconnect is 4xQDR Infiniband (Mellanox
> ConnectX).
> I have compiled and installed OpenMPI 1.4.2. The kernel is and
> I have compiled the kernel myself. I use Grid Engine 6.2u5. Open MPI was
> compiled with "--with-libnuma --with-sge".
> The problem is that I get very bad performance unless I explicitly
> exclude the "sm" btl and I can't figure out why. I have tried searching
> the web and the OpenMPI mailing lists. I have seen reports about
> non-optimal performance, but my results are far worse than any other
> reports I have found.
> I run the "mpi_stress" program with different packet lengths. I run on a
> single server using 8 slots so that all eight cores on one server are
> occupied.
> When I use "-mca btl self,openib" I get pretty good results, between
> 450MB/s and 700MB/s depending on the packet lengths. When I use "-mca
> btl self,sm" or "-mca btl self,sm,openib" I just get 25MB/s to 30MB/s
> for packet length 1MB. For 10kB packets the results are around 5MB/s.
> Things get about 20% faster if I set "-mca paffinity_alone 1".
> What is going on? Any hints? I thought these CPUs had excellent
> SM-bandwidth over QuickPath. I expected several GB/s.
> Hyperthreading is enabled, if that is relevant. The locked-memory limit
> is 500MB and the stack limit is 64MB.
> Please help!
> Thanks
> /Oskar