Open MPI User's Mailing List Archives

From: Brian W. Barrett (bbarrett_at_[hidden])
Date: 2007-01-17 10:55:31


On Jan 17, 2007, at 2:39 AM, Gleb Natapov wrote:

> Hi Robin,
>
> On Wed, Jan 17, 2007 at 04:12:10AM -0500, Robin Humble wrote:
>>
>> so this isn't really an OpenMPI question (I don't think), but you guys
>> will have hit the problem if anyone has...
>>
>> basically I'm seeing wildly different bandwidths over InfiniBand 4x DDR
>> when I use different kernels.
>> I'm testing with netpipe-3.6.2's NPmpi, but a home-grown pingpong sees
>> the same thing.
>>
>> the default 2.6.9-42.0.3.ELsmp (and also sles10's kernel) gives ok
>> bandwidth (50% of peak I guess is good?) at ~10 Gbit/s, but a pile of
>> newer kernels (2.6.19.2, 2.6.20-rc4, 2.6.18-1.2732.4.2.el5.OFED_1_1(*))
>> all max out at ~5.3 Gbit/s.
>>
>> half the bandwidth! :-(
>> latency is the same.
> Try loading ib_mthca with the tune_pci=1 option on the kernels that
> are slow.

I can't speak to the kernels, but one note about bandwidth. By
default, Open MPI uses a pipelined pinning protocol for large message
transfers. It provides the best bandwidth when the application has
low buffer reuse, and it requires neither intercepts in the malloc
library nor mallopt to prevent libc from returning memory to the OS.
We have another mode that provides much better bandwidth when an
application has high buffer reuse (as NetPIPE does), which can be
enabled by adding "-mca mpi_leave_pinned 1" to the mpirun command
line.
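
As a rough sketch of the comparison suggested here, the two command
lines (default protocol versus leave pinned) could be generated as
follows. The benchmark path "./NPmpi", the process count of 2, and
the helper name build_mpirun_command are assumptions for
illustration; only the "-mca mpi_leave_pinned 1" option comes from
the note above.

```python
import shlex


def build_mpirun_command(benchmark, num_procs=2, leave_pinned=False):
    """Return the mpirun argument list for one benchmark run.

    benchmark and num_procs are illustrative assumptions; adjust them
    to match the local NetPIPE build and host setup.
    """
    argv = ["mpirun", "-np", str(num_procs)]
    if leave_pinned:
        # Enable the leave pinned mode described above.
        argv += ["-mca", "mpi_leave_pinned", "1"]
    argv.append(benchmark)
    return argv


if __name__ == "__main__":
    # Print both variants so they can be run back to back on each kernel.
    for leave_pinned in (False, True):
        command = build_mpirun_command("./NPmpi", leave_pinned=leave_pinned)
        print(shlex.join(command))
```

Running both variants on a fast kernel and a slow kernel would give
four bandwidth figures to compare.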

It would be interesting to know whether the bandwidth differences
appear when the leave pinned protocol is used. My guess is that they
will not, but one never knows. If they do disappear, there are a
couple of possible explanations for the slowdown in the default mode:
higher memory pinning times, an interaction that throws off our
pipeline, etc.

Brian

-- 
   Brian Barrett
   Open MPI Team, CCS-1
   Los Alamos National Laboratory