On Wed, Jan 17, 2007 at 08:55:31AM -0700, Brian W. Barrett wrote:
>On Jan 17, 2007, at 2:39 AM, Gleb Natapov wrote:
>> On Wed, Jan 17, 2007 at 04:12:10AM -0500, Robin Humble wrote:
>>> basically I'm seeing wildly different bandwidths over InfiniBand 4x DDR
>>> when I use different kernels.
>> Try to load ib_mthca with tune_pci=1 option on those kernels that are
>> slow.
>when an application has high buffer reuse (like NetPIPE), which can
>be enabled by adding "-mca mpi_leave_pinned 1" to the mpirun command
>line.
thanks! :-)
tune_pci=1 makes a huge difference at the top end, and
-mca mpi_leave_pinned 1 adds lots of midrange bandwidth.
latencies (~4us) and the low end performance are all unchanged.
see attached for details.
most curves are for 2.6.19.2 except the last couple (tagged as old)
which are for 2.6.9-42.0.3.ELsmp and for which tune_pci changes nothing.
why isn't tune_pci=1 the default I wonder?
files in /sys/module/ib_mthca/ tell me it's off by default in
2.6.9-42.0.3.ELsmp, but the results imply that it's on... maybe PCIe
handling is very different in that kernel.
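for anyone wanting to check the same thing, here's a minimal sketch of reading the parameter out of sysfs. the helper name and the two candidate paths are my assumptions: newer kernels usually put module parameters under parameters/, and I'm guessing older ones may keep them directly in the module directory.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return the parameter value as a string, or None if no file is found."""
    base = Path(sysfs_root) / module
    # Assumption: newer kernels expose parameters under parameters/,
    # while some older ones may put them directly in the module directory.
    for candidate in (base / "parameters" / param, base / param):
        if candidate.is_file():
            return candidate.read_text().strip()
    return None


if __name__ == "__main__":
    # Prints None when the module is not loaded or the file is absent.
    print("tune_pci =", read_module_param("ib_mthca", "tune_pci"))
```

this only reports what the module was loaded with; it says nothing about what the BIOS or the kernel's PCI code has already set up.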
is ~10Gbit the best I can expect from 4x DDR IB with MPI?
some docs @HP suggest up to 16Gbit (data rate) should be possible, and
I've heard that 13 or 14 has been achieved before. but those might be
verbs numbers, or maybe horsepower >> 4 cores of 2.66GHz core2 is
required?
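for reference, a rough sketch of where the 16Gbit figure comes from. it assumes the usual 4x DDR numbers (4 lanes, 5 Gbit/s signalling per lane, 8b/10b encoding), which aren't stated in this thread, and the names are just illustrative:

```python
# Rough link-rate arithmetic for 4x DDR InfiniBand.
# Assumptions (not from this thread): 5 Gbit/s signalling per DDR lane
# and 8b/10b line encoding (8 data bits per 10 bits on the wire).
LANES = 4
SIGNAL_GBIT_PER_LANE = 5.0
ENCODING_EFFICIENCY = 8 / 10


def data_rate_gbit(lanes=LANES, per_lane=SIGNAL_GBIT_PER_LANE,
                   eff=ENCODING_EFFICIENCY):
    """Peak data rate in Gbit/s, before any protocol overhead."""
    return lanes * per_lane * eff


peak = data_rate_gbit()
measured = 10.0  # the ~10Gbit seen with MPI above
print(f"peak data rate: {peak:.1f} Gbit/s")
print(f"measured / peak: {measured / peak:.0%}")
```

so the 16Gbit is a ceiling on the wire; packet headers, the PCIe bus and the MPI layer all come off that.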
>It would be interesting to know if the bandwidth differences appear
>when the leave pinned protocol is used. My guess is that they will
yeah, it definitely makes a difference in the 10kB to 10MB range.
at around 100kB there's 2x the bandwidth when using pinned.
thanks again!
> Brian Barrett
> Open MPI Team, CCS-1
> Los Alamos National Laboratory
how's Open MPI on Cell? :)
cheers,
robin