Open MPI User's Mailing List Archives

From: Robin Humble (rjh+openmpi_at_[hidden])
Date: 2007-01-18 07:17:13

On Thu, Jan 18, 2007 at 11:08:04AM +0200, Gleb Natapov wrote:
>On Thu, Jan 18, 2007 at 03:52:19AM -0500, Robin Humble wrote:
>> On Wed, Jan 17, 2007 at 08:55:31AM -0700, Brian W. Barrett wrote:
>> >On Jan 17, 2007, at 2:39 AM, Gleb Natapov wrote:
>> >> On Wed, Jan 17, 2007 at 04:12:10AM -0500, Robin Humble wrote:
>> >>> basically I'm seeing wildly different bandwidths over InfiniBand 4x DDR
>> >>> when I use different kernels.
>> >> Try to load ib_mthca with tune_pci=1 option on those kernels that are
>> >> slow.
>> >when an application has high buffer reuse (like NetPIPE), which can
>> >be enabled by adding "-mca mpi_leave_pinned 1" to the mpirun command
>> >line.
>> thanks! :-)
>> tune_pci=1 makes a huge difference at the top end, and
>Well this is broken BIOS then. Look here for more explanation:
>search for "tune_pci=1".

ok. thanks :-/

>> -mca mpi_leave_pinned 1 adds lots of midrange bandwidth.
>> latencies (~4us) and the low end performance are all unchanged.
>> see attached for details.
>> most curves are for except the last couple (tagged as old)
>> which are for 2.6.9-42.0.3.ELsmp and for which tune_pci changes nothing.
>> why isn't tune_pci=1 the default I wonder?
>> files in /sys/module/ib_mthca/ tell me it's off by default in
>> 2.6.9-42.0.3.ELsmp, but the results imply that it's on... maybe PCIe
>> handling is very different in that kernel.
>This is explained in the link above.

but (sorry to harp on about this) /sys/module/ib_mthca/tune_pci is 0
for 2.6.9-42.0.3.ELsmp.
and even if that's lying, then mthca_tune_pci() appears identically
invoked in mthca_main.c from both 2.6.9-42.0.3.ELsmp and
mthca_main.c is the only place in infiniband/hw/mthca that
pci_write_config_word() is called from, so you'd think that's got to be
how PCIe for IB was set up.

basically it's not clear to me how or if tune_pci is being set in
2.6.9-42.0.3.ELsmp, nor why it's any different to :-/
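For comparing what each kernel reports, here is a minimal sketch of reading the module parameter from sysfs. It assumes the value lives either at /sys/module/ib_mthca/tune_pci (the path quoted above) or under a parameters/ subdirectory, as newer kernels lay it out. The helper name read_module_param and the sysfs_root argument are made up for illustration. Note that this only shows the value the module was loaded with, not what the hardware registers actually hold.

```python
import os


def read_module_param(module, param, sysfs_root="/sys"):
    """Return a module parameter's value as a string, or None if not exposed.

    Tries the newer layout (/sys/module/<module>/parameters/<param>) first,
    then the older one (/sys/module/<module>/<param>).
    """
    candidates = [
        os.path.join(sysfs_root, "module", module, "parameters", param),
        os.path.join(sysfs_root, "module", module, param),
    ]
    for path in candidates:
        if os.path.isfile(path):
            with open(path) as f:
                return f.read().strip()
    return None
```

Calling read_module_param("ib_mthca", "tune_pci") on each kernel would give the two values to compare; a result of None would mean the parameter is not exposed at either path.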

maybe it's some other level in the kernel setting up PCIe differently?
but that would presumably be unrelated to OFED.

is there a way to check pci burst settings from userland? or BIOS?
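One possible userland check, sketched under assumptions: on Linux the raw PCI config space of a device is exposed as a binary file at /sys/bus/pci/devices/<address>/config, and for a PCIe device the Device Control register holds the Max Payload Size (bits 7:5) and Max Read Request Size (bits 14:12) fields, each encoded as 128 << value bytes. The function names below are hypothetical, the device address has to be filled in for the actual HCA, and reading beyond the first 64 bytes of the config file typically needs root. This does not cover the PCI-X read byte count, and it is an assumption that the read request size is the setting that matters here.

```python
import struct

PCI_STATUS = 0x06            # status register offset in the standard header
PCI_STATUS_CAP_LIST = 0x10   # status bit 4: a capability list is present
PCI_CAPABILITY_LIST = 0x34   # offset of the pointer to the first capability
PCI_CAP_ID_EXP = 0x10        # capability ID for PCI Express
PCI_EXP_DEVCTL = 0x08        # Device Control offset inside the PCIe capability


def pcie_request_sizes(config):
    """Return (max_payload_bytes, max_read_request_bytes), or None.

    `config` is the raw config space as bytes. None means no PCIe
    capability was found (or the data was too short to reach it).
    """
    if len(config) < 64:
        return None
    (status,) = struct.unpack_from("<H", config, PCI_STATUS)
    if not status & PCI_STATUS_CAP_LIST:
        return None
    ptr = config[PCI_CAPABILITY_LIST] & 0xFC
    seen = set()
    # Walk the linked list of capabilities, guarding against loops.
    while ptr >= 0x40 and ptr + 2 <= len(config) and ptr not in seen:
        seen.add(ptr)
        if config[ptr] == PCI_CAP_ID_EXP:
            if ptr + PCI_EXP_DEVCTL + 2 > len(config):
                return None
            (devctl,) = struct.unpack_from("<H", config, ptr + PCI_EXP_DEVCTL)
            max_payload = 128 << ((devctl >> 5) & 0x7)
            max_read_request = 128 << ((devctl >> 12) & 0x7)
            return max_payload, max_read_request
        ptr = config[ptr + 1] & 0xFC
    return None


def read_config(address):
    """Read raw config space for a PCI address such as '0000:01:00.0'."""
    with open("/sys/bus/pci/devices/%s/config" % address, "rb") as f:
        return f.read()
```

Running pcie_request_sizes(read_config(address)) under both kernels, with and without tune_pci=1, would show whether the read request size really differs. The same fields should also appear in the DevCtl line of lspci -vv output as MaxPayload and MaxReadReq.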

BTW, the card appears to be Voltaire and system is SGI xe (210 and 240)
if that helps. /sys/class/infiniband/mthca0/board_id is VLT0050010001
not that I'm blaming anyone! :-)