Open MPI User's Mailing List Archives

From: Robin Humble (rjh+openmpi_at_[hidden])
Date: 2007-01-19 02:54:33


On Thu, Jan 18, 2007 at 03:10:15PM +0200, Gleb Natapov wrote:
>On Thu, Jan 18, 2007 at 07:17:13AM -0500, Robin Humble wrote:
>> On Thu, Jan 18, 2007 at 11:08:04AM +0200, Gleb Natapov wrote:
>> >On Thu, Jan 18, 2007 at 03:52:19AM -0500, Robin Humble wrote:
>> >> On Wed, Jan 17, 2007 at 08:55:31AM -0700, Brian W. Barrett wrote:
>> >> >On Jan 17, 2007, at 2:39 AM, Gleb Natapov wrote:
>> >> >> On Wed, Jan 17, 2007 at 04:12:10AM -0500, Robin Humble wrote:
>> >> >>> basically I'm seeing wildly different bandwidths over InfiniBand 4x DDR
>> >> >>> when I use different kernels.
>> >> >> Try to load ib_mthca with tune_pci=1 option on those kernels that are
>> >> >> slow.
>...
>> >> tune_pci=1 makes a huge difference at the top end, and
>> >Well this is broken BIOS then. Look here for more explanation:
>> >https://staging.openfabrics.org/svn/openib/gen2/branches/1.1/ofed/docs/mthca_release_notes.txt
>> >search for "tune_pci=1".
>> ok. thanks :-/
>...
>BIOS should configure MaxReadReq to maximum value supported by chipset.
>Linux shouldn't touch this value at all.

thanks. I'm told there's a bug already open with our vendor on this
issue and they're talking to Intel.

looks similar to this thread:
  http://www.mail-archive.com/openib-general@openib.org/msg25305.html

>> is there a way to check pci burst settings from userland? or BIOS?
>You can see PCI settings with lspci. Newer versions of lspci decode this
>value for you; with older ones you'll have to dump PCI config space to a
>file and decode it yourself.

ah, yes, thanks. lspci -vvv can see MaxReadReq.
for the IB card:

 MaxReadReq (bytes)  kernel                          OS
               4096  2.6.16.21-0.8-smp               sles10
                512  2.6.9-42.0.3.ELsmp              centos4.4
                128  2.6.19.2                        centos4.4
                128  2.6.18-1.2732.4.2.el5.OFED_1_1  centos4.4
                128  2.6.20-rc4                      centos4.4
               4096  anything + tune_pci=1           centos4.4

so errr... I have no idea which is the correct one :-/
bandwidth is only crap with 128.
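
For anyone who wants to script the check above, here is a minimal Python sketch of both approaches Gleb mentions: pulling MaxReadReq out of decoded `lspci -vvv` text, and decoding it by hand from a raw register value. It assumes the standard PCI Express encoding, where bits 14:12 of the Device Control register hold the maximum read request size as 128 << n bytes. The function names and the sample text are hypothetical and only for illustration; none of them come from this thread.

```python
import re

# Matches the decoded field as printed by recent lspci versions,
# e.g. "MaxPayload 128 bytes, MaxReadReq 512 bytes".
_MAXREADREQ = re.compile(r"MaxReadReq\s+(\d+)\s+bytes")


def max_read_req_from_lspci(text):
    """Return every MaxReadReq value (in bytes) found in lspci -vvv text."""
    return [int(value) for value in _MAXREADREQ.findall(text)]


def decode_max_read_req(devctl):
    """Decode MaxReadReq (in bytes) from a raw 16-bit Device Control value.

    Bits 14:12 encode the size as 128 << n, so n = 0 means 128 bytes
    and n = 5 means 4096 bytes. Encodings 6 and 7 are reserved.
    """
    field = (devctl >> 12) & 0x7
    if field > 5:
        raise ValueError("reserved MaxReadReq encoding: %d" % field)
    return 128 << field


if __name__ == "__main__":
    # Illustrative snippet only, not captured from the machines above.
    sample = "\t\tMaxPayload 128 bytes, MaxReadReq 512 bytes\n"
    print(max_read_req_from_lspci(sample))
    print(decode_max_read_req(0x5000))
```

With an older lspci that does not decode the field, the raw Device Control value has to be read from the PCI Express capability in the dumped config space first; locating that capability is left out of this sketch.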

thanks for all your help.

cheers,
robin