On 08/09/2010 14:02, Jeff Squyres wrote:
> On Sep 3, 2010, at 3:38 PM, George Bosilca wrote:
>> However, going over the existing BTLs I can see that some BTLs do not correctly set this value:
>> BTL      Bandwidth       Auto-detect  Status
>> Elan     2000            NO           Correct
>> GM       250             NO           Doubtful
>> MX       2000/10000      YES (Mbs)    Correct (before the patch)
>> OFUD     800             NO           Doubtful
>> OpenIB   2000/4000/8000  YES (Mbs)    Correct (multiplied by the active_width)
>> Portals  1000            NO           Doubtful
>> SCTP     100             NO           Conservative value (correct)
>> Self     100             XXX          Correct (doesn't matter anyway)
>> SM       9000            NO           Correct
>> TCP      100             NO           Conservative value (correct)
>> UDAPL    225             NO           Incorrect
> Now that that patch has been rolled back out, did we come to a conclusion here?
> - OFUD: why do we still even have this?
> - Portals: does it matter if it gets it wrong? No one will ever multi-rail with it.
> - TCP: we can add auto-detect code for this (But doesn't have to be right away -- i.e., don't make 1.5.0 wait for it).
> - UDAPL: I don't think anyone will multi-rail udapl with anything.
> Was the *real* problem that Brice's OpenFabrics bandwidth was auto-detected incorrectly somehow?
The first problem came from IB not autodetecting at all by default and
using 800 Mbit/s instead. When forcing autodetect with MCA parameters,
the bandwidths are not perfect but not too bad. When forcing IB manually
to the right bandwidth value, I can tweak things as needed.
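
To illustrate why the 800 Mbit/s default hurts when multi-railing, here is a rough sketch. It assumes each BTL's share of a striped message is simply its advertised bandwidth divided by the sum over all BTLs in use; that rule is an assumption for illustration, not a copy of the actual Open MPI code. The function name is hypothetical, the MX figure (10000) comes from the table above, and the 8000 figure for IB is only an example of a manually forced value.

```python
def striping_weights(bandwidths):
    """Return each BTL's share of the traffic, proportional to its bandwidth.

    `bandwidths` maps a BTL name to its advertised bandwidth in Mbit/s.
    Assumption: weight = bandwidth / total bandwidth (illustrative only).
    """
    total = sum(bandwidths.values())
    if total <= 0:
        raise ValueError("at least one BTL must advertise a positive bandwidth")
    return {name: bw / total for name, bw in bandwidths.items()}


# IB left at the 800 Mbit/s default, next to MX at 10000 Mbit/s.
default_weights = striping_weights({"openib": 800, "mx": 10000})

# IB forced manually to 8000 Mbit/s (example value, not a measurement).
forced_weights = striping_weights({"openib": 8000, "mx": 10000})

for label, weights in (("default", default_weights), ("forced", forced_weights)):
    shares = ", ".join(f"{name}: {share:.1%}" for name, share in weights.items())
    print(f"{label:8s} {shares}")
```

Under that assumption, the default leaves IB with well under a tenth of the traffic, while the forced value brings it close to half, which matches the kind of tweaking described above.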