Subject: Re: [OMPI devel] [PATCH] fix mx btl_bandwidth
From: George Bosilca (bosilca_at_[hidden])
Date: 2010-09-03 11:33:42

On Sep 3, 2010, at 09:50 , Brice Goglin wrote:

> On 03/09/2010 15:38, George Bosilca wrote:
>> Jeff,
>> I think you will have to revert this patch, as btl_bandwidth __IS__ supposed to be in Mbs (megabits per second) and not MBs (megabytes per second). We usually talk about networks in Mbs (there is a pattern in Ethernet 1G/10G, Myricom 10G). In addition, the original design of the multi-rail was based on this assumption, and the multi-rail handling code deals with these values (at that level I don't think it really matters, but at least it needs consistent values from all BTLs).
>> However, going over the existing BTLs I can see that some BTLs do not correctly set this value:
>> BTL     Bandwidth       Auto-detect  Status
>> Elan    2000            NO           Correct
> 2000 looks strange to me. Last time I played with Elan4, bandwidth was
> 900MB/s or so.

Lucky you ;) The 2000 was the bandwidth of the last Elan device we had.

>> GM      250             NO           Doubtful
>> MX      2000/10000      YES (Mbs)    Correct (before the patch)
>> OFUD    800             NO           Doubtful
>> OpenIB  2000/4000/8000  YES (Mbs)    Correct (multiplied by the active_width)
> I found the problem when using both MX and OpenIB at the same time, so
> they can't be both wrong or both correct. IB was reporting 800, not
> 2000/4000/8000. Maybe because auto-detect didn't work and the default is
> wrong:
> btl_openib_mca.c:527: mca_btl_openib_module.super.btl_bandwidth = 800;
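
To illustrate why the values reported by different BTLs need to be consistent, here is a minimal sketch that assumes the multi-rail code splits traffic in proportion to each BTL's reported btl_bandwidth. The function name rail_weights and its signature are hypothetical and are not taken from the Open MPI sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not Open MPI code: compute each rail's share of
 * the traffic as reported_bandwidth / sum_of_reported_bandwidths.
 * All entries must use the same unit, otherwise the shares are skewed.
 * Returns 0 on success, -1 if every rail reports a bandwidth of zero. */
int rail_weights(const unsigned *bandwidth, size_t n, double *weights)
{
    unsigned long total = 0;

    assert(bandwidth != NULL && weights != NULL);

    for (size_t i = 0; i < n; i++) {
        total += bandwidth[i];
    }
    if (total == 0) {
        return -1; /* nothing to weight against */
    }
    for (size_t i = 0; i < n; i++) {
        weights[i] = (double)bandwidth[i] / (double)total;
    }
    return 0;
}
```

Under that assumption, an MX rail reporting 10000 next to an OpenIB rail reporting the default of 800 would give MX roughly 93% of the traffic, whatever the real capability of the IB link.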

It appears that OpenIB only auto-detects the bandwidth if the value is explicitly set to zero via the MCA parameters. As a last resort, you can set it manually, as for the other devices: use something like btl_openib_bandwidth_%dev_name% to set the bandwidth per device.
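
The behaviour described above can be sketched roughly as follows. This is an illustration only: the function effective_bandwidth_mbps and its parameters are hypothetical, and the real logic in the openib BTL may differ. It assumes a configured value of zero means "auto-detect", and that auto-detection multiplies a per-lane rate by the active width, as the table suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Open MPI code: pick the bandwidth (in Mb/s)
 * a BTL module would advertise.
 *   configured_mbps : value from the MCA parameter (0 = auto-detect)
 *   lane_rate_mbps  : assumed per-lane rate reported by the device
 *   active_width    : number of active lanes reported by the device */
uint32_t effective_bandwidth_mbps(uint32_t configured_mbps,
                                  uint32_t lane_rate_mbps,
                                  uint32_t active_width)
{
    if (configured_mbps != 0) {
        /* A non-zero setting (including a built-in default such as 800)
         * wins, so auto-detection never runs. */
        return configured_mbps;
    }
    assert(active_width > 0); /* auto-detect needs a valid link width */
    return lane_rate_mbps * active_width;
}
```

With these assumptions, a default of 800 is returned unchanged, while a configured value of 0 on a 4-lane link at 2000 Mb/s per lane yields 8000.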


> Brice