Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] trac #2034 : single rail openib btl shows better bandwidth than dual rail (12k< x < 128k)
From: George Bosilca (bosilca_at_[hidden])
Date: 2009-10-07 13:52:10


Don,

The problem is that a particular BTL has no knowledge of the other
selected BTLs, so allowing the BTLs to set this limit is not as easy
as it sounds. However, in the case where two identical BTLs are
selected and they are the only ones, this clearly is a better
approach.
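
To make the "identical BTLs only" case concrete, here is a minimal
sketch in C. Everything in it is an assumption for illustration: the
struct, the function name, and the idea of returning 0 for "no usable
value" are hypothetical and are not the Open MPI BTL or PML interfaces.
It only shows a limit being accepted when every selected BTL comes from
the same component and reports the same value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a selected BTL.
 * This is NOT the real Open MPI module structure. */
struct sketch_btl {
    const char *component;        /* component name, e.g. "openib" */
    size_t max_rdma_single_rget;  /* per-BTL value of the proposed limit, in bytes */
};

/*
 * Return a single limit only when all selected BTLs share the same
 * component name and the same per-BTL value. Otherwise return 0,
 * meaning "no single value is known to make sense here".
 */
size_t sketch_pml_single_rget_limit(const struct sketch_btl *btls, size_t n)
{
    assert(btls != NULL || n == 0);
    if (n == 0) {
        return 0;
    }
    for (size_t i = 1; i < n; i++) {
        if (strcmp(btls[i].component, btls[0].component) != 0 ||
            btls[i].max_rdma_single_rget != btls[0].max_rdma_single_rget) {
            return 0;  /* mixed BTLs: fall back to the existing behavior */
        }
    }
    return btls[0].max_rdma_single_rget;
}
```

With two "openib" entries that both report the same (made-up) value,
the function returns that value. With one "openib" and one "tcp" entry
it returns 0, which is the open question: what the PML should do when
the selected BTLs disagree.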

If this parameter is set at the PML level, I can't imagine how we
figure out the correct value depending on the BTLs.

I see this as a pretty strong restriction. How do we know we set a
value that makes sense?

   george.

On Oct 7, 2009, at 10:19 , Don Kerr wrote:

> George,
>
> Were you suggesting that the proposed new parameter
> "max_rdma_single_rget" be set by the individual btls similar to
> "btl_eager_limit"? Seems to me that is the better approach if I
> am to move forward with this.
>
> -DON
>
> On 10/06/09 11:14, Don Kerr wrote:
>> I agree there is probably a larger issue here, and yes, this is
>> somewhat specific, but seeing as OB1 appears to have multiple
>> protocols depending on the capabilities of the BTLs, I would not
>> characterize it as an IB centric problem. Maybe an OB1 RDMA problem.
>> There is a clear benefit from modifying this specific case. Do you
>> think it's not worth making incremental improvements while also
>> attacking a potential bigger issue?
>>
>> -DON
>>
>> On 10/06/09 10:52, George Bosilca wrote:
>>> Don,
>>>
>>> This seems a very IB centric problem (and solution) going up in
>>> the PML. Moreover, I noticed that independent of the BTL we have
>>> some problems with the multi-rail performance. As an example, on a
>>> cluster with 3 GB cards we get the same performance if I enable 2
>>> or 3. Didn't have time to look into the details, but this might be
>>> a more general problem.
>>>
>>> george.
>>>
>>> On Oct 6, 2009, at 09:51 , Don Kerr wrote:
>>>
>>>>
>>>> I intend to make the change suggested in this ticket to the
>>>> trunk. The change does not impact the single rail case (tested
>>>> with the openib btl) and does improve the dual rail case. Since it
>>>> does involve performance and I am adding an OB1 mca parameter, I
>>>> just wanted to check if anyone was interested or had an issue
>>>> with it before I committed the change.
>>>>
>>>> -DON
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>