
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] trac #2034 : single rail openib btl shows better bandwidth than dual rail (12k< x < 128k)
From: Don Kerr (Don.Kerr_at_[hidden])
Date: 2009-10-09 14:55:19


On 10/08/09 17:14, Don Kerr wrote:
> George,
>
> This is an interesting approach, although I am guessing the changes
> would be widespread and have many performance implications. Am I
> wrong in this belief?
My point here is that if this is going to have as many performance
implications as I think it will, it probably makes sense to investigate
the potentially bigger dual-rail issue and consider the "never share"
approach in that larger context.

-DON
>
>
> -DON
>
> On 10/08/09 11:45, George Bosilca wrote:
>> Don,
>>
>> I think we can do something slightly different that will satisfy
>> everybody.
>>
>> How about a solution where each BTL defines a limit below which a
>> message will never be shared with another BTL? We could have two such
>> limits, one for the send protocol and one for RMA (applying to either
>> PUT or GET operations, depending on the BTL's support and the PML's
>> decision).
>>
>> george.
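
A minimal sketch of the per-BTL "never share" limits George describes
above, assuming purely hypothetical names and types (this is not Open
MPI code): each BTL would advertise one threshold per protocol, and the
PML would keep any message below the relevant threshold on a single
rail instead of splitting it across BTLs.

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical per-BTL thresholds below which a message is never
     * shared with another BTL. */
    struct btl_share_limits {
        size_t send_never_share;   /* send protocol threshold (bytes) */
        size_t rdma_never_share;   /* RMA (PUT/GET) protocol threshold (bytes) */
    };

    /* Return true when a message of 'len' bytes should stay on a single
     * rail rather than being striped across BTLs. */
    static bool keep_on_single_rail(const struct btl_share_limits *limits,
                                    size_t len, bool is_rdma)
    {
        size_t limit = is_rdma ? limits->rdma_never_share
                               : limits->send_never_share;
        return len < limit;
    }

Under such a scheme the PML would fall back to its normal striping
logic only for messages at or above the threshold of the BTL it is
scheduling on.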
>>
>> On Oct 8, 2009, at 11:01, Don Kerr wrote:
>>
>>>
>>>
>>> On 10/07/09 13:52, George Bosilca wrote:
>>>> Don,
>>>>
>>>> The problem is that a particular BTL has no knowledge of the other
>>>> selected BTLs, so allowing the BTLs to set this limit is not as
>>>> easy as it sounds. However, in the case where two identical BTLs
>>>> are selected and they are the only ones, this clearly is a better
>>>> approach.
>>>>
>>>> If this parameter is set at the PML level, I can't imagine how we
>>>> would figure out the correct value for the selected BTLs.
>>>>
>>>> I see this as a pretty strong restriction. How do we know we have
>>>> set a value that makes sense?
>>> OK, I now see why setting this at the BTL level is difficult. And
>>> for the case of multiple BTLs of different component types, however
>>> unlikely that is, a PML-level setting will not be optimal for both.
>>>
>>> -DON
>>>
>>>
>>>>
>>>> george.
>>>>
>>>> On Oct 7, 2009, at 10:19, Don Kerr wrote:
>>>>
>>>>> George,
>>>>>
>>>>> Were you suggesting that the proposed new parameter
>>>>> "max_rdma_single_rget" be set by the individual BTLs, similar to
>>>>> "btl_eager_limit"? It seems to me that that is the better approach
>>>>> if I am to move forward with this.
>>>>>
>>>>> -DON
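
As a rough illustration of what "set by the individual BTLs" could look
like, modeled loosely on how BTL parameters such as btl_eager_limit are
registered. It assumes the mca_base_param_reg_int() interface available
in the Open MPI tree of that era; the parameter name, default value,
and helper function are made up for illustration and are not a proposed
patch.

    #include "opal/mca/base/mca_base_param.h"

    /* Hypothetical registration of a per-BTL single-rail RDMA limit.
     * The name and default below are illustrative only. */
    static int reg_max_rdma_single_rget(const mca_base_component_t *component,
                                        int *max_rdma_single_rget)
    {
        return mca_base_param_reg_int(component,
                                      "max_rdma_single_rget",
                                      "Messages of this size (bytes) or smaller are "
                                      "fetched with a single RGET on one rail instead "
                                      "of being striped across BTLs",
                                      false,                  /* not internal */
                                      false,                  /* not read-only */
                                      128 * 1024,             /* illustrative default */
                                      max_rdma_single_rget);
    }

Each BTL component could then register its own value, much as each BTL
registers its own eager limit, which sidesteps the question of picking
one PML-level value that suits different BTL types.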
>>>>>
>>>>> On 10/06/09 11:14, Don Kerr wrote:
>>>>>> I agree there is probably a larger issue here, and yes, this is
>>>>>> somewhat specific, but whereas OB1 appears to have multiple
>>>>>> protocols depending on the capabilities of the BTLs, I would not
>>>>>> characterize this as an IB-centric problem. Maybe an OB1 RDMA
>>>>>> problem. There is a clear benefit from modifying this specific
>>>>>> case. Do you think it's not worth making incremental improvements
>>>>>> while also attacking a potentially bigger issue?
>>>>>>
>>>>>> -DON
>>>>>>
>>>>>> On 10/06/09 10:52, George Bosilca wrote:
>>>>>>> Don,
>>>>>>>
>>>>>>> This seems like a very IB-centric problem (and solution) going up
>>>>>>> into the PML. Moreover, I noticed that, independent of the BTL, we
>>>>>>> have some problems with multi-rail performance. As an example, on
>>>>>>> a cluster with 3 GB cards we get the same performance whether I
>>>>>>> enable 2 or 3 of them. I haven't had time to look into the details,
>>>>>>> but this might be a more general problem.
>>>>>>>
>>>>>>> george.
>>>>>>>
>>>>>>> On Oct 6, 2009, at 09:51, Don Kerr wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> I intend to make the change suggested in this ticket on the
>>>>>>>> trunk. The change does not impact the single-rail case (tested
>>>>>>>> with the openib BTL) and does improve the dual-rail case. Since
>>>>>>>> it does involve performance and I am adding an OB1 MCA
>>>>>>>> parameter, I just wanted to check whether anyone was interested
>>>>>>>> or had an issue with it before I commit the change.
>>>>>>>>
>>>>>>>> -DON