Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] matching code rewrite in OB1
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-12-12 11:57:11


Gleb --

How about making a tarball with this patch in it that can be thrown at
everyone's MTT? (we can put the tarball on www.open-mpi.org somewhere)

On Dec 11, 2007, at 4:14 PM, Richard Graham wrote:

> I will reiterate my concern. The code that is there now is mostly nine
> years old (with some mods made when it was brought over to Open MPI).
> It took about 2 months of testing on systems with 5-13 way network
> parallelism to track down all KNOWN race conditions. This code is at
> the center of MPI correctness, so I am VERY concerned about changing
> it w/o some very strong reasons. Not opposed, just very cautious.
>
> Rich
>
>
> On 12/11/07 11:47 AM, "Gleb Natapov" <glebn_at_[hidden]> wrote:
>
>> On Tue, Dec 11, 2007 at 08:36:42AM -0800, Andrew Friedley wrote:
>>> Possibly, though I have results from a benchmark I've written
>>> indicating the reordering happens at the sender. I believe I found
>>> it was due to the QP striping trick I use to get more bandwidth --
>>> if you back down to one QP (there's a define in the code you can
>>> change), the reordering rate drops.
>> Ah, OK. My assumption was just from looking at the code, so I may be
>> wrong.
>>
>>>
>>> Also I do not make any recursive calls to progress -- at least not
>>> directly in the BTL; I can't speak for the upper layers. The reason
>>> I do many completions at once is that it is a big help in turning
>>> around receive buffers, making it harder to run out of buffers and
>>> drop frags. I want to say there was some performance benefit as
>>> well but I can't say for sure.
>> Currently the upper layers of Open MPI may call the BTL progress
>> function recursively. I hope this will change some day.
>>
>>>
>>> Andrew
>>>
>>> Gleb Natapov wrote:
>>>> On Tue, Dec 11, 2007 at 08:03:52AM -0800, Andrew Friedley wrote:
>>>>> Try UD; frags are reordered at a very high rate, so it should be a
>>>>> good test.
>>>> Good idea, I'll try this. BTW I think the reason for such a high
>>>> rate of reordering in UD is that it polls for MCA_BTL_UD_NUM_WC
>>>> completions (500) and processes them one by one, and if the
>>>> progress function is called recursively, the next 500 completions
>>>> will be reordered relative to the previous completions (reordering
>>>> happens on the receiver, not the sender).
>>>>
>>>>> Andrew
>>>>>
>>>>> Richard Graham wrote:
>>>>>> Gleb,
>>>>>> I would suggest that before this is checked in, it be tested on a
>>>>>> system that has N-way network parallelism, where N is as large as
>>>>>> you can find. This is a key bit of code for MPI correctness, and
>>>>>> out-of-order operations will break it, so you want to maximize
>>>>>> the chance for such operations.
>>>>>>
>>>>>> Rich
>>>>>>
>>>>>>
>>>>>> On 12/11/07 10:54 AM, "Gleb Natapov" <glebn_at_[hidden]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I did a rewrite of the matching code in OB1. I made it much
>>>>>>> simpler and 2 times smaller (which is good: less code, fewer
>>>>>>> bugs). I also got rid of the huge macros, which is very helpful
>>>>>>> if you need to debug something. There is no performance
>>>>>>> degradation; actually I even see a very small performance
>>>>>>> improvement. I ran MTT with this patch and the result is the
>>>>>>> same as on trunk. I would like to commit this to the trunk. The
>>>>>>> patch is attached for everybody to try.
>>>>>>>
>>>>>>> --
>>>>>>> Gleb.
>>>>>>> _______________________________________________
>>>>>>> devel mailing list
>>>>>>> devel_at_[hidden]
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>>> --
>>>> Gleb.
>>
>> --
>> Gleb.
>

-- 
Jeff Squyres
Cisco Systems
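
For readers following the quoted discussion, here is a toy model of the receiver-side reordering Gleb describes for UD: a progress function polls a batch of completions, handles them one by one, and is re-entered partway through the batch. This is only a sketch under stated assumptions, not the actual BTL code. NUM_WC is a small stand-in for MCA_BTL_UD_NUM_WC (500 in the thread), and simulate(), progress(), and reenter_on are hypothetical names invented for this illustration.

```python
from collections import deque

NUM_WC = 4  # small stand-in for MCA_BTL_UD_NUM_WC (500 in the thread)


def simulate(total=8, reenter_on=0):
    """Return the order in which completions are delivered upward.

    total      -- number of completions, queued in arrival order 0..total-1
    reenter_on -- hypothetical: the sequence number whose handler re-enters
                  progress (None disables recursion)
    """
    cq = deque(range(total))  # simulated completion queue
    delivered = []

    def progress(depth):
        # Poll up to NUM_WC completions in one batch, as the UD BTL is
        # described as doing in the thread.
        batch = [cq.popleft() for _ in range(min(NUM_WC, len(cq)))]
        for seq in batch:
            delivered.append(seq)
            # Assumption: an upper layer calls back into progress while
            # the outer batch is only partly processed.
            if depth == 0 and seq == reenter_on:
                progress(depth + 1)

    while cq:
        progress(0)
    return delivered


if __name__ == "__main__":
    print(simulate())                 # [0, 4, 5, 6, 7, 1, 2, 3]
    print(simulate(reenter_on=None))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

With recursion, the nested call drains the next batch (4..7) before the outer call finishes its own (1..3), so the matching code sees fragments out of order even though they arrived in order. Without recursion, batches are delivered in arrival order.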