Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] matching code rewrite in OB1
From: Andrew Friedley (afriedle_at_[hidden])
Date: 2007-12-11 11:36:42

Possibly, though I have results from a benchmark I've written indicating
the reordering happens at the sender. I believe I found it was due to
the QP striping trick I use to get more bandwidth -- if you back down to
one QP (there's a define in the code you can change), the reordering
rate drops.
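
To make the striping effect concrete, here is a toy model in plain C. It is
only a sketch and not the BTL code: simulate_striping(), count_reordered()
and the "burst" parameter are hypothetical names made up for this
illustration. It assumes fragment i is posted to QP (i % num_qps), that each
QP keeps its own fragments in order, and that the QPs are drained a burst at
a time, so fragments from different QPs can interleave out of sequence.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model (hypothetical, not Open MPI code): fragment i is posted to
 * QP (i % num_qps). Each QP delivers its own fragments in order, but the
 * QPs are drained independently, up to `burst` fragments per QP per round.
 * Writes the resulting arrival order into arrival[] and returns the count.
 */
size_t simulate_striping(size_t num_frags, size_t num_qps, size_t burst,
                         unsigned *arrival)
{
    assert(num_qps > 0 && burst > 0);
    size_t delivered = 0;
    for (size_t round = 0; delivered < num_frags; round++) {
        for (size_t qp = 0; qp < num_qps; qp++) {
            for (size_t k = 0; k < burst; k++) {
                /* k-th fragment of this round on this QP */
                size_t seq = qp + (round * burst + k) * num_qps;
                if (seq < num_frags) {
                    arrival[delivered++] = (unsigned)seq;
                }
            }
        }
    }
    return delivered;
}

/* Count arrivals whose sequence number is below the highest seen so far. */
size_t count_reordered(const unsigned *arrival, size_t n)
{
    size_t reordered = 0;
    unsigned highest = 0;
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && arrival[i] < highest) {
            reordered++;
        } else {
            highest = arrival[i];
        }
    }
    return reordered;
}
```

In this model a single QP never reorders anything, while 16 fragments over
4 QPs with a burst of 2 arrive as 0,4,1,5,2,6,3,7,... The real rate depends
on the hardware and is not something this sketch predicts.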

Also I do not make any recursive calls to progress -- at least not
directly in the BTL; I can't speak for the upper layers. The reason I
do many completions at once is that it is a big help in turning around
receive buffers, making it harder to run out of buffers and drop frags.
I want to say there was some performance benefit as well, but I can't
say for sure.
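
For comparison, the receiver-side scenario Gleb describes in the quoted mail
below would only arise if something re-enters progress while a batch is
still being processed. A toy model of that case, again just a sketch: BATCH
stands in for MCA_BTL_UD_NUM_WC (500 in the real code, 4 here), and
sim_progress(), run_sim() and the recurse_at trigger are hypothetical names
for this illustration, not anything in the BTL or the upper layers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BATCH 4                 /* stand-in for MCA_BTL_UD_NUM_WC */
#define NO_RECURSION UINT_MAX   /* trigger value that never matches */

struct sim {
    unsigned next_cqe;    /* next sequence number waiting in the CQ */
    unsigned total;       /* total completions that will ever arrive */
    unsigned recurse_at;  /* handling this seq re-enters progress (assumed) */
    unsigned *delivered;  /* order in which frags reach the upper layer */
    size_t n_delivered;
};

/* Poll up to BATCH completions, then hand them up one by one. */
void sim_progress(struct sim *s)
{
    unsigned wc[BATCH];
    size_t n = 0;
    while (n < BATCH && s->next_cqe < s->total) {
        wc[n++] = s->next_cqe++;
    }
    for (size_t i = 0; i < n; i++) {
        s->delivered[s->n_delivered++] = wc[i];
        if (wc[i] == s->recurse_at) {
            /* Nested call: the next batch is delivered before wc[i+1..]. */
            sim_progress(s);
        }
    }
}

/* Run until all completions are delivered; returns how many there were. */
size_t run_sim(unsigned total, unsigned recurse_at, unsigned *delivered)
{
    assert(delivered != NULL || total == 0);
    struct sim s = { 0, total, recurse_at, delivered, 0 };
    while (s.n_delivered < total) {
        sim_progress(&s);
    }
    return s.n_delivered;
}
```

With 8 completions and no nested call, the order is 0..7. If handling
frag 0 re-enters progress, the order becomes 0,4,5,6,7,1,2,3: the second
batch overtakes the rest of the first. Polling in batches alone does not
reorder anything in this model.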


Gleb Natapov wrote:
> On Tue, Dec 11, 2007 at 08:03:52AM -0800, Andrew Friedley wrote:
>> Try UD, frags are reordered at a very high rate so should be a good test.
> Good idea, I'll try this. BTW I think the reason for such a high rate of
> reordering in UD is that it polls for MCA_BTL_UD_NUM_WC completions
> (500) and processes them one by one, and if the progress function is
> called recursively, the next 500 completions will be reordered relative
> to the previous ones (reordering happens on the receiver, not the sender).
>> Andrew
>> Richard Graham wrote:
>>> Gleb,
>>> I would suggest that before this is checked in this be tested on a system
>>> that has N-way network parallelism, where N is as large as you can find.
>>> This is a key bit of code for MPI correctness, and out-of-order operations
>>> will break it, so you want to maximize the chance for such operations.
>>> Rich
>>> On 12/11/07 10:54 AM, "Gleb Natapov" <glebn_at_[hidden]> wrote:
>>>> Hi,
>>>> I did a rewrite of the matching code in OB1. I made it much simpler and 2
>>>> times smaller (which is good: less code, fewer bugs). I also got rid
>>>> of the huge macros, which is very helpful if you need to debug something.
>>>> There is no performance degradation; in fact I even see a very small
>>>> performance improvement. I ran MTT with this patch and the result is the same as on
>>>> trunk. I would like to commit this to the trunk. The patch is attached
>>>> for everybody to try.
>>>> --
>>>> Gleb.
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
> --
> Gleb.