Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] matching code rewrite in OB1
From: Gleb Natapov (glebn_at_[hidden])
Date: 2007-12-14 02:20:12


On Thu, Dec 13, 2007 at 06:16:49PM -0500, Richard Graham wrote:
> The situation that needs to be triggered, just as George has mentioned, is
> where we have a lot of unexpected messages, to make sure that when one that
> we can match against comes in, all the unexpected messages that can be
> matched with pre-posted receives are matched. Since we attempt to match
> only when a new fragment comes in, we need to make sure that we don't leave
> other unexpected messages that can be matched in the unexpected queue, as
> these (if the out of order scenario is just right) would block any new
> matches from occurring.
>
> For example: Say the next expected message is 25
>
> Unexpected message queue has: 26 28 29 ..
>
> If 25 comes in, and is handled, if 26 is not pulled off the unexpected
> message queue, when 27 comes in it won't be able to be matched, as 26 is
> sitting in the unexpected queue, and will never be looked at again ...
This situation is triggered constantly with the openib BTL. The openib BTL has
two ways to receive a packet: over a send queue or over an eager RDMA path.
The receiver polls both of them and may reorder packets locally. Actually,
there is currently a bug in the openib BTL where one channel may starve the
other at the receiver, so if a match fragment with the next sequence number is
in the starved path, tens of thousands of fragments can be reordered. The test
case attached to ticket #1158 triggers this case, and my patch handles all
reordered packets.
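
To make the 25 / 26 28 29 scenario above concrete, here is a toy model of the
behaviour being discussed. It is only a sketch in Python, not the OB1 code or
the patch: the class name, the heap used to hold early fragments, and the
"delivered" list are all made up for illustration, and it assumes sequence
numbers never wrap around and are never duplicated.

```python
import heapq


class SequenceMatcher:
    """Toy per-peer model: fragments must be handled in sequence order.

    Hypothetical illustration only; names and data structures are assumptions.
    """

    def __init__(self, next_expected):
        self.next_expected = next_expected
        self.out_of_order = []  # min-heap of sequence numbers that arrived early
        self.delivered = []     # order in which fragments were actually handled

    def on_fragment(self, seq):
        if seq != self.next_expected:
            # Arrived ahead of its turn: park it until the gap is filled.
            heapq.heappush(self.out_of_order, seq)
            return
        self._deliver(seq)
        # Key step: after handling the expected fragment, keep pulling parked
        # fragments for as long as they continue the sequence. Skipping this
        # loop would leave 26 parked forever in the example below.
        while self.out_of_order and self.out_of_order[0] == self.next_expected:
            self._deliver(heapq.heappop(self.out_of_order))

    def _deliver(self, seq):
        self.delivered.append(seq)
        self.next_expected = seq + 1


m = SequenceMatcher(next_expected=25)
for seq in (26, 28, 29):   # early arrivals, all parked
    m.on_fragment(seq)
m.on_fragment(25)          # handles 25, then drains 26; 28 and 29 stay parked
m.on_fragment(27)          # handles 27, then drains 28 and 29
print(m.delivered)
```

With the drain loop in place, handling 25 also releases 26, so 27 is the next
expected fragment when it arrives and nothing is left stuck behind it.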

And, by the way, the code is much simpler now and can be reviewed easily ;)

--
			Gleb.