Open MPI Development Mailing List Archives

From: Gleb Natapov (glebn_at_[hidden])
Date: 2007-05-27 09:49:11

On Fri, May 25, 2007 at 09:31:33PM -0600, Galen Shipman wrote:
> On May 24, 2007, at 2:48 PM, George Bosilca wrote:
> > I see the problem this patch tries to solve, but I fail to correctly
> > understand the implementation. The patch affects all PMLs and BTLs in
> > the code base by adding one more argument to some of the most often
> > called functions. And there is only one BTL (openib) that seems to
> > use it, while all others completely ignore it. Moreover, there seems
> > to be already a very similar mechanism based on the
> > MCA_BTL_DES_FLAGS_PRIORITY flag, which can be set by the PML level
> > into the btl_descriptor.
> >
> > So what's the difference between the additional argument and a
> > correct usage of the MCA_BTL_DES_FLAGS_PRIORITY flag?
> The problem is that MCA_BTL_DES_FLAGS_PRIORITY was meant to indicate
> that the fragment was higher priority, but the fragment isn't higher
> priority. It simply needs to be ordered w.r.t. a previous fragment,
> an RDMA in this case.
But after the change the priority flag is totally ignored.

> This being said, we could have just added an rdma fin flag, but this
> would mix protocol a bit too much between the BTL and the PML in my
> opinion.
> What we have with this fix is that the BTL can assign an order tag to
> any descriptor if it wishes, this order tag is only valid after a
> call to btl_send or btl_put/get. This order tag can then be used to
With the current code this is not the case. The order tag is set during fragment
allocation, which seems wrong according to your description. The attached patch fixes
this. If no specific ordering tag is passed to the allocation function, the order of
the fragment is set to MCA_BTL_NO_ORDER. After a call to send/put/get, the order
is set to whichever QP was used for the communication. If the order is set before the
send call, it is used to choose the QP.
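
To make the intended semantics concrete, here is a minimal sketch of the rules above. It is not the attached patch and not the real openib BTL code: every name with a sketch_ or SKETCH_ prefix is hypothetical, SKETCH_NO_ORDER only stands in for MCA_BTL_NO_ORDER, and the size-based QP choice is an assumed placeholder policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real Open MPI definitions. */
#define SKETCH_NO_ORDER (-1)  /* plays the role of MCA_BTL_NO_ORDER */
#define SKETCH_NUM_QPS  2     /* assumed number of QPs in this sketch */

typedef struct {
    int order;  /* QP index, or SKETCH_NO_ORDER if no ordering is required */
} sketch_descriptor_t;

/* Allocation: the order is whatever the caller asks for. Passing
 * SKETCH_NO_ORDER means "no specific ordering requested". */
static void sketch_alloc(sketch_descriptor_t *des, int order)
{
    des->order = order;
}

/* Assumed default policy: small fragments on QP 0, large ones on QP 1. */
static int sketch_pick_qp(size_t size)
{
    return size <= 128 ? 0 : 1;
}

/* Send (put/get would follow the same rule): an order set before the
 * call selects the QP; otherwise the BTL picks one. Either way the
 * descriptor records the QP that was actually used. */
static int sketch_send(sketch_descriptor_t *des, size_t size)
{
    int qp = (des->order == SKETCH_NO_ORDER) ? sketch_pick_qp(size)
                                             : des->order;
    assert(qp >= 0 && qp < SKETCH_NUM_QPS);
    des->order = qp;
    return qp;
}
```

In this sketch, a caller that needs a FIN ordered after an RDMA would send the RDMA descriptor first, read its order field afterwards, and pass that value when allocating the FIN descriptor, so both go out on the same QP.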