Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

From: Galen Shipman (gshipman_at_[hidden])
Date: 2007-06-07 14:10:58

I just had a discussion with Rich regarding the BTL semantics.
I think what might be helpful here is for us to have telecon to
discuss this further.

I only have one goal out of this, and that is to firmly define the
ordering semantics of the BTL, or alternatively local/remote
completion semantics of the BTL, whatever they may be.

I have created a wiki here to help describe the issue as I currently
see it, please feel free to add to this with suggestions/etc..

- Galen

On Jun 7, 2007, at 9:55 AM, Galen Shipman wrote:

> On Jun 7, 2007, at 9:11 AM, George Bosilca wrote:
>> There is something weird with this change, and the patch reflect
>> it. The new argument "order" come from the PML level and might be
>> MCA_BTL_NO_ORDER (which is kind of global) or BTL_OPENIB_LP_QP or
>> BTL_OPENIB_HP_QP (which are definitively Open IB related). Do you
>> really intend to let the PML knows about Open IB internal constants ?
> No, the PML knows only one thing about the order tag, it is either
> MCA_BTL_NO_ORDER or it is something that the BTL assigns.
> The PML has no idea about BTL_OPENIB_LP_QP or BTL_OPENIB_HP_QP, to
> the PML it is just an order tag assigned to a fragment by the BTL.
> So the semantics are that after a btl_send/put/get an order tag may
> be assigned by the BTL to the descriptor,
> This order tag can then be specified to subsequent calls to btl_alloc
> or btl_prepare. The PML has no idea what the value means, other than
> he is requesting a descriptor that will be ordered w.r.t. a
> previously transmitted descriptor.
>> If it's the case (which seems to be true from the following snippet
>> if(MCA_BTL_NO_ORDER == order) {
>> frag->base.order = BTL_OPENIB_LP_QP;
>> } else {
>> frag->base.order = order;
>> }
> So I am choosing some ordering to use here because the PML told me he
> doesn't care, what is wrong with this?
>> ) I expect you to revise the patch in order to propose a generic
>> solution or I'll trigger a vote against the patch.
> This exports no knowledge of the Open IB BTL to the PML layer, the
> PML doesn't know that this is a QP index, he doesn't care! The PML
> simply uses this value (if it wants to) to request ordering with
> subsequent fragments. We use the QP index only as a BTL optimization,
> it could have been anything. So the only new knowledge that the PML
> has is how to request that ordering of fragments be enforced, and the
> BTL doesn't even have to provide this if it doesn't want, that is the
> reason for MCA_BTL_NO_ORDER.
> Please describe a use case where this is not a generic solution. Keep
> in mind that MX, TCP, GM all can provide ordering guarantees if they
> wish, in fact for MX you can simply always assign an order tag, say
> the value is 1. MX can then guarantee ordering of all fragments sent
> over the same BTL.
>> I vote to be backed out of the trunk as it export way to much
>> knowledge from the Open IB BTL into the PML layer.
> The only other option that I have identified that doesn't push PML
> level protocol into the BTL is to require that BTLs always guarantee
> ordering of fragments sent/put/get over the same BTL.
>> george.
>> PS: With Gleb changes the problem is the same. The following
>> snippet reflect exactly the same behavior as the original patch.
> Gleb's changes don't change the semantic guarantees that I have
> described above.
>> frag->base.order = order;
>> assert(frag->base.order != BTL_OPENIB_HP_QP);
>> On Jun 7, 2007, at 9:49 AM, Gleb Natapov wrote:
>>> Hi Galen,
>>> On Sun, May 27, 2007 at 10:19:09AM -0600, Galen Shipman wrote:
>>>>> With current code this is not the case. Order tag is set during a
>>>>> fragment
>>>>> allocation. It seems wrong according to your description. Attached
>>>>> patch fixes
>>>>> this. If no specific ordering tag is provided to allocation
>>>>> function order of
>>>>> the fragment is set to be MCA_BTL_NO_ORDER. After call to send/
>>>>> put/
>>>>> get order
>>>>> is set to whatever QP was used for communication. If order is set
>>>>> before send call
>>>>> it is used to choose QP.
>>>> I do set the order tag during allocation/prepare, but the defined
>>>> semantics are that the tag is only valid after send/put/get. We can
>>>> set them up any where we wish in the BTL, the PML however cannot
>>>> rely
>>>> on anything until after the send/put/get call. So really this is an
>>>> issue of semantics versus implementation. The implementation I
>>>> believe does conform to the semantics as the upper layer (PML)
>>>> doesn't use the tag value until after a call to send/put/get.
>>>> I will look over the patch however, might make more sense to delay
>>>> setting the value until the actual send/put/get call.
>>> Have you had a chance to look over the patch?
>>> --
>>> Gleb.
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
> _______________________________________________
> devel mailing list
> devel_at_[hidden]