Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] why mx_forget in mca_btl_mx_prepare_dst?
From: George Bosilca (bosilca_at_[hidden])
Date: 2009-10-21 13:25:28


Brice,

Because MX doesn't provide a real RMA protocol, we created a fake one
on top of point-to-point. The two peers have to agree on a unique tag,
then the receiver posts a receive for that tag before the sender
starts the send. However, as this is integrated with the real RMA
protocol, where only one side knows about the completion of the RMA
operation, we still exchange the ACK at the end. Therefore, the
receiver doesn't need to know when the receive is completed, as it
will get an ACK from the sender. At least this was the original idea.
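
As a rough illustration of the scheme described above, here is a small
self-contained C model of the three steps (pre-posted receive, tagged
data, trailing ACK). It does not use the MX API at all: endpoint_t,
post_recv, deliver_data and deliver_ack are hypothetical names invented
for this sketch, and the "network" is just function calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_POSTED 8

/* Hypothetical model of one pre-posted receive (not an MX structure). */
typedef struct {
    uint64_t tag;          /* tag both peers agreed on */
    void    *buf;          /* destination buffer on the receiver */
    size_t   len;          /* capacity of buf */
    bool     in_use;
    bool     data_arrived; /* set when the tagged data has landed */
} posted_recv_t;

typedef struct {
    posted_recv_t slots[MAX_POSTED];
} endpoint_t;

static void endpoint_init(endpoint_t *ep)
{
    for (size_t i = 0; i < MAX_POSTED; i++)
        ep->slots[i].in_use = false;
}

/* Step 1: the receiver posts a receive for the agreed tag. */
static bool post_recv(endpoint_t *ep, uint64_t tag, void *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < MAX_POSTED; i++) {
        posted_recv_t *r = &ep->slots[i];
        if (!r->in_use) {
            r->tag = tag;
            r->buf = buf;
            r->len = len;
            r->in_use = true;
            r->data_arrived = false;
            return true;
        }
    }
    return false; /* no free slot */
}

/* Step 2: the sender's data is matched by tag and copied in place. */
static bool deliver_data(endpoint_t *ep, uint64_t tag,
                         const void *data, size_t len)
{
    for (size_t i = 0; i < MAX_POSTED; i++) {
        posted_recv_t *r = &ep->slots[i];
        if (r->in_use && r->tag == tag && !r->data_arrived && len <= r->len) {
            memcpy(r->buf, data, len);
            r->data_arrived = true;
            return true;
        }
    }
    return false; /* nothing posted for this tag */
}

/* Step 3: the ACK arrives. Returns true (and releases the slot) only if
 * the data had already landed; false models the ACK overtaking the data. */
static bool deliver_ack(endpoint_t *ep, uint64_t tag)
{
    for (size_t i = 0; i < MAX_POSTED; i++) {
        posted_recv_t *r = &ep->slots[i];
        if (r->in_use && r->tag == tag) {
            if (!r->data_arrived)
                return false;
            r->in_use = false;
            return true;
        }
    }
    return false;
}
```

In this model, calling deliver_ack before deliver_data for the same tag
returns false, which is the one situation where treating the ACK as
the completion signal would be unsafe.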

But I can see how this might fail if the short ACK from the sender
manages to pass the RMA operation on the wire. I was under the
impression (based on the fact that MX respects the ordering) that
mx_send will trigger the completion only when all data is on the
wire/NIC memory, so I supposed there is _absolutely_ no way for the
ACK to bypass the last RMA fragments and reach the receiver before
the recv is really completed. If my supposition is not correct, then
we should remove the mx_forget and make sure that, before we mark a
fragment as completed, we got both completions (the one from mx_recv
and the remote one).
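
A minimal sketch of what "both completions" could look like, assuming
a per-fragment bitmask. The names here (frag_t, frag_complete and the
FRAG_* flags) are hypothetical and are not taken from the MX BTL
source; the point is only that the fragment is released by whichever
event arrives second, in either order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical completion events for one fragment. */
enum {
    FRAG_LOCAL_DONE  = 1 << 0, /* local receive request completed */
    FRAG_REMOTE_DONE = 1 << 1, /* ACK from the sender arrived */
    FRAG_ALL_DONE    = FRAG_LOCAL_DONE | FRAG_REMOTE_DONE
};

typedef struct {
    unsigned events;   /* bitmask of completions seen so far */
    bool     released; /* true once the buffer may be reused or freed */
} frag_t;

static void frag_init(frag_t *f)
{
    f->events = 0;
    f->released = false;
}

/* Record one completion. Returns true only for the call that releases
 * the fragment, i.e. the one that supplies the second of the two events. */
static bool frag_complete(frag_t *f, unsigned event)
{
    assert(event == FRAG_LOCAL_DONE || event == FRAG_REMOTE_DONE);
    f->events |= event;
    if (f->events == FRAG_ALL_DONE && !f->released) {
        f->released = true;
        return true;
    }
    return false;
}
```

With this shape, an ACK that overtakes the data only sets
FRAG_REMOTE_DONE; the buffer stays alive until the local receive
completion is also recorded.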

   george.

On Oct 21, 2009, at 04:33, Brice Goglin wrote:

> Hello,
>
> I am debugging a crash with the OMPI 1.3.3 MX BTL over Open-MX. It's
> crashing while trying to store incoming data in the OMPI receive
> buffer, but OMPI seems to have already freed the buffer even though
> the MX request is not complete yet. It looks like this is caused by
> mca_btl_mx_prepare_dst() posting the receive and then calling
> mx_forget() immediately. OMPI r17452 by George introduced this. The
> commit log says "Improve the performance of the MX BTL. Correct the
> fake PUT protocol." I don't understand how this works.
>
> mx_forget() is supposed to be used when you don't care anymore about a
> message or a request, not really for performance purposes. It should
> not help much in "normal" cases since you usually need to know when
> the receive request is completed before you can actually use the
> received data. And completion order is not guaranteed anyway, so it's
> hard to guess when a request will complete if mx_forget() disabled the
> actual completion notification.
>
> Are you calling mx_forget() because you have another way to know when
> the message will be received? If so, how?
>
> When does OMPI free the fragment that is passed to mx_irecv in
> mca_btl_mx_prepare_dst?
>
> thanks,
> Brice