Brian Barrett wrote:
> Personally, I'd rather just not mark MPI completion until a local
> completion callback from the BTL. But others don't like that idea, so
> we came up with a way for back pressure from the BTL to say "it's not
> on the wire yet". This is more complicated than just not marking MPI
> completion early, but why would we do something that helps real apps
> at the expense of benchmarks? That would just be silly!
FWIW, this issue is also very relevant for the UD BTL, especially with
some new work I've done in the last week (I'm currently having problems
with send-side completion semantics). I missed it: what was the
reasoning for not marking MPI completion until a callback from the BTL?