Eugene Loh wrote:
> Actually, there may be a more important issue here.
>
> Currently, the PML chooses the BTL first. Once the BTL choice is
> established, only then does the PML choose between sendi and send.
>
> Currently, it's also the case that we're spending a lot of time in the
> PML doing a bunch of stuff that's totally unnecessary if the sendi
> succeeds. So, we're neutralizing much of the advantage sendi is
> supposed to provide.
>
> So, I'm changing the PML to invoke sendi much sooner. The way I'm
> doing this is to loop over BTLs, looking for a sendi that exists and
> succeeds. If I find one, I'm done. If I don't, I have to go with the
> standard send code path.
>
> The logic, as I just described it, allows that multiple sendi
> functions could fail and that the send that is ultimately used might
> be for a different BTL than for any of the failing sendi's. This
> would suggest that I do NOT want failing sendi's leaving any side
> effects (like allocated descriptors).
>
> Is my proposed logic bad? Should I implement things another way?
> E.g., if I find a sendi function, use that BTL even if the sendi
> failed and another BTL might have a sendi that could succeed? Or,
> does my proposed change provide the justification for my pulling
> descriptor allocations out of the sendi functions?
Here's another way of looking at it.

The current PML send code does this:

    set_up_expensive_send_request(&sendreq);
    for ( btl = ... ) {
        if ( SUCCESS == sendi() ) return SUCCESS;
        if ( SUCCESS == send(&sendreq) ) return SUCCESS;
    }
That is, we try one BTL after another. For each one, we try sendi
first. So, each sendi() that fails is immediately followed by a send()
on the same BTL, and it's okay for a sendi() to do prep work for the
send() on that BTL. The drawback is that this scheme performs expensive
send-request initialization up front, which is wasted whenever a
sendi() succeeds, since sendi() doesn't need the send request.
My proposed PML send logic is this:
    for ( btl = ... ) {
        if ( SUCCESS == sendi() ) return SUCCESS;
    }
    set_up_expensive_send_request(&sendreq);
    for ( btl = ... ) {
        if ( SUCCESS == send(&sendreq) ) return SUCCESS;
    }
That is, if I can find a sendi() that succeeds, I use it. Only if no
sendi() succeeds do I set up the send request and call the send()
functions. Because the send() that finally goes through may belong to a
different BTL than a sendi() that failed, I would like sendi() functions
to have no side effects, e.g., no allocated descriptors.
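
To make the proposal concrete, here is a minimal, self-contained C sketch of the two-phase logic. Everything in it is a hypothetical stand-in for illustration: btl_t, send_request_t, the setup_count counter, and the toy sendi/send callbacks are not the real Open MPI structures or the real BTL interface. It only shows the control flow and why a failing sendi() should leave nothing behind.

```c
#include <assert.h>
#include <stddef.h>

enum { SUCCESS = 0, FAILURE = -1 };

/* Hypothetical send request; the real one is far more expensive to set up. */
typedef struct send_request {
    const void *buf;
    size_t      len;
} send_request_t;

/* Hypothetical BTL stand-in, not the real Open MPI module struct. */
typedef struct btl {
    int (*sendi)(struct btl *self, const void *buf, size_t len); /* may be NULL */
    int (*send)(struct btl *self, const send_request_t *req);
    int descriptors_allocated; /* lets a caller check for leftover side effects */
} btl_t;

/* Counts how often the expensive setup runs (illustration only). */
static int setup_count = 0;

void set_up_expensive_send_request(send_request_t *req,
                                   const void *buf, size_t len)
{
    req->buf = buf;
    req->len = len;
    setup_count++;
}

/* Toy callbacks. The failing sendi touches nothing in the BTL. */
int sendi_always_fails(btl_t *self, const void *buf, size_t len)
{
    (void)self; (void)buf; (void)len;
    return FAILURE;
}

int sendi_always_succeeds(btl_t *self, const void *buf, size_t len)
{
    (void)self; (void)buf; (void)len;
    return SUCCESS;
}

int send_always_succeeds(btl_t *self, const send_request_t *req)
{
    (void)self; (void)req;
    return SUCCESS;
}

int pml_send(btl_t *btls, size_t nbtls, const void *buf, size_t len)
{
    assert(btls != NULL || nbtls == 0);

    /* Phase 1: try every available sendi before building a send request. */
    for (size_t i = 0; i < nbtls; i++) {
        if (btls[i].sendi != NULL &&
            SUCCESS == btls[i].sendi(&btls[i], buf, len)) {
            return SUCCESS;
        }
    }

    /* Phase 2: no sendi succeeded, so pay for the send request now. */
    send_request_t sendreq;
    set_up_expensive_send_request(&sendreq, buf, len);
    for (size_t i = 0; i < nbtls; i++) {
        if (SUCCESS == btls[i].send(&btls[i], &sendreq)) {
            return SUCCESS;
        }
    }
    return FAILURE;
}
```

In this sketch, if the first BTL's sendi fails and the second BTL's sendi succeeds, pml_send() returns without ever calling set_up_expensive_send_request(). The first BTL is never revisited, so anything its sendi had allocated on failure would be left dangling.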