Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: eliminating "descriptor" argument from sendi function
From: George Bosilca (bosilca_at_[hidden])
Date: 2009-02-24 10:43:27

Here is another way to write the code without paying for the
expensive initialization of sendreq when a sendi succeeds:
   first_time = 0;
   for ( btl = ... ) {
       if ( SUCCESS == sendi() ) return SUCCESS;
       /* set up the send request once, and only after a sendi has failed */
       if ( 0 == first_time++ ) set_up_expensive_send_request(&sendreq);
       if ( SUCCESS == send(&sendreq) ) return SUCCESS;
   }
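
For readers who want to experiment with the idea, the lazy-initialization loop can be sketched as a small self-contained C program fragment. This is only an illustration under stated assumptions: mock_btl_t, mock_sendreq_t, mock_sendi, mock_send, try_send and the setup_calls counter are hypothetical stand-ins, not Open MPI types or functions, and the real PML code path is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

#define SUCCESS 0
#define FAILURE 1

/* Hypothetical stand-ins: none of these are real Open MPI types. */
typedef struct { bool has_sendi; bool sendi_ok; bool send_ok; } mock_btl_t;
typedef struct { bool initialized; } mock_sendreq_t;

/* Counts how often the "expensive" setup runs, so it can be observed. */
static int setup_calls = 0;

static void set_up_expensive_send_request(mock_sendreq_t *req) {
    req->initialized = true;
    setup_calls++;
}

/* sendi needs no send request; it fails if the BTL has no sendi at all. */
static int mock_sendi(const mock_btl_t *btl) {
    return (btl->has_sendi && btl->sendi_ok) ? SUCCESS : FAILURE;
}

/* send must never be called before the request has been set up. */
static int mock_send(const mock_btl_t *btl, const mock_sendreq_t *req) {
    assert(req->initialized);
    return btl->send_ok ? SUCCESS : FAILURE;
}

/* Returns the index of the BTL that delivered the message, or -1. */
static int try_send(const mock_btl_t *btls, int n) {
    mock_sendreq_t sendreq = { false };
    for (int i = 0; i < n; i++) {
        if (SUCCESS == mock_sendi(&btls[i])) return i;
        /* Pay the setup cost once, and only after a sendi has failed. */
        if (!sendreq.initialized) set_up_expensive_send_request(&sendreq);
        if (SUCCESS == mock_send(&btls[i], &sendreq)) return i;
    }
    return -1;
}
```

With a single BTL whose sendi succeeds, try_send returns without ever running the setup. With a first BTL that fails both ways and a second BTL that has only send, the setup runs exactly once and the second BTL delivers the message.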

Anyway, the main problem is not in this code. The main problem is
that, instead of sharing the load over all available BTLs in a
round-robin fashion, you now overload the BTL(s) providing the sendi
function with small (and eager) messages, and completely ignore
all the others until something goes wrong.
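
A toy model makes the imbalance concrete. The sketch below is deliberately simplified and rests on assumptions that are not in the real PML: toy_btl_t and both functions are hypothetical, every send is assumed to succeed, and round-robin is modeled as simply rotating the starting BTL for each message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy BTL: whether it offers sendi, and how many messages it got. */
typedef struct { bool has_sendi; int sent; } toy_btl_t;

/* Round-robin: each message goes to the next BTL in turn. */
static void send_round_robin(toy_btl_t *btls, int n, int messages) {
    assert(n > 0);
    int next = 0;
    for (int m = 0; m < messages; m++) {
        btls[next].sent++;
        next = (next + 1) % n;
    }
}

/* sendi-first: each message goes to the first BTL offering sendi
 * (assumed here to always succeed); BTL 0 is the fallback if none does. */
static void send_sendi_first(toy_btl_t *btls, int n, int messages) {
    assert(n > 0);
    for (int m = 0; m < messages; m++) {
        int target = 0;
        for (int i = 0; i < n; i++) {
            if (btls[i].has_sendi) { target = i; break; }
        }
        btls[target].sent++;
    }
}
```

With three BTLs of which only the first offers sendi, nine small messages are spread three per BTL under round-robin, while the sendi-first scan sends all nine through the first BTL and leaves the other two idle.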

However, I can see one interesting point in your approach. As the BTLs
are indexed in increasing order of their published latency in the
eager array, we might benefit from the smallest latency for several
small messages before taking the most expensive path. But this is not
something we should tackle lightly, as it modifies the most
performance-critical parts of the PML.


On Feb 23, 2009, at 18:07 , Eugene Loh wrote:

> Eugene Loh wrote:
>> Actually, there may be a more important issue here.
>> Currently, the PML chooses the BTL first. Once the BTL choice is
>> established, only then does the PML choose between sendi and send.
>> Currently, it's also the case that we're spending a lot of time in
>> the PML doing a bunch of stuff that's totally unnecessary if the
>> sendi succeeds. So, we're neutralizing much of the advantage sendi
>> is supposed to provide.
>> So, I'm changing the PML to invoke sendi much sooner. The way I'm
>> doing this is to loop over BTLs, looking for a sendi that exists
>> and succeeds. If I find one, I'm done. If I don't, I have to go
>> with the standard send code path.
>> The logic, as I just described it, allows that multiple sendi
>> functions could fail and that the send that is ultimately used
>> might be for a different BTL than for any of the failing sendi's.
>> This would suggest that I do NOT want failing sendi's leaving any
>> side effects (like allocated descriptors).
>> Is my proposed logic bad? Should I implement things another way?
>> E.g., if I find a sendi function, use that BTL even if the sendi
>> failed and another BTL might have a sendi that could succeed? Or,
>> does my proposed change provide the justification for my pulling
>> descriptor allocations out of the sendi functions?
> Here's another way of looking at it.
> The current PML send code does this:
>    set_up_expensive_send_request(&sendreq);
>    for ( btl = ... ) {
>        if ( SUCCESS == sendi() ) return SUCCESS;
>        if ( SUCCESS == send(&sendreq) ) return SUCCESS;
>    }
> That is, we try one BTL after another. For each one, we try sendi
> first. So, each sendi() that fails is immediately followed by a
> send() of the same BTL. It's okay for a sendi() to do prep work for
> the send() of the same BTL. This scheme does a bunch of expensive
> send-request initialization that is unnecessary if the sendi(),
> which doesn't need the send request, succeeds.
> My proposed PML send logic is this:
>    for ( btl = ... ) {
>        if ( SUCCESS == sendi() ) return SUCCESS;
>    }
>    set_up_expensive_send_request(&sendreq);
>    for ( btl = ... ) {
>        if ( SUCCESS == send(&sendreq) ) return SUCCESS;
>    }
> That is, if I can find a sendi() function, I use it. Only if I
> can't find any sendi() do I set up the send request and call send()
> functions.
> This is why I would like sendi() functions to have no side
> effects... e.g., no allocated descriptors.