Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: eliminating "descriptor" argument from sendi function
From: Eugene Loh (Eugene.Loh_at_[hidden])
Date: 2009-02-24 12:52:48


George Bosilca wrote:

> Here is another way to write the code without having to pay for the
> expensive initialization of sendreq.
> first_time = 0;
> for ( btl = ... ) {
>     if ( SUCCESS == sendi() ) return SUCCESS;
>     if ( 0 == first_time++ ) set_up_expensive_send_request(&sendreq);
>     if ( SUCCESS == send(&sendreq) ) return SUCCESS;
> }

Sure. Well, things are complicated by the fact that
"set_up_expensive_send_request()" is not a factored-out function. So,
restructuring code to look like this is a hassle. But, let's first
figure out what we *want* to do and then tackle what is merely a simple
matter of implementation! :^)
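
To make the quoted loop concrete, here is a minimal C sketch of it, assuming the expensive setup were factored out into its own function. Every name below (btl_t, send_request_t, try_send, the stub_* functions, the return codes) is a hypothetical stand-in for illustration, not the actual Open MPI PML or BTL interface. The sketch uses an explicit "initialized" flag instead of the first_time counter, and it also assumes a BTL may lack a sendi function entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes, not Open MPI's. */
enum { SEND_SUCCESS = 0, SEND_FAILURE = -1 };

/* Hypothetical send request; the real one is far more involved. */
typedef struct {
    bool initialized;  /* has the expensive setup been done yet? */
    int  setup_count;  /* counts setups, for illustration only */
} send_request_t;

/* Hypothetical BTL handle with an optional fast path. */
typedef struct btl {
    /* "Send immediate" path that needs no request; NULL if absent. */
    int (*sendi)(struct btl *self, const void *buf, size_t len);
    /* Regular path, which needs a fully set-up request. */
    int (*send)(struct btl *self, send_request_t *req);
} btl_t;

/* Stand-in for the setup that is not factored out today. */
void set_up_expensive_send_request(send_request_t *req)
{
    req->initialized = true;
    req->setup_count++;
}

/* Try each BTL in order; pay for the request setup only if some
 * sendi attempt did not succeed, and then only once. */
int try_send(btl_t **btls, size_t nbtls,
             const void *buf, size_t len, send_request_t *req)
{
    for (size_t i = 0; i < nbtls; i++) {
        btl_t *btl = btls[i];
        if (btl->sendi != NULL &&
            btl->sendi(btl, buf, len) == SEND_SUCCESS)
            return SEND_SUCCESS;
        if (!req->initialized)
            set_up_expensive_send_request(req);
        if (btl->send(btl, req) == SEND_SUCCESS)
            return SEND_SUCCESS;
    }
    return SEND_FAILURE;
}

/* Trivial stubs so the sketch can be exercised. */
int stub_sendi_ok(btl_t *self, const void *buf, size_t len)
{ (void)self; (void)buf; (void)len; return SEND_SUCCESS; }

int stub_sendi_fail(btl_t *self, const void *buf, size_t len)
{ (void)self; (void)buf; (void)len; return SEND_FAILURE; }

int stub_send_ok(btl_t *self, send_request_t *req)
{ (void)self; (void)req; return SEND_SUCCESS; }

int stub_send_fail(btl_t *self, send_request_t *req)
{ (void)self; (void)req; return SEND_FAILURE; }
```

With a BTL whose sendi succeeds, try_send returns without ever touching the request. The setup cost is paid only on the fallback path, and at most once per call chain as long as the same request is passed in.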

> Anyway, the main problem is not in this code. The main problem is in
> the fact that now instead of sharing the load over all available BTLs
> in a round-robin fashion, you overload the BTL(s) providing the sendi
> function with small (and eager) messages, and you completely ignore
> all the others until something goes wrong.
>
> However, I can see one interesting point in your approach. As the
> BTLs are indexed in increasing order of their published latency in
> the eager array, we might benefit from the smallest latency for
> several small messages before taking the most expensive path. But
> this is not something we should tackle lightly, as it modifies the
> most performance-related parts of the PML.

I would like to understand this better. Let's say you can reach your
destination via two BTLs: sm and TCP. I don't know what the numbers
are, but let's say TCP latency is >10x higher than sm latency. Are you
saying we want to round-robin between the two BTLs? And to do otherwise
would modify a lot of the PML? Like what?

I can imagine cases where one might have comparable BTLs and want to
round-robin among them. But, if one BTL is much faster than another, I would
want to use the faster one. Period. Especially if it had a sendi function.
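
One way to express that preference is sketched below in C: round-robin only among BTLs whose latency is within some factor of the fastest one, and otherwise always use the fastest. This is only an illustration of the policy argued for here, not how the PML actually schedules BTLs. The btl_info_t type, the function names, and the "factor" threshold are all assumptions, and the array is assumed to be sorted by increasing latency, as the eager array is described above.

```c
#include <stddef.h>

/* Hypothetical per-BTL record; not an Open MPI structure. */
typedef struct {
    const char *name;
    double latency_us;  /* published latency, in microseconds */
} btl_info_t;

/* Count the leading BTLs whose latency is within `factor` of the
 * fastest one. Assumes btls[] is sorted by increasing latency. */
size_t comparable_count(const btl_info_t *btls, size_t n, double factor)
{
    size_t count = 0;
    while (count < n &&
           btls[count].latency_us <= factor * btls[0].latency_us)
        count++;
    return count;
}

/* Pick the BTL index for the msg_index-th small message:
 * round-robin among the comparable BTLs, ignore the rest.
 * Assumes n >= 1 and factor >= 1.0. */
size_t pick_btl(const btl_info_t *btls, size_t n,
                double factor, size_t msg_index)
{
    size_t k = comparable_count(btls, n, factor);
    return k == 0 ? 0 : msg_index % k;
}
```

For example, with made-up latencies of 1 us for sm and 50 us for TCP and a factor of 2.0, pick_btl returns index 0 (sm) for every message. With two BTLs at 1 us and 1.5 us, it alternates between them.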