
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] calling sendi earlier in the PML
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-03-03 15:48:36


On Mar 3, 2009, at 3:31 PM, Eugene Loh wrote:

> First, this behavior is basically what I was proposing and what
> George didn't feel comfortable with. It is arguably no compromise
> at all. (Uggh, why must I be so honest?) For eager messages, it
> favors BTLs with sendi functions, which could lead to those BTLs
> becoming overloaded. I think favoring BTLs with sendi for short
> messages is good. George thinks that load balancing BTLs is good.
>
> Second, the implementation can be simpler than you suggest:
>
> *) You don't need a separate list since testing for a sendi-enabled
> BTL is relatively cheap (I think... could verify).
> *) You don't need to shuffle the list. The mechanism used by ob1
> just resumes the BTL search from the last BTL used. E.g., check
> https://svn.open-mpi.org/source/xref/ompi_1.3/ompi/mca/pml/ob1/pml_ob1_sendreq.h#mca_pml_ob1_send_request_start
> You use mca_bml_base_btl_array_get_next(&btl_eager) to round-robin
> over BTLs in a totally fair manner (remembering where the last loop
> left off), and mca_bml_base_btl_array_get_size(&btl_eager) to make
> sure you don't loop endlessly.
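
The round-robin search described in the quoted list can be sketched roughly as follows. This is a minimal, self-contained illustration, not the ob1 code: the `sketch_*` types and functions are hypothetical stand-ins for the BML array and its `get_next` / `get_size` accessors, and the `has_sendi` flag stands in for checking whether a BTL provides a sendi function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a BTL and the eager BTL array.
 * These are NOT the real ob1/BML types. */
typedef struct {
    const char *name;
    bool has_sendi;   /* assumed flag: does this BTL offer sendi? */
} sketch_btl_t;

typedef struct {
    sketch_btl_t *btls;
    size_t size;
    size_t next;      /* where the previous search left off */
} sketch_btl_array_t;

/* Return the next BTL in round-robin order and advance the cursor. */
static sketch_btl_t *sketch_array_get_next(sketch_btl_array_t *a)
{
    assert(a->size > 0);
    sketch_btl_t *btl = &a->btls[a->next];
    a->next = (a->next + 1) % a->size;
    return btl;
}

/* Visit each BTL at most once, resuming from the saved cursor.
 * Returns the first sendi-capable BTL found, or NULL if no BTL
 * has a sendi function. Bounding the loop by the array size is
 * what prevents looping endlessly. */
static sketch_btl_t *sketch_find_sendi_btl(sketch_btl_array_t *a)
{
    for (size_t i = 0; i < a->size; i++) {
        sketch_btl_t *btl = sketch_array_get_next(a);
        if (btl->has_sendi) {
            return btl;
        }
    }
    return NULL;
}
```

Because the cursor persists across calls, successive searches start where the previous one stopped, so no separate sendi list and no shuffling is needed in this sketch.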

Cool / fair enough.

How about an MCA parameter to switch between this mechanism (early
sendi) and the original behavior (late sendi)?

This is the usual way that we resolve "I want to do X / I want to do
Y" disputes. :-)
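
As a rough illustration of what such a switch might select between, here is a minimal sketch under stated assumptions. All names (`sketch_sendi_mode_t`, `sketch_parse_sendi_mode`, `sketch_choose_btl`) are hypothetical, the parameter value is passed in as a plain string rather than registered through the MCA parameter system, and the BTL is reduced to a single `has_sendi` flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical modes for the proposed switch. */
typedef enum {
    SKETCH_SENDI_LATE = 0,   /* original behavior */
    SKETCH_SENDI_EARLY = 1   /* look for a sendi-capable BTL first */
} sketch_sendi_mode_t;

/* Hypothetical stand-in for a BTL. */
typedef struct {
    bool has_sendi;
} sketch_btl_t;

/* Interpret a parameter value given as a string; NULL or "0" means
 * late sendi. How the value reaches this function is not shown. */
static sketch_sendi_mode_t sketch_parse_sendi_mode(const char *value)
{
    if (value != NULL && strtol(value, NULL, 10) != 0) {
        return SKETCH_SENDI_EARLY;
    }
    return SKETCH_SENDI_LATE;
}

/* Pick a BTL index for an eager message and report whether sendi
 * would be used. *cursor remembers where the last search stopped. */
static size_t sketch_choose_btl(const sketch_btl_t *btls, size_t n,
                                size_t *cursor,
                                sketch_sendi_mode_t mode,
                                bool *use_sendi)
{
    assert(n > 0);
    if (mode == SKETCH_SENDI_EARLY) {
        /* Early: scan every BTL once, preferring one with sendi. */
        for (size_t i = 0; i < n; i++) {
            size_t idx = (*cursor + i) % n;
            if (btls[idx].has_sendi) {
                *cursor = (idx + 1) % n;
                *use_sendi = true;
                return idx;
            }
        }
    }
    /* Late (or no sendi found): plain round-robin; sendi is used
     * only if the BTL whose turn it is happens to have one. */
    size_t idx = *cursor;
    *cursor = (idx + 1) % n;
    *use_sendi = btls[idx].has_sendi;
    return idx;
}
```

With two BTLs of which only the second has sendi, the early mode in this sketch returns the second BTL on every call, which is the overloading concern raised in the quoted text; the late mode alternates between the two.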

> I've been toying with two implementations. The one I described in
> San Jose was called FAST, so let's still call it that. It tests for
> sendi early in the PML, calling traditional send only if no sendi is
> found for any BTL. To preserve the BTL ordering George favors
> (always round-robining over BTLs, looking only secondarily for
> sendi), I tried another implementation I'll call FAIR. It attempts
> to initialize the send request only very minimally. One still makes
> a number of function calls and goes "deep" into the PML, but defers
> send-request initialization until as late as possible. I can't
> promise that both implementations FAST and FAIR are equally rock
> solid or optimized, but this is where I am so far. The differences
> are:
>
> *) FAST involves far fewer code changes.
> *) FAST produces faster latencies. E.g., for 0-byte OSU latencies,
> FAST is 8-10% better than OMPI while FAIR is only 1-3% (or 2-3%...
> something like that). (The improvements I showed in San Jose for
> FAST were more dramatic than 8-10%, but that's because there were
> optimizations on the receive side and in the data convertors as
> well. For the e-mail you're reading right now, I'm talking just
> about send-request optimizations.)
> *) Theoretically, FAIR is broader reaching. E.g., if persistent
> sends can always use a sendi path, they will all potentially
> benefit. (This is theory. I haven't actually observed such a speed-
> up yet and it might just end up getting lost in the noise.)

-- 
Jeff Squyres
Cisco Systems