
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] New BTL parameter
From: Gleb Natapov (glebn_at_[hidden])
Date: 2007-12-13 01:37:35

On Wed, Dec 12, 2007 at 01:18:10PM -0800, Paul H. Hargrove wrote:
> Gleb Natapov wrote:
> > On Wed, Dec 12, 2007 at 02:03:02PM -0500, Jeff Squyres wrote:
> >
> >> On Dec 9, 2007, at 10:34 AM, Gleb Natapov wrote:
> >>
> >>
> > >>> Currently the BTL has a parameter, btl_min_send_size, that is no
> > >>> longer used. I want to change it to be btl_rndv_eager_limit. This new
> > >>> parameter will determine the size of the first fragment of the
> > >>> rendezvous protocol. Right now we use btl_eager_limit to set its size.
> > >>> btl_rndv_eager_limit will have to be smaller than or equal to
> > >>> btl_eager_limit. By default it will be equal to btl_eager_limit, so no
> > >>> behavior change will be observed if the default is used.
> >>>
> >> Can you describe why it would be better to have the value less than
> >> the eager limit?
> >>
> >>
> > It is just one more knob for tuning the OB1 algorithm. I sometimes don't
> > want to send any data by copy in/out at all. This is not possible right
> > now. With this new param I will be able to control this.
> >
> From my experience tuning RDMA-rendezvous for the GASNet communications
> library, I know that it was beneficial to piggyback some portion of the
> payload on the rendezvous request. However, the best [insert your
> favorite performance metric here] was not always achieved by
> piggybacking the maximum that could be buffered at the receiver
> (equivalent of btl_eager_limit). If I understand correctly, Gleb's
> btl_rndv_eager_limit parameter would allow tuning for this behavior in OMPI.
Exactly. You explained it better than me.

> An artificial/simplified example would be if the eager limit is 32K and
> you have a 64K xfer. Is it better to send 32K copy in/out plus 32K by
> RDMA, or to send 8K copy in/out plus 56K by RDMA? If the memcpy()
> overhead for 32K of eager payload exceeds what can be overlapped with
> the rendezvous setup then the second may be the better choice (higher
> bandwidth, lower latency, and lower CPU overheads on both sender and
> receiver).
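
To make the arithmetic in that example concrete, here is a minimal sketch of how a message might be split between the copy in/out fragment and the RDMA portion. It is only a simplified model, not OB1 code: the function name split_rendezvous is hypothetical, and it assumes that messages no larger than btl_eager_limit go entirely by copy in/out, while larger ones send a first fragment of btl_rndv_eager_limit bytes and the rest by RDMA.

```python
KIB = 1024


def split_rendezvous(msg_size, eager_limit, rndv_eager_limit=None):
    """Return (copied_bytes, rdma_bytes) for one message.

    Hypothetical helper illustrating the proposal; not part of Open MPI.
    """
    # Proposed default: same as btl_eager_limit, so behavior is unchanged.
    if rndv_eager_limit is None:
        rndv_eager_limit = eager_limit
    # Proposed constraint: btl_rndv_eager_limit <= btl_eager_limit.
    if not 0 <= rndv_eager_limit <= eager_limit:
        raise ValueError("rndv_eager_limit must be in [0, eager_limit]")
    # Assumption: small messages are sent eagerly in full, no rendezvous.
    if msg_size <= eager_limit:
        return msg_size, 0
    # Rendezvous: the first fragment is copied, the remainder goes by RDMA.
    first = min(rndv_eager_limit, msg_size)
    return first, msg_size - first


if __name__ == "__main__":
    # The 64K transfer with a 32K eager limit from the example above,
    # plus the "no copy in/out at all" case.
    for limit in (32 * KIB, 8 * KIB, 0):
        copied, rdma = split_rendezvous(64 * KIB, 32 * KIB, limit)
        print(f"rndv_eager_limit={limit // KIB}K: "
              f"copy={copied // KIB}K rdma={rdma // KIB}K")
```

Which split is actually faster depends on how much of the memcpy() cost overlaps with the rendezvous setup, so the sketch only computes the sizes and makes no performance claim.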