Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] New BTL parameter
From: Gleb Natapov (glebn_at_[hidden])
Date: 2007-12-13 01:37:35

On Wed, Dec 12, 2007 at 01:18:10PM -0800, Paul H. Hargrove wrote:
> Gleb Natapov wrote:
> > On Wed, Dec 12, 2007 at 02:03:02PM -0500, Jeff Squyres wrote:
> >
> >> On Dec 9, 2007, at 10:34 AM, Gleb Natapov wrote:
> >>
> >>
> >>> Currently BTL has parameter btl_min_send_size that is no longer used.
> >>> I want to change it to be btl_rndv_eager_limit. This new parameter will
> >>> determine a size of a first fragment of rendezvous protocol. Now we use
> >>> btl_eager_limit to set its size. btl_rndv_eager_limit will have to be
> >>> smaller or equal to btl_eager_limit. By default it will be equal to
> >>> btl_eager_limit so no behavior change will be observed if default is used.
> >>>
> >> Can you describe why it would be better to have the value less than
> >> the eager limit?
> >>
> >>
> > It is just one more knob to tune OB1 algorithm. I sometimes don't want
> > to send any data by copy in/out at all. This is not possible right now.
> > With this new param I will be able to control this.
> >
> From my experience tuning RDMA-rendezvous for the GASNet communications
> library, I know that it was beneficial to piggyback some portion of the
> payload on the rendezvous request. However, the best [insert your
> favorite performance metric here] was not always achieved by
> piggybacking the maximum that could be buffered at the receiver
> (equivalent of btl_eager_limit). If I understand correctly, Gleb's
> btl_rndv_eager_limit parameter would allow tuning for this behavior in OMPI.
Exactly. You explained it better than me.

> An artificial/simplified example would be if the eager limit is 32K and
> you have a 64K xfer. Is it better to send 32K copy in/out plus 32K by
> RDMA, or to send 8K copy in/out plus 56K by RDMA? If the memcpy()
> overhead for 32K of eager payload exceeds what can be overlapped with
> the rendezvous setup then the second may be the better choice (higher
> bandwidth, lower latency, and lower CPU overheads on both sender and
> receiver).
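
The split in Paul's 32K/64K example can be written out as a small sketch. This is illustrative Python, not Open MPI code: the function name split_rendezvous and its arguments are hypothetical, and it assumes that a message no larger than the eager limit is sent entirely eagerly, and that a rendezvous message piggybacks at most btl_rndv_eager_limit bytes by copy in/out with the remainder moved by RDMA. The default (equal to btl_eager_limit) and the "smaller or equal" constraint follow Gleb's description of the proposed parameter.

```python
KIB = 1024


def split_rendezvous(msg_size, eager_limit, rndv_eager_limit=None):
    """Return (bytes sent by copy in/out, bytes left for RDMA).

    Hypothetical model of the proposed knob, not the OB1 implementation.
    """
    if rndv_eager_limit is None:
        # Proposed default: same as the eager limit, so behavior is unchanged.
        rndv_eager_limit = eager_limit
    if not 0 <= rndv_eager_limit <= eager_limit:
        raise ValueError("rndv_eager_limit must be between 0 and eager_limit")
    if msg_size <= eager_limit:
        # Assumption: small messages go out eagerly, with no rendezvous.
        return msg_size, 0
    # Rendezvous: the first fragment carries at most rndv_eager_limit bytes.
    first_fragment = min(rndv_eager_limit, msg_size)
    return first_fragment, msg_size - first_fragment


if __name__ == "__main__":
    for knob in (None, 8 * KIB, 0):
        copied, rdma = split_rendezvous(64 * KIB, 32 * KIB, knob)
        print(f"knob={knob}: {copied // KIB}K copy in/out, {rdma // KIB}K RDMA")
```

With a 32K eager limit and a 64K transfer, leaving the knob at its default gives the 32K + 32K split, setting it to 8K gives the 8K + 56K split, and setting it to 0 models the case Gleb mentions where no data is sent by copy in/out. Which setting is faster depends on how much memcpy() time the rendezvous setup can hide, which this sketch does not model.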