Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How to force eager behavior during Isend?
From: Barry Rountree (rountree_at_[hidden])
Date: 2008-12-08 17:48:11

On Monday 08 December 2008 02:44:42 pm George Bosilca wrote:
> Barry,
> If you set the eager size large enough, the isend will not return
> until the data is pushed into the network layer.

That's exactly what I want it to do -- good. I've set the eagerness to 2MB,
but for messages 64k and up, Isend returns immediately and a significant
amount of time is spent in Wait. For messages less than 64k, the reverse is
true: a significant amount of time is spent in Isend, and the Wait returns
immediately.

> However, this doesn't
> guarantee that the data is delivered to the peer, but only that it was
> queued in the network (in the TCP case it is copied somewhere in the
> kernel buffers).


> The kernel will deliver that data on a best-effort
> basis, but there is no guarantee of that. As the kernel buffer has a
> limited size (no more than 128k) the expected graphical behavior for
> the isend operation over TCP should look like stairs, slightly going
> up because of the memcpy, and a large jump for every syscall required
> to do the operation.

That's fine.
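
To make the staircase shape described above concrete, here is a minimal sketch of the cost model: a linear memcpy term plus a fixed jump for each syscall. Only the 128k buffer figure comes from the thread. The two cost constants and the function names are hypothetical, picked just so the steps are visible; this is not a measurement of Open MPI.

```python
import math

# Per-syscall capacity, taken from the "no more than 128k" figure quoted above.
KERNEL_BUF_BYTES = 128 * 1024
# Hypothetical costs, chosen only to make the shape visible.
MEMCPY_SEC_PER_BYTE = 1e-9
SYSCALL_SEC = 50e-6


def syscalls_needed(msg_bytes: int) -> int:
    """Number of kernel-buffer-sized writes needed to push the message."""
    return max(1, math.ceil(msg_bytes / KERNEL_BUF_BYTES))


def modeled_isend_seconds(msg_bytes: int) -> float:
    """Linear memcpy term plus one fixed jump per syscall."""
    copy_cost = msg_bytes * MEMCPY_SEC_PER_BYTE
    return copy_cost + syscalls_needed(msg_bytes) * SYSCALL_SEC


if __name__ == "__main__":
    # Sizes just below and just above a buffer boundary show the jump.
    for kib in (32, 64, 128, 129, 256, 257, 512):
        n = kib * 1024
        t_us = modeled_isend_seconds(n) * 1e6
        print(f"{kib:4d} KiB  syscalls={syscalls_needed(n)}  t={t_us:8.1f} us")
```

Within one buffer's worth of data the modeled time rises slowly with size; crossing a multiple of the buffer size adds one more syscall and therefore one more step.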

> Now for the irecv the story is a lot more complex. The irecv only
> performs the matching, and if the data is not yet completely available
> (let's say only a small fragment was received at the moment of the
> irecv), the irecv will return (there is no eager notion there). The
> remainder of the data will become available only after the
> corresponding MPI_Wait.

What I'm trying to avoid -- and it may be a bad idea -- is having to decide
how much time in a given Wait is spent blocking or working. This becomes
simpler if I know that, for example, no work was done in the Isend or Irecv
call. If all the work was done in those calls, that's fine too (although
turning an Irecv into a blocking receive would probably break UMT2K and a few
other things). But if some of the time, some of the work is done in the
Isend/Irecv, I've got a more complex model. I'll have to deal with this
eventually, but I'd rather put it off if I could.
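
The simple case and the complex case can be told apart mechanically once per-call timings are in hand. A minimal sketch, assuming the time spent in Isend and in the matching Wait has already been measured for a message; the function name, the labels, and the 0.9 threshold are all hypothetical choices, not anything Open MPI provides.

```python
def attribute_work(isend_s: float, wait_s: float, dominant: float = 0.9) -> str:
    """Say which call carried the work for one Isend/Wait pair.

    `dominant` is an assumed cutoff: a call counts as doing "all" the
    work when it accounts for at least that fraction of the total time.
    """
    total = isend_s + wait_s
    if total <= 0.0:
        return "no time recorded"
    if isend_s / total >= dominant:
        return "work in Isend"
    if wait_s / total >= dominant:
        return "work in Wait"
    # Neither call dominates: this is the more complex model.
    return "mixed"


if __name__ == "__main__":
    # Invented timings in seconds, purely to exercise the three outcomes.
    for isend_s, wait_s in [(0.95, 0.05), (0.01, 0.99), (0.5, 0.5)]:
        print(isend_s, wait_s, "->", attribute_work(isend_s, wait_s))
```

Only the "mixed" outcome forces a decision about how much of a given Wait is blocking and how much is work.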

I was thinking that by manipulating the eagerness level (and increasing the
buffer sizes), I could force this behavior. (I just tried setting the TCP and
SELF eagerness levels to 0, then 1. Still no change in behavior.)
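
For reference, a sketch of how such settings might be passed on the command line. It assumes the parameters in question are the MCA parameters btl_tcp_eager_limit, btl_self_eager_limit, btl_tcp_sndbuf and btl_tcp_rcvbuf; the helper name, the program name ./isend_test, and the values are hypothetical. Whether these settings alone change where the time is spent is exactly the open question here.

```python
import shlex


def mpirun_command(eager_bytes: int, sockbuf_bytes: int, nprocs: int, program: str) -> list:
    """Build an mpirun argv that sets the eager limits and TCP socket buffers."""
    return [
        "mpirun", "-np", str(nprocs),
        "--mca", "btl", "tcp,self",
        "--mca", "btl_tcp_eager_limit", str(eager_bytes),
        "--mca", "btl_self_eager_limit", str(eager_bytes),
        "--mca", "btl_tcp_sndbuf", str(sockbuf_bytes),
        "--mca", "btl_tcp_rcvbuf", str(sockbuf_bytes),
        program,
    ]


if __name__ == "__main__":
    # 2MB eager limit as in the text; the 4MB socket buffer size is an assumed value.
    argv = mpirun_command(2 * 1024 * 1024, 4 * 1024 * 1024, 2, "./isend_test")
    print(" ".join(shlex.quote(a) for a in argv))
```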

PERUSE looks like it might be useful, and I'll continue looking into it.