Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mca_pml_ob1_send blocks
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-09-01 23:53:15


Sorry for the delay in replying...

On Sep 1, 2009, at 1:11 AM, Shaun Jackman wrote:

> > Looking at the source code of MPI_Request_get_status, it...
> >   1. calls OPAL_CR_NOOP_PROGRESS()
> >   2. returns true in *flag if request->req_complete
> >   3. calls opal_progress()
> >   4. returns false in *flag
>

Keep in mind that MPI_REQUEST_GET_STATUS is exactly the same as
MPI_TEST except that the MPI_Request will not be deallocated if the
request has completed.
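
To make the handle behavior concrete, here is a toy model of the two calls. It is only a sketch: toy_request_t, toy_test() and toy_get_status() are made-up names for illustration and are not Open MPI or MPI API. The one thing it shows is that the MPI_TEST-like call resets the caller's handle on completion, while the MPI_REQUEST_GET_STATUS-like call leaves it usable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT Open MPI types. */
typedef struct {
    bool complete;
} toy_request_t;

#define TOY_REQUEST_NULL ((toy_request_t *)NULL)

/* MPI_TEST-like: on completion the handle is released and set to null. */
static bool toy_test(toy_request_t **req)
{
    assert(req != NULL);
    if (*req == TOY_REQUEST_NULL) {
        return true;              /* a null request tests as complete */
    }
    if (!(*req)->complete) {
        return false;             /* still pending, handle untouched */
    }
    *req = TOY_REQUEST_NULL;      /* the caller's handle is gone */
    return true;
}

/* MPI_REQUEST_GET_STATUS-like: reports completion, leaves the handle alone. */
static bool toy_get_status(const toy_request_t *req)
{
    assert(req != NULL);
    return req->complete;
}
```

In real code this means you can poll with MPI_Request_get_status and still hand the same request to MPI_Wait or MPI_Test later.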

> > What's the difference between OPAL_CR_NOOP_PROGRESS() and
> > opal_progress()? If the request has already completed, does it mean
> > that since opal_progress() is not called, no further progress is
> > made?
>
> OPAL_CR_NOOP_PROGRESS() seems to be related to checkpoint/restart and
> is a no-op unless fault-tolerance is being used.
>

Correct.

> Two questions then...
>
> 1. If the request has already completed, does it mean that since
> opal_progress() is not called, no further progress is made?
>

Correct. It's a latency thing; if your request has already completed,
we just tell you without further delay (i.e., without invoking
opal_progress(), which may trigger lots of other things, and therefore
increase the latency of MPI_REQUEST_GET_STATUS returning).

opal_progress() is our lowest-level progression engine call. It kicks
all kinds of registered progression callbacks from all over the code
base.
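
As a rough picture of what "kicking registered callbacks" means, here is a minimal sketch of a callback-driven progress loop. All names (toy_progress_register, toy_progress, the two toy_poll_* callbacks) and the fixed-size table are assumptions made for illustration; this is not the actual opal_progress() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; a sketch, not Open MPI's code. */
#define TOY_MAX_CALLBACKS 8

/* A progress callback returns how many events it completed. */
typedef int (*toy_progress_cb_t)(void);

static toy_progress_cb_t toy_callbacks[TOY_MAX_CALLBACKS];
static size_t toy_num_callbacks = 0;

/* Components from anywhere in the code base register a callback once. */
static void toy_progress_register(toy_progress_cb_t cb)
{
    assert(cb != NULL && toy_num_callbacks < TOY_MAX_CALLBACKS);
    toy_callbacks[toy_num_callbacks++] = cb;
}

/* Kick every registered callback once; return the total event count. */
static int toy_progress(void)
{
    int events = 0;
    for (size_t i = 0; i < toy_num_callbacks; ++i) {
        events += toy_callbacks[i]();
    }
    return events;
}

/* Two made-up components: one finds work, one finds none. */
static int toy_poll_network(void) { return 2; }
static int toy_poll_timers(void)  { return 0; }
```

Every call walks the whole table, which is why skipping it on the already-complete path saves latency.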

> 2. request->req_complete is tested before calling opal_progress(). Is
> it possible that request->req_complete is now true after calling
> opal_progress() when this function returns false in *flag?
>

Yes. I suppose it could be an optimization to duplicate the block
that tests request->req_complete==true below the call to
opal_progress(). I'm guessing the only reason it wasn't done was to
avoid code duplication.

Additionally, the call to opal_progress() is surrounded by an #if
block testing OPAL_ENABLE_PROGRESS_THREADS -- if we have progress
threads enabled, the thought was that opal_progress() (and friends)
would be invoked automatically (and probably continuously) by other
threads, so the explicit call would not be needed there. The
progression thread code is not well tested -- I'd be surprised if it
worked at all, because I doubt anyone is testing it -- but it has
been in our design since the very beginning. This is likely another
reason we don't test again for req_complete==true after the call to
opal_progress() -- because that block would need to be protected by
that #if, leading to further code complexity.
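
For what it's worth, the two shapes being discussed can be sketched as below. This is a toy model, not the real source: toy_request_t, toy_progress(), the TOY_ENABLE_PROGRESS_THREADS macro and both toy_get_status_* functions are hypothetical, and the polarity of the #if (explicit progress call only when progress threads are disabled) is my reading of the explanation above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the build-time progress-thread switch. */
#ifndef TOY_ENABLE_PROGRESS_THREADS
#define TOY_ENABLE_PROGRESS_THREADS 0
#endif

typedef struct {
    bool complete;
    int  polls_left;   /* progress calls needed before completion */
} toy_request_t;

/* The single request the toy progress engine is working on. */
static toy_request_t *toy_pending = NULL;

/* Toy progress engine: each call moves the pending request forward. */
static void toy_progress(void)
{
    if (toy_pending != NULL && !toy_pending->complete) {
        toy_pending->polls_left -= 1;
        if (toy_pending->polls_left <= 0) {
            toy_pending->complete = true;
        }
    }
}

/* Shape described in this thread: test, progress, report false. */
static bool toy_get_status_current(toy_request_t *req)
{
    assert(req != NULL);
    if (req->complete) {
        return true;               /* fast path, no progress call */
    }
#if !TOY_ENABLE_PROGRESS_THREADS
    toy_progress();
#endif
    return false;                  /* even if progress just completed it */
}

/* Possible optimization: test again after progress, inside the #if. */
static bool toy_get_status_recheck(toy_request_t *req)
{
    assert(req != NULL);
    if (req->complete) {
        return true;
    }
#if !TOY_ENABLE_PROGRESS_THREADS
    toy_progress();
    if (req->complete) {
        return true;               /* completion seen one call earlier */
    }
#endif
    return false;
}
```

With the current shape a caller that polls in a loop simply sees the completion on its next call; the re-check only saves that one extra call.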

-- 
Jeff Squyres
jsquyres_at_[hidden]