Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] MPI_Iprobe and mca_btl_sm_component_progress
From: Brian W. Barrett (brbarret_at_[hidden])
Date: 2008-06-19 09:47:57

On Thu, 19 Jun 2008, Terry Dontje wrote:

> But my concern is not the raw performance of MPI_Iprobe in this case but more
> of an interaction between MPI and an application. The concern is if it takes
> 2 MPI_Iprobes to get to the real message (instead of one) then could this
> induce a synchronization delay in an application? That is, by not
> receiving the "real" message in the first MPI_Iprobe, the application may
> decide to do other work while the other processes are potentially blocked
> waiting for it to do some communications.

I'd have to agree, which is why I proposed calling opal_progress at the
start of every MPI_Iprobe. By the way, the 40us time hit sounds high, but
really should only happen if you have the TCP BTL actively talking to a
peer, or if you've hit the high water mark and are checking TCP
communication on the OOB connection.

>> In fact TCP has the potential to exhibit the same behavior. However, after
>> each successful poll TCP empties the socket, so it might read more than
>> one message. As we have to empty the temporary buffer, we interpret most of
>> the messages inside, and this is why TCP exhibits different behavior.
> I guess this difference in behavior between the SM BTL and TCP BTL is
> disturbing to me. Does processing just one fifo entry per sm_progress call
> per connection buy us performance? Would draining the acks be detrimental
> to performance? Wouldn't providing the messages at the time they arrived
> meet the rule of obviousness to application writers?
> I know there is a slippery slope here of saying, OK, you've read one message,
> so you should read more until there are none on the fifo. I believe that is really
> debatable and could go either way depending on the application. But ack
> messages are not visible to the users. Which is why I was only asking about
> draining the ack packets.

Galen could say better than I, but I thought the IB BTL did basically what
you propose -- drain until you have a "real" message. That seems to make
the most sense to me and actually seemed to make IB run better for real
jobs, but what do I know?
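
To make the two policies in this thread concrete, here is a toy model in C. It is only a sketch under stated assumptions: the types and functions (toy_fifo_t, progress_one, progress_drain_acks, probes_until_data) are hypothetical names invented for illustration and are not the real SM or IB BTL code. Each "probe" is assumed to trigger exactly one progress call on a single peer FIFO.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fragment kinds: ACKs are protocol-internal and invisible to
 * the user; DATA stands for a "real" message that MPI_Iprobe could report. */
typedef enum { FRAG_ACK, FRAG_DATA } frag_kind_t;

/* Toy stand-in for one peer's FIFO (not the actual BTL data structure). */
typedef struct {
    const frag_kind_t *entries;
    size_t len;
    size_t head;        /* index of the next unread entry */
    int matched_data;   /* DATA fragments delivered so far */
} toy_fifo_t;

/* Policy A: handle at most one FIFO entry per progress call.
 * Returns the number of entries handled (0 or 1). */
static int progress_one(toy_fifo_t *q)
{
    assert(q != NULL);
    if (q->head == q->len) {
        return 0;
    }
    if (q->entries[q->head++] == FRAG_DATA) {
        q->matched_data++;
    }
    return 1;
}

/* Policy B: keep consuming ACKs and stop right after the first DATA
 * fragment. Returns the number of entries handled. */
static int progress_drain_acks(toy_fifo_t *q)
{
    assert(q != NULL);
    int handled = 0;
    while (q->head < q->len) {
        frag_kind_t kind = q->entries[q->head++];
        handled++;
        if (kind == FRAG_DATA) {
            q->matched_data++;
            break;
        }
    }
    return handled;
}

typedef int (*progress_fn_t)(toy_fifo_t *);

/* Count probe attempts (one progress call each) until a DATA fragment is
 * visible. Returns -1 if the FIFO runs dry before any DATA is seen. */
static int probes_until_data(const frag_kind_t *entries, size_t len,
                             progress_fn_t progress)
{
    toy_fifo_t q = { entries, len, 0, 0 };
    int probes = 0;
    while (q.matched_data == 0) {
        probes++;
        if (progress(&q) == 0) {
            return -1;
        }
    }
    return probes;
}
```

With a FIFO holding one ACK followed by one DATA fragment, probes_until_data returns 2 under progress_one and 1 under progress_drain_acks, which is the "two MPI_Iprobes instead of one" effect Terry describes. The model says nothing about the per-call cost of draining, which is the other half of the trade-off.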