Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: sm Latency
From: Eugene Loh (Eugene.Loh_at_[hidden])
Date: 2009-01-21 19:51:55


Patrick Geoffray wrote:

> Eugene Loh wrote:
>
>> Possibly, you meant to ask how one does directed polling with a
>> wildcard source MPI_ANY_SOURCE. If that was your question, the
>> answer is we punt. We report failure to the ULP, which reverts to
>> the standard code path.
>
> Sorry, I meant ANY_SOURCE. If you poll only the queue that corresponds
> to a posted receive, you only optimize micro-benchmarks, until they
> start using ANY_SOURCE.

Right.

> So, is recvi() a one-time shot? I.e., do you poll the right queue
> only once and, if that fails, fall back on polling all queues?

You poll it "some". The BTL is granted some leeway in what
"immediately" means.
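
To make the idea concrete, here is a minimal single-threaded sketch of directed polling with a bounded retry budget and a fallback. It is not the sm BTL code: the names fifo_t, recv_immediate, poll_all, and ANY_SOURCE, as well as the budget parameter, are hypothetical and chosen only for illustration. A real shared-memory FIFO would also need atomics or memory barriers, which are omitted here.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPEERS 4          /* hypothetical number of on-node peers */
#define QCAP 8            /* hypothetical per-peer FIFO capacity */
#define ANY_SOURCE (-1)   /* stand-in for MPI_ANY_SOURCE */

/* One incoming FIFO per peer; head == tail means empty. */
typedef struct {
    int items[QCAP];
    size_t head, tail;
} fifo_t;

static bool fifo_push(fifo_t *q, int v) {
    if (q->tail - q->head == QCAP) return false;   /* full */
    q->items[q->tail % QCAP] = v;
    q->tail++;
    return true;
}

static bool fifo_pop(fifo_t *q, int *out) {
    if (q->head == q->tail) return false;          /* empty */
    *out = q->items[q->head % QCAP];
    q->head++;
    return true;
}

/* Directed poll: check only the queue of the expected sender, at most
 * `budget` times. Returns false on a wildcard source or when the budget
 * runs out, so the caller can revert to the standard path. */
static bool recv_immediate(fifo_t queues[], int src, int budget, int *out) {
    if (src == ANY_SOURCE) return false;           /* punt on wildcards */
    for (int i = 0; i < budget; i++) {
        if (fifo_pop(&queues[src], out)) return true;
    }
    return false;
}

/* Standard path: one sweep over every peer's queue. */
static bool poll_all(fifo_t queues[], int npeers, int *out, int *from) {
    for (int p = 0; p < npeers; p++) {
        if (fifo_pop(&queues[p], out)) { *from = p; return true; }
    }
    return false;
}
```

A caller would try recv_immediate() first and call poll_all() only when it returns false. The budget is the "leeway" knob: a larger value favors latency from the expected sender, a smaller one gets back to draining the other queues sooner.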

> If yes, then it's unobtrusive but I don't think it would help much.

Well, check the RFC. The data shows huge improvements in HPCC latency.

> If you poll the right queue many times, then you have to decide when
> to fall back on polling all queues, and it's not trivial.

It's not 100% satisfactory, but clearly OMPI (and every other MPI
implementation and just about any major piece of HPC software) is trying
to guess among all sorts of trade-offs. Many of those trade-offs are
user tunable -- hence, those pages and pages of compiler options (pick your
favorite compiler), build flags, MCA parameters, etc.

>>> How do you ensure you check all incoming queues from time to time to
>>> prevent flow control (especially if the queues are small for scaling)?
>>
>> There are a variety of choices here. Further, I'm afraid we
>> ultimately have to expose some of those choices to the user (MCA
>> parameters or something).
>
> In the vast majority of cases, users don't know how to turn the knobs.

Totally agree. Exposing these choices to the users is ugly and
expecting users to make such choices is ridiculous. Though, for what
it's worth:

% ompi_info -a | wc -l
1037
%

I actually agree with you a lot. I do think that my RFC represents one
step forward. I'll see how quickly I can prototype and characterize a
single-queue solution so we can judge alternatives more diligently.