Open MPI Development Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-08-13 10:34:20

On Aug 12, 2007, at 3:49 PM, Gleb Natapov wrote:

>> - Mellanox tested MVAPICH with the header caching; latency was around
>> 1.4us
>> - Mellanox tested MVAPICH without the header caching; latency was
>> around 1.9us
> As far as I remember the Mellanox results, and according to our testing,
> the difference between MVAPICH with header caching and OMPI is 0.2-0.3us,
> not 0.5us. And MVAPICH without header caching is actually worse than
> OMPI for small messages.

I guess reading the graph that Pasha sent is difficult; Pasha -- can
you send the actual numbers?

>> Given that OMPI is the lone outlier around 1.9us, I think we have no
>> choice except to implement the header caching and/or examine our
>> header to see if we can shrink it. Mellanox has volunteered to
>> implement header caching in the openib btl.
> I think we have a choice: not implement header caching, but just
> change the osu_latency benchmark to send each message with a
> different tag :)

If only. :-)

But that misses the point (and the fact that all the common ping-pong
benchmarks use a single tag: NetPIPE, IMB, osu_latency, etc.). *All
other MPIs* give us latency around 1.4us, but Open MPI is around
1.9us. So we need to do something.
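To make that concrete, here is a minimal ping-pong in the same spirit as
those benchmarks (not osu_latency itself, just a sketch): the communicator,
peer, and tag are identical on every iteration, which is exactly the
pattern that per-peer header caching exploits. Gleb's suggestion amounts to
making the tag vary per iteration instead.

  /* Minimal ping-pong sketch (not osu_latency itself).  The matching
   * fields never change: every message uses the same communicator,
   * peer, and tag, so a cached header would hit on every iteration.
   * Varying the tag (e.g. tag = i) would defeat the cache. */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, i, iters = 1000;
      char buf[1] = {0};

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      double t0 = MPI_Wtime();
      for (i = 0; i < iters; i++) {
          if (rank == 0) {
              MPI_Send(buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
              MPI_Recv(buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
          } else if (rank == 1) {
              MPI_Recv(buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
              MPI_Send(buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
          }
      }
      double t1 = MPI_Wtime();

      if (rank == 0)
          printf("latency: %.2f us\n", (t1 - t0) * 1e6 / (2.0 * iters));

      MPI_Finalize();
      return 0;
  }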

Are we optimizing for a benchmark? Yes. But we have to do it. Many
people know that these benchmarks are fairly useless, but not enough
do -- too many customers take them at face value, and education alone
is not enough. "Sure, this MPI looks slower but, really, it isn't.
Trust me; my name is Joe Isuzu." That's a hard sell.

> I am not against header caching per se, but if it will complicate the
> code even a little bit I don't think we should implement it just to
> benefit one fabricated benchmark (AFAIR before header caching was
> implemented in MVAPICH, mpi_latency actually sent messages with
> different tags).

That may be true and a reason for us to wail and gnash our teeth, but
it doesn't change the current reality.

> Also there is really nothing to cache in the openib BTL. The openib
> BTL header is 4 bytes long. The caching will have to be done in OB1,
> and there it will affect every other interconnect.

Surely there is *something* we can do -- what, exactly, is the
objection to peeking inside the PML header down in the btl? Is it
really so horrible for a btl to look inside the upper layer's
header? I agree that the PML looking into a btl header would
[obviously] be Bad.
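
For the sake of discussion, here is a rough sketch of what per-peer
caching at the btl level could look like. The struct layouts and names
below are made up for illustration -- they are not the actual OB1 or
openib headers. The idea is only that the sender remembers the last
match information it sent to each peer and, when nothing but the
sequence number changes, sends an abbreviated header that the receiver
expands from its own cached copy.

  /* Illustrative only -- hypothetical field and type names, not the
   * real OB1/openib structures. */
  #include <stdint.h>

  typedef struct {              /* stand-in for a PML match header  */
      uint16_t ctx;             /* communicator index               */
      uint16_t src;             /* source rank                      */
      int32_t  tag;             /* user tag                         */
      uint16_t seq;             /* per-peer sequence number         */
  } match_hdr_t;

  typedef struct {              /* per-peer state kept by the sender */
      match_hdr_t last_sent;    /* last full header sent             */
      int         valid;
  } peer_cache_t;

  /* Return 1 if everything except the sequence number matches the
   * previous send, so an abbreviated wire header is enough and the
   * receiver can reconstruct the rest from its cached copy.         */
  static int can_send_short_hdr(peer_cache_t *cache, const match_hdr_t *hdr)
  {
      int hit = cache->valid &&
                cache->last_sent.ctx == hdr->ctx &&
                cache->last_sent.src == hdr->src &&
                cache->last_sent.tag == hdr->tag;

      cache->last_sent = *hdr;  /* refresh the cache either way */
      cache->valid = 1;
      return hit;
  }

Whether that logic belongs in OB1 or in the btl is exactly the question
above; the sketch only shows why a single-tag benchmark hits the cache
on every message while varying tags would not.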

All this being said -- is there another way to lower our latency?
My main goal here is to lower the latency; if header caching is
unattractive, then another method would be fine.

Jeff Squyres
Cisco Systems