Jeff Squyres (jsquyres) wrote:
>> -----Original Message-----
>> From: devel-bounces_at_[hidden]
>> [mailto:devel-bounces_at_[hidden]] On Behalf Of Patrick Geoffray
>> Sent: Wednesday, June 28, 2006 1:23 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] Best bw/lat performance for
>> microbenchmark/debug utility
>> Josh Aune wrote:
>>> I am writing up some interconnect/network debugging software that is
>>> centered around ompi. What is the best set of functions to
> I was assuming that you would be testing latency/bandwidth, but Patrick
> is correct in stating that there are many more things to test than just
> those two metrics.
There are a lot of metrics, but most of them require a deep understanding
of MPI semantics and implementation details to make sense. The art
of micro-benchmarking is to choose the metrics and explain why they matter.
It's obvious for latency/bandwidth, a bit less so for unexpected messages
and host overhead, and definitely hard for overlap and progress. And that's just
To avoid reinventing the wheel, I would suggest that Josh develop a
micro-benchmark test suite to compute very detailed LogP-derived
parameters, i.e. for all message sizes:
* Send overhead (o.s) and recv overhead (o.r). These overheads will
likely be either constant or linear for various message size ranges; it
would be great to compute the ranges automatically.
Memory registration cost is accounted for here, so it would be useful to
measure with and without the registration cache as well.
* Latency (L).
* Send gap (g.s) and recv gap (g.r). For large messages, they will
likely be identical and represent the link bandwidth. For smaller
messages, the send gap is the gap of a fan-out pattern (1->N) and the
recv gap is the gap of a flat gather (N->1). It's important not to let
the send or recv overhead hide the send or recv gap; several
processes can be used to divide the send/recv overhead.
* Unexpected overhead (o.u): the overhead added to (o.r) when the message is
not immediately matched.
* Overlap availability (a): the percentage of communication time
that you can overlap with real host computation.
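
For the "compute the ranges automatically" part of the overhead item
above, here is a rough sketch of one way it might be done. It is only an
illustration, not something taken from an existing suite: it assumes the
overhead has already been measured per message size, and it greedily
grows each range while a least-squares line still fits every point
within a tolerance. The names fit_line, segment_ranges and tolerance are
made up for this example.

```python
def fit_line(points):
    """Least-squares fit of y = a + b*x over (x, y) pairs; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        # Single point (or identical sizes): treat as constant.
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def max_residual(points, a, b):
    """Largest absolute distance between a measurement and the fitted line."""
    return max(abs(y - (a + b * x)) for x, y in points)


def segment_ranges(samples, tolerance):
    """Split (size, overhead) samples into ranges that are each well
    described by one line (slope 0 means constant overhead).

    Returns a list of (first_size, last_size, intercept, slope).
    Assumption: 'tolerance' is in the same unit as the overhead.
    """
    samples = sorted(samples)
    ranges = []
    start = 0
    while start < len(samples):
        # A range starts with up to two points, which always fit exactly.
        end = min(start + 2, len(samples))
        a, b = fit_line(samples[start:end])
        while end < len(samples):
            candidate = samples[start:end + 1]
            ca, cb = fit_line(candidate)
            if max_residual(candidate, ca, cb) > tolerance:
                break  # next point does not belong to this range
            a, b, end = ca, cb, end + 1
        ranges.append((samples[start][0], samples[end - 1][0], a, b))
        start = end
    return ranges


if __name__ == "__main__":
    # Synthetic data: constant overhead up to 128 bytes, linear above.
    small = [(s, 1.0) for s in (8, 16, 32, 64, 128)]
    large = [(s, 2.0 + 0.01 * s) for s in (256, 512, 1024, 2048, 4096)]
    for lo, hi, a, b in segment_ranges(small + large, tolerance=0.05):
        print(f"{lo}..{hi} bytes: overhead ~ {a:.3f} + {b:.5f} * size")
```

A greedy fit like this is sensitive to noise and to the tolerance chosen,
so real measurements would probably need averaging over many iterations
first.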
From these parameters, you can derive pretty much all characteristics
of an interconnect without contention.
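
As an illustration of what "derive" can mean here, the sketch below
plugs the parameters for one message size into textbook LogP-style
estimates. These formulas are my assumption of a reasonable model, not
something prescribed above: a single message costs o.s + L + o.r (plus
o.u if it is unexpected), and a stream of back-to-back messages is
limited by the largest of the overheads and gaps. The class and function
names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LogPParams:
    """Measured parameters for one message size (any consistent time unit)."""
    o_s: float        # send overhead
    o_r: float        # recv overhead
    L: float          # latency
    g_s: float        # send gap
    g_r: float        # recv gap
    o_u: float = 0.0  # extra recv overhead for an unexpected message


def one_way_time(p, expected=True):
    """Estimated time for a single message, sender start to receiver done."""
    recv = p.o_r if expected else p.o_r + p.o_u
    return p.o_s + p.L + recv


def stream_time(p, n):
    """Estimated time for n back-to-back messages between two processes.

    Assumption: the steady-state interval between messages is set by the
    slowest stage, i.e. the largest of the overheads and gaps.
    """
    interval = max(p.o_s, p.o_r, p.g_s, p.g_r)
    return one_way_time(p) + (n - 1) * interval


if __name__ == "__main__":
    # Made-up numbers, only to exercise the functions.
    p = LogPParams(o_s=0.5, o_r=0.7, L=2.0, g_s=1.0, g_r=1.2, o_u=0.3)
    print("one-way, expected:  ", one_way_time(p))
    print("one-way, unexpected:", one_way_time(p, expected=False))
    print("10-message stream:  ", stream_time(p, 10))
```

None of this accounts for contention, which matches the caveat above.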