Open MPI Development Mailing List Archives

From: Josh Aune (ladros_at_[hidden])
Date: 2006-07-13 23:11:05

On 6/29/06, Patrick Geoffray <patrick_at_[hidden]> wrote:
> Jeff Squyres (jsquyres) wrote:
> >> -----Original Message-----
> >> From: devel-bounces_at_[hidden]
> >> [mailto:devel-bounces_at_[hidden]] On Behalf Of Patrick Geoffray
> >> Sent: Wednesday, June 28, 2006 1:23 PM
> >> To: Open MPI Developers
> >> Subject: Re: [OMPI devel] Best bw/lat performance for
> >> microbenchmark/debug utility
> >>
> >> Josh Aune wrote:
> >>> I am writing up some interconnect/network debugging software that is
> >>> centered around ompi. What is the best set of functions to
> > I was assuming that you would be testing latency/bandwidth, but Patrick
> > is correct in stating that there are many more things to test than just
> > those two metrics.
> There are a lot of metrics, but most of them require deep understanding
> of the MPI semantics and implementation details to make sense. The art
> of micro-benchmarking is to choose the metrics and explain why they matter.
> It's obvious for latency/bandwidth, a bit less for unexpected and host
> overhead, definitely hard for overlap and progress. And that's just
> for point-to-point.
> To avoid reinventing the wheel, I would suggest that Josh develop a
> micro-benchmark test suite to compute very detailed LogP-derived
> parameters, i.e. for all message sizes:
> * send overhead (o.s) and recv overhead (o.r). These overheads will
> likely be either constant or linear for various message size ranges; it
> would be great to automatically compute the ranges.
> Memory registration cost is accounted for here, so it would be useful to
> measure with and without the registration cache as well.
> * Latency (L).
> * Send gap (g.s) and recv gap (g.r). For large messages, they will
> likely be identical and represent the link bandwidth. For smaller
> messages, the send gap is the gap of a fan-out pattern (1->N) and the
> recv gap is the gap of a flat gather (N->1). It's important not to have
> the send or recv overhead hide the send or recv gap; several
> processes could be used to divide the send/recv overhead.
> * unexpected overhead (o.u). Overhead added to (o.r) when the message is
> not immediately matched.
> * overlap availability (a): the percentage of communication time
> that you can overlap with real host computation.
> From these parameters, you can derive pretty much all characteristics
> of an interconnect without contention.
> Patrick

Sorry for the long delay in replying. Thanks for the info. What I am
trying to do is create a set of standardized, easy-to-use, system-level
debugging utilities (and force myself to learn more MPI :). Currently
I am shooting for latency/bandwidth but would welcome ideas for
further useful node-level tests. I am not just testing the
interconnect, but need to verify memory bandwidth, PCI bandwidth to
the interconnect card (I love -mca btl ^sm :), processor
functionality, system errors (currently only parity and PCI Express
fatal/nonfatal/etc) and whatnot.
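
For the latency/bandwidth part, a rough sketch of how a ping-pong timing could be reduced to the two numbers. This is only an illustration: the helper names are made up, and it assumes the total wall time was measured around a loop of `iters` round trips (for example with MPI_Wtime around matching MPI_Send/MPI_Recv pairs), which is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: one-way latency in microseconds, given the total
 * wall time in seconds for `iters` ping-pong round trips. Each round trip
 * crosses the link twice, hence the factor of 2. */
double one_way_latency_us(double total_seconds, int iters)
{
    assert(iters > 0);
    return total_seconds / (2.0 * iters) * 1e6;
}

/* Hypothetical helper: bandwidth in MB/s (1 MB = 1e6 bytes) for messages
 * of `msg_bytes`, using the same round-trip timing as above. */
double bandwidth_mb_per_s(double total_seconds, int iters, size_t msg_bytes)
{
    assert(iters > 0 && total_seconds > 0.0);
    double one_way_seconds = total_seconds / (2.0 * iters);
    return (double)msg_bytes / one_way_seconds / 1e6;
}
```

Small messages would normally be used for the latency figure and large ones for the bandwidth figure; the split point is left to the test.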

I want to have tests that are easy enough to run that all you have to do is
'mpirun -np $ALL ./footest' and it comes back with any nodes that look
bad for that test as well as some general data about the cluster's

I want to get the suite out to the community after I have some seed
tests written and hope that there will be enough that others will be
interested in contributing, though I am waiting for release approval
from work at the moment, which may not happen :(
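
As an illustration of the "comes back with any nodes that look bad" idea, here is a minimal sketch of one possible check: flag any node whose measured bandwidth falls more than a given fraction below the cluster median. The function names and the median-based rule are assumptions for the example, not anything from an existing suite; in a real MPI program rank 0 would first collect the per-node values (for instance with MPI_Gather).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n values; sorts a private copy so the caller's array,
 * which is indexed by node, keeps its order. */
static double median(const double *v, size_t n)
{
    double *tmp = malloc(n * sizeof *tmp);
    assert(tmp != NULL);
    memcpy(tmp, v, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, cmp_double);
    double m = (n % 2) ? tmp[n / 2] : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);
    free(tmp);
    return m;
}

/* Hypothetical check: sets bad[i] = 1 when node i's bandwidth is below
 * (1 - tolerance) * median, else 0. Returns the number of flagged nodes. */
size_t flag_slow_nodes(const double *bw, size_t n, double tolerance, int *bad)
{
    assert(n > 0);
    double cutoff = (1.0 - tolerance) * median(bw, n);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        bad[i] = bw[i] < cutoff;
        count += (size_t)bad[i];
    }
    return count;
}
```

The median is used rather than the mean so that one very slow node does not drag the reference value down with it.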