Most of the microbenchmarks that I've seen used send/receive because it
allows the MPI implementation to optimize if it can (e.g., blocking to
allow the OS/hardware to make progress). If nothing else, it prevents an
additional traversal of the call stack: MPI_SEND is done when it
returns; with MPI_ISEND, you have to traverse the call stack a second
time with MPI_WAIT. This may or may not be inconsequential.
But that being said, these are fantastically complicated issues :-).
Since these are so similar from a user API perspective, it may be
worthwhile to implement them both and allow switching between them
(perhaps at run-time) to see if there is a noticeable difference.
I'd strongly encourage looking at other microbenchmarks out there
(netpipe, netperf, presta, the IMB, etc.) to see what they have done,
not just in terms of SEND vs. ISEND, but also in terms of benchmarking
techniques. Microbenchmarks are tricky to get "just right."
Are you trying to make a suite that encompasses a bunch of performance
numbers that are not encompassed by other, existing microbenchmarks?
> -----Original Message-----
> From: devel-bounces_at_[hidden]
> [mailto:devel-bounces_at_[hidden]] On Behalf Of Josh Aune
> Sent: Wednesday, June 28, 2006 12:43 AM
> To: Open MPI Developers
> Subject: [OMPI devel] Best bw/lat performance for
> I am writing up some interconnect/network debugging software that is
> centered around ompi. What is the best set of functions to use to get
> the best bandwidth and latency numbers for openmpi and why? I've been
> asking around at work and some people say just send/receive, though
> some of the micro benchmarks I have looked at in the past used
> isend/irecv. Can someone shed some light on this (or propose more