George can speak more definitively about this.
In general, our "tuned" coll component (plugin) does exactly these
kinds of determinations to figure out which algorithm to use at
runtime. Not only are communicator process counts involved, but also
size of message is considered. I count 5 different all2all algorithms
in our tuned module (but George will have to speak about how each one
is chosen). U. Tennessee published some papers on their method; they
basically hard-coded minimized selection tables based on oodles of
runs and empirical data.
If you'd like to look at the code, it's here in the tree:
(v1.3 release branch)
I don't think there's much difference between the two, but they can
drift since v1.3 is the release branch and the trunk is active
On Apr 12, 2009, at 3:20 PM, Tom Rosmond wrote:
> I am curious about the algorithm(s) used in the OpenMPI
> of the all2all and all2allv. As many of you know, there are alternate
> algorithms for all2all type operations, such as that of Plimpton, et
> (2006), that basically exchange latency costs for bandwidth costs,
> pays big dividends for large processor numbers, e.g. 100's or 1000's.
> Does OpenMPI, or any other MPI distributions, test for processor count
> and switch to such an all2all algorithm at some point? I realize the
> switchover point would be very much a function of the architecture,
> so could be a risky decision in some cases. Nevertheless, has it been