Open MPI logo

Open MPI User's Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Open MPI User's mailing list

Subject: Re: [OMPI users] all2all algorithms
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-04-14 13:55:10


George can speak more definitively about this.

In general, our "tuned" coll component (plugin) does exactly these
kinds of determinations to figure out which algorithm to use at
runtime. Not only are communicator process counts involved, but also
size of message is considered. I count 5 different all2all algorithms
in our tuned module (but George will have to speak about how each one
is chosen). U. Tennessee published some papers on their method; they
basically hard-coded minimized selection tables based on oodles of
runs and empirical data.

If you'd like to look at the code, it's here in the tree:

     https://svn.open-mpi.org/source/xref/ompi_1.3/ompi/mca/coll/tuned/coll_tuned_alltoall.c
     (v1.3 release branch)

     https://svn.open-mpi.org/source/xref/ompi-trunk/ompi/mca/coll/tuned/coll_tuned_alltoall.c
     (development trunk)

I don't think there's much difference between the two, but they can
drift since v1.3 is the release branch and the trunk is active
development.

On Apr 12, 2009, at 3:20 PM, Tom Rosmond wrote:

> I am curious about the algorithm(s) used in the OpenMPI
> implementations
> of the all2all and all2allv. As many of you know, there are alternate
> algorithms for all2all type operations, such as that of Plimpton, et
> al
> (2006), that basically exchange latency costs for bandwidth costs,
> which
> pays big dividends for large processor numbers, e.g. 100's or 1000's.
> Does OpenMPI, or any other MPI distributions, test for processor count
> and switch to such an all2all algorithm at some point? I realize the
> switchover point would be very much a function of the architecture,
> and
> so could be a risky decision in some cases. Nevertheless, has it been
> considered?
>

-- 
Jeff Squyres
Cisco Systems