
Open MPI User's Mailing List Archives


From: George Bosilca (bosilca_at_[hidden])
Date: 2006-11-01 10:27:03


On Oct 28, 2006, at 6:51 PM, Tony Ladd wrote:

> George
>
> Thanks for the references. However, I was not able to figure out if what
> I am asking is so trivial it is simply passed over, or so subtle that it's
> been overlooked (I suspect the former).

No. The answer to your question was in the articles. We have more
than just the Rabenseifner reduce and allreduce algorithms. Some of
the most common collective communication calls have up to 15
different implementations in Open MPI. Of course, each of these
implementations gives the best performance under some particular
conditions. Unfortunately, there is no unique algorithm that gives
the best performance in all cases. As we have to deal with multiple
algorithms for each collective, we have to figure out which one is
better and where. This usually depends on the number of nodes in the
communicator, the message size, as well as the network properties. In
a few words, it's difficult to choose the best one without having prior
knowledge about the networks you're trying to use. This is something
we're working on right now in Open MPI. Until then ... it might
happen that for some particular points the collective communications
will not show the best possible performance. However, a slow-down of
a factor of 10 is quite unbelievable. There might be something else
going on there...
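If you want to experiment with the algorithm selection yourself, the
"tuned" collective component exposes its choices as MCA parameters. The
parameter names below are what the tuned component uses in recent
releases; the exact names, and which numeric values map to which
algorithms, can vary by version, so check `ompi_info` on your own
installation first:

```shell
# List the tunables exposed by the "tuned" collective component
ompi_info --param coll tuned

# Override the built-in decision function and force a specific
# allreduce implementation (the numeric value selects one of the
# available algorithms; 0 leaves the choice to the decision function).
# "./my_benchmark" is a placeholder for your own program.
mpirun -np 24 \
    --mca coll_tuned_use_dynamic_rules 1 \
    --mca coll_tuned_allreduce_algorithm 4 \
    ./my_benchmark
```

This makes it easy to sweep over the algorithms for a given process
count and message size and see which one your network actually prefers.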

   Thanks,
     george.

PS: BTW, which version of Open MPI are you using? The one that delivers
the best performance for the collective communications (at least on
high-performance networks) is the nightly build of the 1.2 branch.
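To see why the algorithm choice matters for long vectors (the point
raised in the quoted message below), here is a back-of-the-envelope
bandwidth model; this is my own sketch, not code from either MPI
implementation. A tree-based allreduce sends the full N-byte vector at
every one of the log2(M) tree levels in both the reduce and broadcast
phases, while a Rabenseifner-style reduce-scatter + allgather moves
only about 2N bytes per process regardless of M:

```python
from math import log2

def tree_allreduce_bytes(n, m):
    """Binary-tree reduce followed by broadcast: the full n-byte vector
    crosses the wire at each of the log2(m) tree levels, in each phase."""
    return 2 * n * log2(m)

def rabenseifner_allreduce_bytes(n, m):
    """Reduce-scatter + allgather: each phase moves (m-1)/m of the
    vector per process, independent of log2(m)."""
    return 2 * n * (m - 1) / m

n, m = 1 << 20, 32          # 1 MByte vector, 32 processes
ratio = tree_allreduce_bytes(n, m) / rabenseifner_allreduce_bytes(n, m)
print(f"tree moves {ratio:.1f}x more bytes per process")
```

For 32 processes the tree moves roughly log2(32) = 5 times as many
bytes per process, which is why bandwidth-optimal algorithms dominate
for large messages; for short messages, latency (number of rounds)
dominates instead and the tree wins.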

> The binary tree algorithm in MPI_Allreduce takes a time proportional
> to 2*N*log_2(M), where N is the vector length and M is the number of
> processes. There is a divide-and-conquer strategy
> (http://www.hlrs.de/organization/par/services/models/mpi/myreduce.html)
> that MPICH uses to do an MPI_Reduce in a time proportional to N. Is
> this algorithm, or something equivalent, in Open MPI at present? If
> so, how do I turn it on?
>
> I also found that Open MPI is sometimes very slow on MPI_Allreduce
> using TCP. Things are OK up to 16 processes, but at 24 the rates
> (message length divided by time) are as follows:
>
> Message size        Throughput (MBytes/sec)
> (KBytes)            M=24    M=32    M=48
>
>    1                1.38    1.30    1.09
>    2                2.28    1.94    1.50
>    4                2.92    2.35    1.73
>    8                3.56    2.81    1.99
>   16                3.97    1.94    0.12
>   32                0.34    0.24    0.13
>   64                3.07    2.33    1.57
>  128                3.70    2.80    1.89
>  256                4.10    3.10    2.08
>  512                4.19    3.28    2.08
> 1024                4.36    3.36    2.17
>
> Around 16-32 KBytes there is a pronounced slowdown, roughly a factor
> of 10, which seems too much. Any idea what's going on?
>
> Tony
>
> -------------------------------
> Tony Ladd
> Chemical Engineering
> University of Florida
> PO Box 116005
> Gainesville, FL 32611-6005
>
> Tel: 352-392-6509
> FAX: 352-392-9513
> Email: tladd_at_[hidden]
> Web: http://ladd.che.ufl.edu
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users