Open MPI User's Mailing List Archives

From: Tony Ladd (ladd_at_[hidden])
Date: 2006-10-28 18:51:59


George

Thanks for the references. However, I was not able to figure out whether what
I am asking is so trivial that it is simply passed over or so subtle that it
has been overlooked (I suspect the former). The binary tree algorithm in
MPI_Allreduce takes a time proportional to 2*N*log_2(M), where N is the vector
length and M is the number of processes. There is a divide-and-conquer
strategy
(http://www.hlrs.de/organization/par/services/models/mpi/myreduce.html) that
MPICH uses to do an MPI_Reduce in a time proportional to N. Is this algorithm,
or something equivalent, in Open MPI at present? If so, how do I turn it on?
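
To make the scaling difference concrete, here is a rough sketch of the two cost models, counting only the bandwidth term and ignoring latency. The 2*N*log_2(M) expression is the one quoted above. The 2*N*(M-1)/M expression is my assumption: it is the usual estimate for a reduce-scatter followed by a gather or allgather, and is not taken from the Open MPI or MPICH source. The function names are purely illustrative.

```python
import math


def tree_cost(n, m):
    """Bandwidth term for a binary-tree reduce plus broadcast: 2*N*log2(M)."""
    return 2 * n * math.log2(m)


def divide_and_conquer_cost(n, m):
    """Assumed bandwidth term for reduce-scatter plus allgather: 2*N*(M-1)/M.

    This stays below 2*N for any M, i.e. it is proportional to N.
    """
    return 2 * n * (m - 1) / m


def cost_table(n, procs):
    """Return (M, tree cost, divide-and-conquer cost) for each process count."""
    return [(m, tree_cost(n, m), divide_and_conquer_cost(n, m)) for m in procs]


if __name__ == "__main__":
    # Costs are in units of "time to send one full vector of length N".
    for m, tree, dc in cost_table(1.0, [16, 24, 32, 48]):
        print(f"M={m:3d}  tree={tree:6.2f}  divide-and-conquer={dc:5.2f}  "
              f"ratio={tree / dc:5.2f}")
```

Under these assumptions the tree cost keeps growing with log_2(M), while the divide-and-conquer cost never exceeds twice the time to send one vector, which is why the difference matters more as M grows.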

I also found that Open MPI is sometimes very slow on MPI_Allreduce using TCP.
Things are OK up to 16 processes, but at 24 or more processes the rates
(message length divided by time) are as follows:

Message size    Throughput (Mbytes/sec)
(Kbytes)        M=24    M=32    M=48
     1          1.38    1.30    1.09
     2          2.28    1.94    1.50
     4          2.92    2.35    1.73
     8          3.56    2.81    1.99
    16          3.97    1.94    0.12
    32          0.34    0.24    0.13
    64          3.07    2.33    1.57
   128          3.70    2.80    1.89
   256          4.10    3.10    2.08
   512          4.19    3.28    2.08
  1024          4.36    3.36    2.17

Around 16-32 Kbytes there is a pronounced slowdown, roughly a factor of 10,
which seems too much. Any idea what's going on?
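
For reference, the size of the dip can be worked out from the measured rates. The sketch below uses the numbers from the table; taking the 8 Kbyte rate as the baseline and the worse of the 16 and 32 Kbyte rates as the dip is my own choice for illustration, and the names are hypothetical.

```python
# Throughput in Mbytes/sec from the table, one list per process count M,
# in the same order as SIZES_KB.
SIZES_KB = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
RATES = {
    24: [1.38, 2.28, 2.92, 3.56, 3.97, 0.34, 3.07, 3.70, 4.10, 4.19, 4.36],
    32: [1.30, 1.94, 2.35, 2.81, 1.94, 0.24, 2.33, 2.80, 3.10, 3.28, 3.36],
    48: [1.09, 1.50, 1.73, 1.99, 0.12, 0.13, 1.57, 1.89, 2.08, 2.08, 2.17],
}


def slowdown(m, baseline_kb=8, dip_kb=(16, 32)):
    """Ratio of the baseline rate to the worst rate inside the dip range."""
    rates = dict(zip(SIZES_KB, RATES[m]))
    worst = min(rates[size] for size in dip_kb)
    return rates[baseline_kb] / worst


if __name__ == "__main__":
    for m in sorted(RATES):
        print(f"M={m}: slowdown factor {slowdown(m):.1f}")
```

With this choice of baseline the factor comes out between about 10 and 17, depending on M.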

Tony

-------------------------------
Tony Ladd
Chemical Engineering
University of Florida
PO Box 116005
Gainesville, FL 32611-6005

Tel: 352-392-6509
FAX: 352-392-9513
Email: tladd_at_[hidden]
Web: http://ladd.che.ufl.edu