Dear OpenMPI users,
I'm using OpenMPI 1.3.3 on an InfiniBand 4x interconnection network. My parallel application makes intensive use of MPI_Reduce over a communicator created with MPI_Comm_split.
I've noticed strange behaviour during execution. My code is instrumented with Scalasca 1.3 to report subroutine execution times. The first run, on 128 processors, calls MPI_Reduce directly (job_communicator is created with MPI_Comm_split; in both runs it is composed of the same ranks of MPI_COMM_WORLD):
MPI_Reduce(.....,job_communicator)
The elapsed time is 2671 sec.
The second run calls MPI_Barrier before MPI_Reduce:
MPI_Barrier(job_communicator..)
MPI_Reduce(.....,job_communicator)
The elapsed time of Barrier+Reduce is 2167 sec (about 8 minutes less).
So, in my opinion, it is better to put an MPI_Barrier before every MPI_Reduce to mitigate the "asynchronous" behaviour of MPI_Reduce in OpenMPI. I suspect the same holds for other collective communications. Can someone explain to me why MPI_Reduce behaves this way?
Thanks in advance.
--
Ing. Gabriele Fatigati
Parallel programmer
CINECA Systems & Technologies Department
Supercomputing Group
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
www.cineca.it Tel: +39 051 6171722
g.fatigati [AT] cineca.it