Three months ago I opened a ticket about an extra local data copy made by the pairwise alltoallv implementation in the “tuned” module, which can hurt performance in some cases:
As far as I can see, the milestone was set to Open MPI 1.6.1. Although the fix is quite trivial (I submitted a patch with the ticket), the defect is still present in the latest revision of the 1.6 branch and in trunk. Given that on most clusters OMPI ends up using “tuned”, and that 1.6.1rc1 makes the pairwise algorithm the default, shouldn’t this defect have been fixed by now?
Hristo Iliev, Ph.D. -- High Performance Computing
RWTH Aachen University, Center for Computing and Communication
Rechen- und Kommunikationszentrum der RWTH Aachen
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241 80 24367 -- Fax/UMS: +49 241 80 624367