Three months ago I opened a ticket about an extra local data copy made by the pairwise alltoallv implementation in the “tuned” module, which can hurt performance in some cases:




As far as I can see, the milestone was set to Open MPI 1.6.1. Although the defect is quite trivial to fix (I submitted the appropriate patch with the ticket), it is still present in the latest revision of the 1.6 branch and in trunk. Given that on most clusters OMPI ends up using “tuned”, and that 1.6.1rc1 makes the pairwise algorithm the default, shouldn’t this defect have been fixed by now?


