Is it MPI_ALLTOALL or MPI_ALLTOALLV that runs slower? If it is the latter,
the reason could be that the default implementation of MPI_ALLTOALLV in
1.6.5 is different from that in 1.5.4. To switch back to the previous one,
pass the following MCA parameters to mpirun:
--mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_alltoallv_algorithm 1
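For example, a full invocation might look like the following sketch. The
binary name and process count are placeholders, not taken from your report:

```shell
# Hypothetical launch: ./a.out and -np 64 are placeholders.
# Enabling dynamic rules makes the tuned component honour the
# per-collective algorithm parameters; algorithm 1 then selects the
# pre-1.6.5 MPI_ALLTOALLV implementation as described above.
mpirun -np 64 \
    --mca coll_tuned_use_dynamic_rules 1 \
    --mca coll_tuned_alltoallv_algorithm 1 \
    ./a.out
```

The same parameters can also be set in an mca-params.conf file instead of
on the command line.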
The logic that selects the MPI_ALLTOALL implementation is the same in both
versions, although the pairwise implementation in 1.6.5 is a bit different.
The difference should have negligible effects though.
Note that coll_tuned_use_dynamic_rules has to be enabled in order for the
MCA parameters that allow you to select the algorithms to be registered.
Therefore you have to use ompi_info as follows:
ompi_info --mca coll_tuned_use_dynamic_rules 1 --param coll tuned
Hope that helps!
> -----Original Message-----
> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Dave Love
> Sent: Friday, October 18, 2013 6:18 PM
> To: users_at_[hidden]
> Subject: [OMPI users] debugging performance regressions between versions
> I've been testing an application that turns out to be ~30% slower with
> 1.6.5 than (the Red Hat packaged version of) 1.5.4, with the same mca-
> params and the same binary, just flipping the runtime. It's running over
> openib, and the profile it prints says that alltoall is a factor of four
> slower in 1.6.5. (I haven't tried to profile it externally, but I've no
> reason to doubt what it says.)
> How should I go about finding out why and -- I hope -- fixing it?
> A possibly relevant side question: Is there a way of dumping all the MCA
> parameters in effect? ompi_info --all doesn't show collective algorithms,
> for instance, though I thought I'd got those out of it at one time.
Hristo Iliev, PhD - High Performance Computing Team
RWTH Aachen University, Center for Computing and Communication
Rechen- und Kommunikationszentrum der RWTH Aachen
Seffenter Weg 23, D-52074 Aachen (Germany)