Gilbert Grosdidier wrote:
Any other suggestion?
Can any more information be extracted from profiling? Here is where I
think things left off:
Eugene Loh wrote:
Gilbert Grosdidier wrote:
#                   [time]     [calls]       <%mpi>   <%wall>
# MPI_Waitall       741683     7.91081e+07   77.96    21.58
# MPI_Allreduce     114057     2.53665e+07   11.99     3.32
# MPI_Isend         27420.6    6.53513e+08    2.88     0.80
# MPI_Irecv         464.616    6.53513e+08    0.05     0.01
###############################################################################
It seems to my non-expert eye that MPI_Waitall is dominant among MPI
calls, but not for the overall application,
Looks like on average each MPI_Waitall call is completing 8+ MPI_Isend
calls and 8+ MPI_Irecv calls. I think IPM gives some point-to-point
messaging information. Maybe you can tell what the distribution is of
message sizes, etc. Or, maybe you already know the characteristic
pattern. Does a stand-alone message-passing test (without the
computational portion) capture the performance problem you're looking
for?
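The "8+" figure follows directly from the call counts in the profile quoted above. A minimal Python sketch of that arithmetic, with the numbers copied from the table (the dictionary layout and variable names are only for illustration, and the time column is left in whatever units IPM reported):

```python
# Aggregate times and call counts copied from the IPM table quoted above.
profile = {
    "MPI_Waitall":   {"time": 741683.0, "calls": 7.91081e+07},
    "MPI_Allreduce": {"time": 114057.0, "calls": 2.53665e+07},
    "MPI_Isend":     {"time": 27420.6,  "calls": 6.53513e+08},
    "MPI_Irecv":     {"time": 464.616,  "calls": 6.53513e+08},
}

# How many nonblocking sends/receives are issued per MPI_Waitall, on average.
waitalls = profile["MPI_Waitall"]["calls"]
isends_per_waitall = profile["MPI_Isend"]["calls"] / waitalls
irecvs_per_waitall = profile["MPI_Irecv"]["calls"] / waitalls

# Mean time per call, in the units of the [time] column.
mean_time = {name: row["time"] / row["calls"] for name, row in profile.items()}

print(f"MPI_Isend per MPI_Waitall: {isends_per_waitall:.2f}")
print(f"MPI_Irecv per MPI_Waitall: {irecvs_per_waitall:.2f}")
for name, t in mean_time.items():
    print(f"{name:14s} mean time per call: {t:.3e}")
```

Both ratios come out a little above 8, which is consistent with each MPI_Waitall completing one batch of sends and one batch of receives.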
Do you know message lengths and patterns? Can you confirm whether
non-MPI time is the same between good and bad runs?
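On the non-MPI question, the two percentage columns in the profile above already hint at the split for that run. A small Python sketch, assuming <%mpi> is a call's share of total MPI time and <%wall> is its share of total wall time, so that their ratio estimates the MPI fraction of wall time (the function name is made up for this example):

```python
def mpi_fraction_of_wall(pct_mpi, pct_wall):
    """Estimate (total MPI time) / (wall time) from one row of the table.

    Assumes pct_mpi is the call's share of MPI time and pct_wall is the
    same call's share of wall time, both in percent.
    """
    return pct_wall / pct_mpi

# Use the MPI_Waitall row: its entries are the largest, so the ratio is
# the least sensitive to rounding in the printed percentages.
mpi_share = mpi_fraction_of_wall(77.96, 21.58)
non_mpi_share = 1.0 - mpi_share

print(f"MPI share of wall time:     {mpi_share:.1%}")
print(f"non-MPI share of wall time: {non_mpi_share:.1%}")
```

Under that reading, a bit more than a quarter of the wall time is spent in MPI. Repeating the calculation on the profile of a good run, and scaling each non-MPI share by that run's wall time, would show whether the computational part changed between runs.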