Gilbert Grosdidier wrote:
Good evening Eugene,
Good morning from here.
Here follows some output for a 1024-core run.
Assuming this corresponds meaningfully with your original e-mail, 1024
cores means performance of 700 vs 900. So, that looks roughly
consistent with the 28% MPI time you show here. That seems to imply
that the slowdown is due entirely to long MPI times (rather than slow
non-MPI times). Just a sanity check.
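The sanity check above can be written out as a few lines of arithmetic. This is only a sketch under a simplifying assumption that is not stated in the thread: the better run spends essentially no time in MPI, so performance scales as (1 - MPI fraction). The 700 and 900 figures are the performance numbers quoted above, in whatever units the original e-mail used, and 0.28 is the rounded MPI fraction.

```python
# Sanity check: does ~28% MPI time roughly account for a 900 -> 700 drop?
# Assumption (not from the thread): the better run has near-zero MPI time,
# so performance scales as (1 - mpi_fraction).

def predicted_performance(reference_perf, mpi_fraction):
    """Performance expected if the only loss is the time spent inside MPI."""
    return reference_perf * (1.0 - mpi_fraction)

reference_perf = 900.0   # better of the two quoted performance figures
observed_perf = 700.0    # figure quoted for the slow 1024-core run
mpi_fraction = 0.28      # rounded %comm reported by IPM

predicted = predicted_performance(reference_perf, mpi_fraction)
observed_loss = 1.0 - observed_perf / reference_perf

print(f"predicted performance : {predicted:.0f}")
print(f"observed loss         : {observed_loss:.1%}")
print(f"MPI fraction          : {mpi_fraction:.0%}")
```

Under that assumption the model predicts about 648 against an observed 700, and the observed loss is about 22% against 28% MPI time. The numbers agree only roughly, which is all the sanity check claims.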
Unfortunately, I am not yet able to produce the equivalent MPT chart.
That may be all right. If one run clearly shows a problem (which is
perhaps the case here), then a "good profile" is not needed. Here, a
"good profile" would perhaps be used only to confirm that near-zero MPI
time is possible.
#IPMv0.983####################################################################
# host      : r34i0n0/x86_64_Linux      mpi_tasks : 1024 on 128 nodes
# start     : 12/21/10/13:18:09         wallclock : 3357.308618 sec
# stop      : 12/21/10/14:14:06         %comm     : 27.67
##############################################################################
#
#                 [total]        <avg>          min          max
# wallclock   3.43754e+06      3356.98      3356.83      3357.31
# user        2.82831e+06      2762.02      2622.04      2923.37
# system           376230      367.412      174.603      492.919
# mpi              951328      929.031      633.137      1052.86
# %comm                        27.6719      18.8601       31.363
No glaring evidence here of load imbalance being the sole explanation,
but hard to tell from these numbers. (If min comm time is 0%, then
that process is presumably holding everyone else up.)
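To make that check concrete, here is a small sketch that recomputes the %comm row from the per-task MPI times in the IPM summary and tests for the "one task holding everyone up" signature. The 5% threshold and the helper names are illustrative assumptions, not anything IPM defines. The average wallclock is used as the denominator for all three values because wallclock barely varies across tasks.

```python
# Recompute %comm from the IPM summary and look for a straggler signature.
# Values are copied from the summary; the 5% threshold is an arbitrary
# illustrative choice, not an IPM setting.

WALLCLOCK_AVG = 3356.98                                        # seconds per task
MPI_TIME = {"avg": 929.031, "min": 633.137, "max": 1052.86}    # seconds per task

def comm_percent(mpi_seconds, wallclock_seconds):
    """Percentage of wallclock time spent inside MPI calls."""
    return 100.0 * mpi_seconds / wallclock_seconds

def looks_like_single_straggler(min_comm_percent, threshold=5.0):
    """True if some task spends almost no time in MPI, so others wait on it."""
    return min_comm_percent < threshold

comm = {key: comm_percent(sec, WALLCLOCK_AVG) for key, sec in MPI_TIME.items()}

for key in ("avg", "min", "max"):
    print(f"%comm {key}: {comm[key]:.2f}")
print("single straggler suspected:", looks_like_single_straggler(comm["min"]))
```

With a minimum near 19%, no task is close to zero MPI time, so this simple test does not flag a single straggler. That matches the cautious reading above, and it does not rule out milder imbalance.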
#
#                     [time]       [calls]    <%mpi>   <%wall>
# MPI_Waitall         741683   7.91081e+07     77.96     21.58
# MPI_Allreduce       114057   2.53665e+07     11.99      3.32
# MPI_Isend          27420.6   6.53513e+08      2.88      0.80
# MPI_Irecv          464.616   6.53513e+08      0.05      0.01
###############################################################################
It seems to my non-expert eye that MPI_Waitall is dominant among MPI
calls, but not for the overall application,
If performance at 1024 cores is 700 compared to 900, then whatever the
problem is, it has not yet come to dominate overall application performance.
So, it looks like MPI_Waitall is the problem, even if it doesn't
dominate overall application time.
Looks like on average each MPI_Waitall call is completing 8+ MPI_Isend
calls and 8+ MPI_Irecv calls. I think IPM gives some point-to-point
messaging information. Maybe you can tell what the distribution is of
message sizes, etc. Or, maybe you already know the characteristic
pattern. Does a stand-alone message-passing test (without the
computational portion) capture the performance problem you're looking
for?
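The "8+" estimate, and a couple of related per-call averages, follow directly from the IPM per-function table. The sketch below just divides the reported totals. It assumes that [time] is in seconds summed over all tasks and that [calls] is likewise summed over all tasks. The helper names are made up for illustration.

```python
# Per-call averages derived from the IPM per-function table.
# Assumption: [time] is seconds summed over all tasks, and [calls] is the
# call count summed over all tasks.

MPI_TASKS = 1024
IPM = {
    "MPI_Waitall":   {"time": 741683.0, "calls": 7.91081e+07},
    "MPI_Allreduce": {"time": 114057.0, "calls": 2.53665e+07},
    "MPI_Isend":     {"time": 27420.6,  "calls": 6.53513e+08},
    "MPI_Irecv":     {"time": 464.616,  "calls": 6.53513e+08},
}

def requests_per_waitall(stats, func):
    """Average number of `func` calls issued per MPI_Waitall call."""
    return stats[func]["calls"] / stats["MPI_Waitall"]["calls"]

def mean_ms_per_call(stats, func):
    """Mean time spent in one call of `func`, in milliseconds."""
    return 1000.0 * stats[func]["time"] / stats[func]["calls"]

def calls_per_task(stats, func, tasks=MPI_TASKS):
    """Average number of `func` calls made by each MPI task."""
    return stats[func]["calls"] / tasks

print(f"Isend per Waitall   : {requests_per_waitall(IPM, 'MPI_Isend'):.2f}")
print(f"Irecv per Waitall   : {requests_per_waitall(IPM, 'MPI_Irecv'):.2f}")
print(f"ms per Waitall      : {mean_ms_per_call(IPM, 'MPI_Waitall'):.2f}")
print(f"ms per Allreduce    : {mean_ms_per_call(IPM, 'MPI_Allreduce'):.2f}")
print(f"Waitall calls/task  : {calls_per_task(IPM, 'MPI_Waitall'):.0f}")
```

This gives about 8.26 sends and 8.26 receives per MPI_Waitall, and a mean of roughly 9.4 ms per MPI_Waitall call. Whether 9.4 ms is long depends on the message sizes, which is why the message-size distribution matters.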
On 22/12/2010 18:50, Eugene Loh wrote:
Can you isolate a bit more where the time is being spent? The performance
effect you're describing appears to be drastic. Have you profiled the
code? Some choices of tools can be found in the FAQ http://www.open-mpi.org/faq/?category=perftools
The results may be "uninteresting" (all time spent in your MPI_Waitall
calls, for example), but it'd be good to rule out other possibilities
(e.g., I've seen cases where it's the non-MPI time that's the culprit).
If all the time is spent in MPI_Waitall, then I wonder if it would be
possible for you to reproduce the problem with just some
MPI_Isend|Irecv|Waitall calls that mimic your program. E.g., "lots of
short messages", or "lots of long messages", etc. It sounds like there
is some repeated set of MPI exchanges, so maybe that set can be
extracted and run without the complexities of the application.