I have instrumented my Fortran code with "timers" in the following way:
start_0 = MPI_Wtime()
start_1 = MPI_Wtime()
! ... first timed section of the real code ...
end_1 = MPI_Wtime()
write(*,*) "timer1 = ", end_1 - start_1
start_2 = MPI_Wtime()
! ... second timed section of the real code ...
end_2 = MPI_Wtime()
write(*,*) "timer2 = ", end_2 - start_2
end_0 = MPI_Wtime()
write(*,*) "timer0 = ", end_0 - start_0
When I run my code on a "small" number of processors, I find that timer0 = timer1 + timer2 to within less than 1%.
However, as I increase the number of processors, this no longer holds: the discrepancy can reach 10%, 20%, or even more!
The more processors I use, the larger the discrepancy becomes.
Obviously, my code is much bigger than the simple example above, but the principle is exactly the same.
Does anyone have an idea what could cause this?
Of course, each processor writes its own timers to a separate file: the discrepancy is nearly the same on every processor.
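
To make the percentages above concrete, here is a minimal sketch of one way the discrepancy could be quantified. It assumes the gap is measured relative to timer0 and that timer0 is positive; the post does not say which reference is used, and the module and function names are hypothetical.

```fortran
module timer_check
  implicit none
contains
  ! Gap between the outer timer and the sum of the two inner timers,
  ! returned as a fraction of the outer timer (0.10 means 10%).
  ! Assumption: the discrepancy is taken relative to timer0, with timer0 > 0.
  pure function relative_discrepancy(timer0, timer1, timer2) result(rel)
    double precision, intent(in) :: timer0, timer1, timer2
    double precision :: rel
    rel = abs(timer0 - (timer1 + timer2)) / timer0
  end function relative_discrepancy
end module timer_check
```

Each process could then write relative_discrepancy(end_0 - start_0, end_1 - start_1, end_2 - start_2) next to its three timers, so the values can be compared directly across processor counts.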