On 15 Sept. 2011, at 16:37, Eugene Loh wrote:
> On 9/15/2011 5:51 AM, Ghislain Lartigue wrote:
>> start_0 = MPI_Wtime()
>> start_1 = MPI_Wtime()
>> call foo()
>> end_1 = MPI_Wtime()
>> write(*,*) "timer1 = ",end1-start1
>> start_2 = MPI_Wtime()
>> call bar()
>> end_2 = MPI_Wtime()
>> write(*,*) "timer2 = ",end2-start2
>> end_0 = MPI_Wtime()
>> write(*,*) "timer0 = ",end0-start0
>> When I run my code on a "small" number of processors, I find that timer0 = timer1 + timer2 to very good precision (less than 1% difference).
>> However, as I increase the number of processors, this no longer holds: I see discrepancies of 10%, 20%, or even more.
>> The more processors I use, the larger the errors.
>> Obviously, my code is much bigger than the simple example above, but the principle is exactly the same.
> In the simple example, if timer0 is much bigger than timer1+timer2, we'd be inclined to attribute extra time to the timer calls or the write statements... in any case, to time spent between end_1 and start_2 or between end_2 and end_0.
=> No, this cannot be: the time spent in the timers and the write statements is very small compared to the overall code (as indicated by the result on 1 proc).
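=> (One way to back this up is to measure the per-call cost of MPI_Wtime directly. Below is a minimal sketch of such a micro-benchmark; the program name and loop count are arbitrary illustrative choices, not taken from my code.)

    ! Micro-benchmark: average cost of one MPI_Wtime call.
    program wtime_cost
      use mpi
      implicit none
      integer :: ierr, i
      integer, parameter :: n = 1000000
      double precision :: t0, t1, dummy

      call MPI_Init(ierr)
      t0 = MPI_Wtime()
      do i = 1, n
         dummy = MPI_Wtime()   ! the call whose cost we measure
      end do
      t1 = MPI_Wtime()
      write(*,*) "average seconds per MPI_Wtime call: ", (t1 - t0) / n
      call MPI_Finalize(ierr)
    end program wtime_cost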
> Are you sure in the actual code there are no substantial operations in those sections?
=> I am sure of that (same reason as above)
> Also, is it possible your processes are not running during some of those times?
=> No idea... What do you have in mind precisely?
> Are you oversubscribing?
=> No way...
> Also, instead of printing out endX-startX, how about writing out endX and startX individually so you get all six timestamps and can see in greater detail where the discrepancy is arising?
=> This is a good idea: I will try that...
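=> For instance, something like the following sketch: print all six raw timestamps tagged with the rank, so the gaps between end_1 and start_2 and between end_2 and end_0 become visible. (foo and bar stand in for the real work, and the output format is just one possible choice.)

    program six_timestamps
      use mpi
      implicit none
      integer :: ierr, rank
      double precision :: start_0, start_1, end_1, start_2, end_2, end_0

      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

      start_0 = MPI_Wtime()
      start_1 = MPI_Wtime()
      call foo()                       ! first timed section
      end_1 = MPI_Wtime()
      start_2 = MPI_Wtime()
      call bar()                       ! second timed section
      end_2 = MPI_Wtime()
      end_0 = MPI_Wtime()

      ! Print the six raw timestamps instead of the three differences.
      write(*,'(A,I0,6(A,F18.6))') "rank ", rank, &
           " start_0=", start_0, " start_1=", start_1, " end_1=", end_1, &
           " start_2=", start_2, " end_2=", end_2, " end_0=", end_0

      call MPI_Finalize(ierr)

    contains

      subroutine foo()   ! placeholder for the real work
      end subroutine foo

      subroutine bar()   ! placeholder for the real work
      end subroutine bar

    end program six_timestamps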