Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with MPI_BARRIER
From: Jai Dayal (dayalsoap_at_[hidden])
Date: 2011-09-08 10:27:36


What tick value are you using (i.e., what units are these times in)?

On Thu, Sep 8, 2011 at 10:25 AM, Ghislain Lartigue <
ghislain.lartigue_at_[hidden]> wrote:

> Thanks,
>
> I understand this but the delays that I measure are huge compared to a
> classical ack procedure... (1000x more)
> And this is repeatable: as far as I understand it, this shows that the
> network is not involved.
>
> Ghislain.
>
>
> On Sep 8, 2011, at 4:16 PM, Teng Ma wrote:
>
> > I guess you forgot to count the "leaving time" (fan-out). Once everyone
> > has hit the barrier, each process still needs an "ack" before it can
> > leave. And remember that in most cases the leader process sends out the
> > "acks" sequentially. So it is quite possible that:
> >
> > P0 barrier time = 29 + send/recv ack 0
> > P1 barrier time = 14 + send ack 0 + send/recv ack 1
> > P2 barrier time = 0 + send ack 0 + send ack 1 + send/recv ack 2
> >
> > That's the time you measure.
> >
> > Teng
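
The sequential-ack model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Open MPI implements MPI_BARRIER: the function name `barrier_times` and the constant `ack_cost` are hypothetical, fan-in cost is ignored, and the arrival times 12, 27 and 41 are the ones from the example in this thread.

```python
def barrier_times(arrivals, ack_cost):
    """Per-rank time spent in a barrier under a linear fan-out model.

    arrivals[i] is the time at which rank i enters the barrier.
    ack_cost is an assumed, constant time for the leader to send one ack.
    """
    last = max(arrivals)  # nobody can leave before the last rank arrives
    times = []
    for rank, entered in enumerate(arrivals):
        # Acks go out one after another, so rank i leaves only after
        # acks 0..i have been sent.
        leave = last + (rank + 1) * ack_cost
        times.append(leave - entered)
    return times


print(barrier_times([12, 27, 41], ack_cost=0))  # [29, 14, 0]
print(barrier_times([12, 27, 41], ack_cost=5))  # [34, 24, 15]
```

With a zero ack cost the last rank measures 0, as expected in the ideal case. With a nonzero cost, every rank measures a strictly positive time, and the last rank to arrive pays for all the acks sent before its own.
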
> >> This problem has nothing to do with stdout...
> >>
> >> Example with 3 processes:
> >>
> >> P0 hits barrier at t=12
> >> P1 hits barrier at t=27
> >> P2 hits barrier at t=41
> >>
> >> In this situation:
> >> P0 waits 41-12 = 29
> >> P1 waits 41-27 = 14
> >> P2 waits 41-41 = 00
> >
> >
> >
> >> So I should see something like (no ordering is expected):
> >> barrier_time = 14
> >> barrier_time = 00
> >> barrier_time = 29
> >>
> >> But what I see is much more like
> >> barrier_time = 22
> >> barrier_time = 29
> >> barrier_time = 25
> >>
> >> See? No process has a barrier_time equal to zero !!!
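
One rough way to read such numbers is to split each measured value into a floor that every rank pays and a remainder attributable to arrival skew. The sketch below assumes, as in the ideal model above, that the last rank to arrive would otherwise measure about zero; the helper name `split_barrier_times` is hypothetical, and the floor is only a lower-bound estimate because the real overhead need not be the same on every rank.

```python
def split_barrier_times(measured):
    """Split measured barrier times into a shared floor and per-rank skew.

    The smallest measured value approximates the cost paid by every rank
    regardless of arrival order; the rest is wait caused by arrival skew.
    """
    floor = min(measured)
    skew = [t - floor for t in measured]
    return floor, skew


print(split_barrier_times([14, 0, 29]))   # (0, [14, 0, 29])
print(split_barrier_times([22, 29, 25]))  # (22, [0, 7, 3])
```

For the expected pattern the floor is 0 and the whole measurement is skew. For the observed pattern the floor is 22, so most of the measured time is not explained by the order in which processes reached the barrier.
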
> >>
> >>
> >>
> >> On Sep 8, 2011, at 2:55 PM, Jeff Squyres wrote:
> >>
> >>> The order in which you see stdout printed from mpirun is not
> >>> necessarily reflective of the order in which things were actually
> >>> printed. Remember that the stdout from each MPI process needs to flow
> >>> through at least 3 processes and potentially across the network before
> >>> it is actually displayed on mpirun's stdout.
> >>>
> >>> MPI process -> local Open MPI daemon -> mpirun -> printed to mpirun's
> >>> stdout
> >>>
> >>> Hence, the ordering of stdout can get transposed.
> >>>
> >>>
> >>> On Sep 8, 2011, at 8:49 AM, Ghislain Lartigue wrote:
> >>>
> >>>> Thank you for this explanation but indeed this confirms that the LAST
> >>>> process that hits the barrier should go through nearly instantaneously
> >>>> (except for the broadcast time for the acknowledgment signal).
> >>>> And this is not what happens in my code : EVERY process waits for a
> >>>> very long time before going through the barrier (thousands of times
> >>>> more than a broadcast)...
> >>>>
> >>>>
> >>>> On Sep 8, 2011, at 2:26 PM, Jeff Squyres wrote:
> >>>>
> >>>>> Order in which processes hit the barrier is only one factor in the
> >>>>> time it takes for that process to finish the barrier.
> >>>>>
> >>>>> An easy way to think of a barrier implementation is a "fan in/fan
> >>>>> out" model. When each nonzero rank process calls MPI_BARRIER, it
> >>>>> sends a message saying "I have hit the barrier!" (it usually sends it
> >>>>> to its parent in a tree of all MPI processes in the communicator, but
> >>>>> you can simplify this model and consider that it sends it to rank 0).
> >>>>> Rank 0 collects all of these messages. When it has messages from all
> >>>>> processes in the communicator, it sends out "ok, you can leave the
> >>>>> barrier now" messages (again, it's usually via a tree distribution,
> >>>>> but you can pretend that it directly, linearly sends a message to
> >>>>> each peer process in the communicator).
> >>>>>
> >>>>> Hence, the time that any individual process spends in the barrier
> >>>>> depends on when every other process enters the barrier. But it's
> >>>>> also dependent upon communication speed, congestion in the network,
> >>>>> etc.
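
To give a feel for why the tree variant is usually preferred over the simplified linear one, the sketch below counts sequential message rounds for each. It is a back-of-the-envelope model, not Open MPI's actual algorithm: the function names are hypothetical, every message is assumed to take one round, and the tree is assumed to double the number of notified ranks per round.

```python
def linear_rounds(nprocs):
    """Rounds for rank 0 to exchange one message with every other rank."""
    return max(nprocs - 1, 0)


def tree_rounds(nprocs):
    """Rounds needed when the set of notified ranks doubles each round."""
    rounds, notified = 0, 1
    while notified < nprocs:
        notified *= 2
        rounds += 1
    return rounds


def barrier_rounds(nprocs):
    """Fan-in plus fan-out rounds as a (linear, tree) pair."""
    return 2 * linear_rounds(nprocs), 2 * tree_rounds(nprocs)


for n in (3, 8, 1000):
    print(n, barrier_rounds(n))
```

Under these assumptions a 1000-process barrier needs 1998 rounds in the linear model but only 20 in the tree model, which is why message cost alone rarely explains very large barrier times.
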
> >>>>>
> >>>>>
> >>>>> On Sep 8, 2011, at 6:20 AM, Ghislain Lartigue wrote:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> at a given point in my (Fortran90) program, I write:
> >>>>>>
> >>>>>> ===================
> >>>>>> start_time = MPI_Wtime()
> >>>>>> call MPI_BARRIER(...)
> >>>>>> new_time = MPI_Wtime() - start_time
> >>>>>> write(*,*) "barrier time =",new_time
> >>>>>> ==================
> >>>>>>
> >>>>>> and then I run my code...
> >>>>>>
> >>>>>> I expected that the values of "new_time" would range from 0 to Tmax
> >>>>>> (1700 in my case).
> >>>>>> As I understand it, the first process that hits the barrier should
> >>>>>> print Tmax and the last process that hits the barrier should print 0
> >>>>>> (or a very low value).
> >>>>>>
> >>>>>> But this is not the case: all processes print values in the range
> >>>>>> 1400-1700!
> >>>>>>
> >>>>>> Any explanation?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Ghislain.
> >>>>>>
> >>>>>> PS:
> >>>>>> This small code behaves perfectly in other parts of my code...
> >>>>>>
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> users mailing list
> >>>>>> users_at_[hidden]
> >>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Jeff Squyres
> >>>>> jsquyres_at_[hidden]
> >>>>> For corporate legal information go to:
> >>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >
> >
> > | Teng Ma Univ. of Tennessee |
> > | tma_at_[hidden] Knoxville, TN |
> > | http://web.eecs.utk.edu/~tma/ |
> >
>
>