
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
From: George Bosilca (bosilca_at_[hidden])
Date: 2010-02-11 10:04:27


On Feb 11, 2010, at 07:10 , Jeff Squyres (jsquyres) wrote:

> I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I don't have a spec handy to check if bcast(0) is defined or not (similar to reduce). If it is, then sure, it could sync as well.

I have to disagree here. There is no synchronization in MPI except MPI_Barrier. At best, a bcast(1) is a one-way synchronization: the only knowledge it gives to any rank (except the root) is that the root has reached the bcast. No assumptions about the other ranks should be made, as this depends strongly on the underlying algorithm, and the upper level has no way to know which algorithm is used. Similarly, a reduce(1) is the mirror of the bcast(1): the only certainty is at the root, namely that all other ranks have reached the reduce(1).

Therefore, while we can argue as much as we want about what the correct arguments of a reduce call should be, a reduce(count=0) is one of the meaningless MPI calls and as such should not be tolerated.

Anyway, this discussion has diverged from its original subject. The standard is pretty clear on which sets of arguments are valid, and the requirement that the send and receive buffers be different is one of the strongest (and it is independent of what count is). As a courtesy, Open MPI accepts the heresy of count = zero, but there is __absolutely__ no reason to stop checking the values of the other arguments when this is true. If users really want to base the logic of their application on such a useless and non-standard statement (reduce(0)), at the very least they should have the courtesy to provide a valid set of arguments.

  george.

PS: If I may suggest a correct approach to fixing the Python bindings, I would encourage you to go for the strongest and most meaningful rule: sendbuf should always be different from recvbuf (independently of the value of count).

 

> My previous point was that barrier is the only collective that is *required* to synchronize.
>
> -jms
> Sent from my PDA. No type good.
>
> From: devel-bounces_at_[hidden] <devel-bounces_at_[hidden]>
> To: devel_at_[hidden] <devel_at_[hidden]>
> Sent: Thu Feb 11 07:04:59 2010
> Subject: Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
>
> Where does bcast(1) synchronize?
>
> (Oops on typo - if reduce(1) wasn't defined, that definitely would be bad :) )
>
> -jms
> Sent from my PDA. No type good.
>
> ----- Original Message -----
> From: devel-bounces_at_[hidden] <devel-bounces_at_[hidden]>
> To: Open MPI Developers <devel_at_[hidden]>
> Sent: Wed Feb 10 12:50:03 2010
> Subject: Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
>
> On 10 February 2010 14:19, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> > On Feb 10, 2010, at 11:59 AM, Lisandro Dalcin wrote:
> >
> >> > If I remember correctly, the HPCC pingpong test synchronizes occasionally by
> >> > having one process send a zero-byte broadcast to all other processes.
> >> > What's a zero-byte broadcast? Well, some MPIs apparently send no data, but
> >> > do have synchronization semantics. (No non-root process can exit before the
> >> > root process has entered.) Other MPIs treat the zero-byte broadcasts as
> >> > no-ops; there is no synchronization and then timing results from the HPCC
> >> > pingpong test are very misleading. So far as I can tell, the MPI standard
> >> > doesn't address which behavior is correct.
> >>
> >> Yep... for p2p communication things are clearer (and behavior more
> >> consistent across the MPIs out there) regarding zero-length messages...
> >> IMHO, collectives should be a no-op only in the sense that no actual
> >> reduction is made because there are no elements to operate on. I mean,
> >> if Reduce(count=1) implies a sync, Reduce(count=0) should also imply a
> >> sync...
> >
> > Sorry to disagree again. :-)
> >
> > The *only* MPI collective operation that guarantees a synchronization is barrier. The lack of synchronization guarantee for all other collective operations is very explicit in the MPI spec.
>
> Of course.
>
> > Hence, it is perfectly valid for an MPI implementation to do something like a no-op when no data transfer actually needs to take place
> >
>
> So you say that an MPI implementation is free to make a sync in the
> case of Bcast(count=1), but not in the case of Bcast(count=0)? I
> could agree that such behavior is technically correct regarding the
> MPI standard... But it makes me feel a bit uncomfortable... OK, in the
> end, the change in semantics depending on message size is comparable
> to the blocking/nonblocking one for MPI_Send(count=10^8) versus
> Send(count=1).
>
> >
> > (except, of course, the fact that Reduce(count=1) isn't defined ;-) ).
> >
>
> You likely meant Reduce(count=0) ... Good catch ;-)
>
>
> PS: The following question is unrelated to this thread, but my
> curiosity+laziness cannot resist... Does Open MPI have some MCA
> parameter to add a synchronization at every collective call?
>
> --
> Lisandro Dalcin
> ---------------
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>