Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-02-10 09:48:22


On Feb 10, 2010, at 8:40 AM, Lisandro Dalcín wrote:

> > Note that from a standards perspective, MPI_REDUCE *does* require at least one element -- MPI-2.2 p163:34-35:
> >
> > "Each process can provide one element, or a sequence of elements..."
>
> Are you really convinced that such sentence means that zero elements
> is illegal?

As Bill Gropp would say, "there is no legal and illegal -- there is only what is defined by the spec." :-)

The text defines that MPI_REDUCE is supposed to be called with one or more elements. It does not define what happens when zero elements are used, so the behavior is undefined and therefore not portable. Some MPI implementations may allow it; some may not. MPI programmer beware.

> I have the feeling that this corner case was not taken
> into account at the time that wording was written (which dates back to
> MPI 1.1 standard).
>
> Is there a rationale for requiring at least one element? Is this worth
> a change/clarification in the MPI standard?

The Forum has been historically resistant to syntactic sugar. Arguably, you could have a correct program by adding an if statement:

    if (count > 0) MPI_Reduce(...)

More specifically: MPI's core functionality revolves around message passing, not providing no-ops. I feel quite comfortable stating that if you want a no-op, do it in the application (e.g., via an "if" statement). Put simply: if you don't want a reduction, don't call MPI_REDUCE.
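
The guard above could be packaged as a small wrapper. The following is only a sketch under stated assumptions: `reduce_fn`, `fake_reduce`, and `reduce_if_nonempty` are hypothetical names that do not come from MPI or Open MPI, and `fake_reduce` is a single-process stand-in so that the example compiles without an MPI installation. In a real program that slot would forward to MPI_Reduce.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in type for the real call. In an MPI program this
 * slot would be filled by a function that forwards to MPI_Reduce. */
typedef int (*reduce_fn)(const void *sendbuf, void *recvbuf, int count);

/* Counts how often the stand-in reduction is actually invoked. */
static int fake_reduce_calls = 0;

/* Illustrative single-process "reduction": copies count ints from
 * sendbuf to recvbuf, which is what a sum over one rank amounts to. */
static int fake_reduce(const void *sendbuf, void *recvbuf, int count)
{
    fake_reduce_calls++;
    memcpy(recvbuf, sendbuf, (size_t)count * sizeof(int));
    return 0;
}

/* Application-level guard: only call the reduction when there is at
 * least one element. Returns 0 without touching the buffers when
 * count == 0, and -1 (an assumed error convention) for a negative count. */
static int reduce_if_nonempty(reduce_fn reduce, const void *sendbuf,
                              void *recvbuf, int count)
{
    if (count < 0) {
        return -1;
    }
    if (count == 0) {
        return 0; /* nothing to reduce, so skip the call entirely */
    }
    return reduce(sendbuf, recvbuf, count);
}
```

With this guard, `reduce_if_nonempty(fake_reduce, NULL, NULL, 0)` returns 0 and never reaches the reduction. Because MPI_REDUCE is collective and must be called with the same count on every process, all ranks take the same branch, so skipping the call on count == 0 does not leave some ranks waiting on others.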

> > So I think that George's assertion is correct: your test code is incorrect.
>
> Well, you have to grant me that a zero-length reduction seems
> something plausible to test. I still think OMPI is following too
> strictly the wording "Each process can provide one element". Again,
> this sentence comes from MPI-1.1.

Er... much of the wording in MPI-2.2 comes from MPI-1.0. :-) This one sentence is no different than thousands of others.

> Please, do not take me wrong. If there is an actual issue with
> zero-length reductions, I want to know about it. Otherwise, I would
> like to ask you to revisit OMPI behavior. I still think that
> there is no good reason for zero-length reductions to be invalid
> operations; they should just be no-op calls.

You still have to pass a bunch of other stuff to make MPI_REDUCE not cause an MPI exception (such as a valid datatype, etc.). Why is count>0 any different?

> > But that's not what is causing your example to fail. Here's the issue in OMPI's MPI_Reduce:
> >
> > } else if ((ompi_comm_rank(comm) != root && MPI_IN_PLACE == sendbuf) ||
> > (ompi_comm_rank(comm) == root && ((MPI_IN_PLACE == recvbuf) || (sendbuf == recvbuf)))) {
> > err = MPI_ERR_ARG;
> >
> > The "sendbuf == recvbuf" check is what causes the MPI exception. I would say that we're not consistent about disallowing that (e.g., such checks are not in MPI_SCAN and the others you cited).
>
> Yes, I understand that. But in the case that zero-length reductions
> were valid, the check should not fall-back there...

Per my above statements, I don't agree with your implication here. :-)

And also remember that OMPI *does* allow zero-length reductions, but only because we were bludgeoned into it. So there is no "fall-back" to the buffer test -- the buffer test is orthogonal to the count test because we allow count==0.

> But NULL is a very special case. Using (ptr=NULL,len=0) for
> zero-length arrays is common out there.

Let's be clear: the problem is not that your buffers are NULL. It's the fact that sendbuf==recvbuf in the call to MPI_REDUCE, regardless of whether they are NULL or something else.
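
To make that concrete, here is a standalone model of the quoted condition. It is a sketch, not Open MPI source: `SKETCH_IN_PLACE`, `SKETCH_SUCCESS`, `SKETCH_ERR_ARG`, and `check_reduce_buffers` are hypothetical names standing in for MPI_IN_PLACE, MPI_SUCCESS, MPI_ERR_ARG, and the check inside MPI_Reduce, and the rank is passed in directly instead of being read from a communicator.

```c
#include <stddef.h>

/* Hypothetical sentinel standing in for MPI_IN_PLACE. The real value
 * is implementation-defined; only its identity matters here. */
static char sketch_in_place_marker;
#define SKETCH_IN_PLACE ((void *)&sketch_in_place_marker)

/* Hypothetical result codes standing in for MPI_SUCCESS / MPI_ERR_ARG. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_ARG = 1 };

/* Mirrors the quoted condition: reject IN_PLACE as sendbuf on non-root
 * ranks, and reject IN_PLACE as recvbuf or sendbuf == recvbuf on the
 * root. Note that count is not an input: the buffer test is independent
 * of the count test. */
static int check_reduce_buffers(int rank, int root,
                                const void *sendbuf, const void *recvbuf)
{
    if ((rank != root && sendbuf == SKETCH_IN_PLACE) ||
        (rank == root &&
         (recvbuf == SKETCH_IN_PLACE || sendbuf == recvbuf))) {
        return SKETCH_ERR_ARG;
    }
    return SKETCH_SUCCESS;
}
```

In this model, `check_reduce_buffers(0, 0, NULL, NULL)` fails on the root for the same reason that passing one non-NULL address for both buffers does: the two pointers compare equal. The same NULL pair on a non-root rank passes, because the equality test applies only at the root.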

> In short, I still think that (sendbuf=NULL,recvbuf=NULL,count=0)
> should work. Not sure about
> (sendbuf=(void*)1,recvbuf=(void*)1,count=0), but I can imagine cases
> where this would be nice to have (e.g. some dynamic language, or
> library, or even user code that employs a singleton for zero-length
> arrays)

We don't test pointers for any particular value other than named constants (e.g., MPI_IN_PLACE) because any pointer value could point to a valid buffer when paired with an appropriate datatype.

As such, NULL is *not* a special case. It's a potentially valid buffer, just like any other value.

> Special casing Open MPI in my testsuite to disable these tests is just
> a matter of adding two lines, but before that I would like to have
> some sort of final pronouncement on all this from your side.

What is the purpose of testing 0-length reductions?

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/