Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Collective communications may be abend when it use over 2GiB buffer
From: N.M. Maclaren (nmm1_at_[hidden])
Date: 2012-03-05 16:37:38


On Mar 5 2012, George Bosilca wrote:
>
> I was afraid about all those little intermediary steps. I asked a
> compiler guy and apparently reversing the order (aka starting with the
> ptrdiff_t variable) will not solve anything. The only portable way to
> solve this is to cast every single member, to prevent __any__ compiler
> from hurting us.

That is true, but even that may not help, given that each version of
the C standard has been incompatible with its predecessors. And see
below.

>> In my copy of C99, section 6.5 Expressions says "the order of
>> evaluation of subexpressions and the order in which side effects take
>> place are both unspecified." There is a footnote 71 that "specifies the
>> precedence of operators in the evaluation of an expression, which is
>> the same as the order of the major subclauses of this subclause, highest
>> precedence first." It is the footnote that implies multiplication (6.5.5
>> Multiplicative operators) has higher precedence than addition (6.5.6
>> Additive operators) in the expression "(char*) rbuf + rank * rcount *
>> rext". But, the main text states that there is no ordering of the
>> subexpression "rank * rcount * rext". When the compiler chooses to
>> evaluate "rank * rcount" first, the overflow described by Yuki can
>> result. I think you are correct that the subexpression will get promoted
>> to (ptrdiff_t), but that is not quite the same thing.

No, it's not as simple as that :-(

That was the intent during the standardisation of C90, but those of
us who tried failed to get any explicit statement into it, and the
situation during C99 was that "but everybody knows that" the syntax
rules also define the evaluation order. We failed to get that stated
then, either :-( That interpretation was apparently the one
assumed by C++03, too, and is now explicitly (if informally) stated in
C++11. So, in theory, you can just cast the first operand to the
maximum precision and it will all work.
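
To make that concrete, here is a minimal sketch of the two styles under
discussion: casting only the first operand, and casting every operand
as George recommends. The function names are hypothetical (they are not
Open MPI code), and the sketch assumes a platform where ptrdiff_t is at
least 64 bits wide. The helper int_product_overflows() only reports
whether "rank * rcount" would leave the range of int; it does the
check in long long so that it never performs the overflowing
multiplication itself.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: cast only the first operand. Under the
 * "syntax rules define the grouping" reading, this parses as
 * ((ptrdiff_t)rank * rcount) * rext, so every multiplication is
 * carried out in ptrdiff_t. */
static ptrdiff_t offset_cast_first(int rank, int rcount, ptrdiff_t rext)
{
    assert(rank >= 0 && rcount >= 0 && rext >= 0);
    return (ptrdiff_t)rank * rcount * rext;
}

/* Hypothetical helper: the defensive form, with every operand cast,
 * so the result does not depend on how the expression is grouped. */
static ptrdiff_t offset_cast_all(int rank, int rcount, ptrdiff_t rext)
{
    assert(rank >= 0 && rcount >= 0 && rext >= 0);
    return (ptrdiff_t)rank * (ptrdiff_t)rcount * (ptrdiff_t)rext;
}

/* Returns 1 if rank * rcount would not fit in an int, 0 otherwise.
 * Assumes long long is wider than int, as on common platforms. */
static int int_product_overflows(int rank, int rcount)
{
    long long wide = (long long)rank * (long long)rcount;
    return wide > INT_MAX || wide < INT_MIN;
}
```

With rank = 3, rcount = 1 << 30 and rext = 4, the product rank * rcount
is already outside the range of a 32-bit int, while both helpers return
the same byte offset when ptrdiff_t is 64 bits wide.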

What it means by the "order of evaluation of subexpressions" is that
the assignments in '(a = b) + (c = d) + (e = f)' can take place in
any order, which is a different issue.
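
A small sketch of that distinction, using function calls rather than
the assignments above so that the order can be recorded; the names
note() and sum_three() are made up for illustration. The grouping of
the additions is fixed by the syntax, and the sum is always the same,
but the order in which the three calls run is left to the compiler.

```c
#include <stddef.h>

static int order[3];   /* ids of the calls, in the order they ran */
static int n_calls;    /* number of calls recorded so far */

/* Record that operand `id` was evaluated, then yield its value. */
static int note(int id, int value)
{
    order[n_calls++] = id;
    return value;
}

/* The expression groups as (note(1) + note(2)) + note(3), but the
 * three calls may be made in any order. The calls do not overlap,
 * so this is unspecified behaviour rather than undefined. */
static int sum_three(void)
{
    n_calls = 0;
    return note(1, 10) + note(2, 20) + note(3, 30);
}
```

After a call to sum_three(), the result is 60 and order[] holds some
permutation of 1, 2, 3; which permutation is not something a portable
program can rely on.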

HOWEVER, about half of the C communities have given C99 the thumbs
down; I doubt that C11 will be taken much notice of; gcc is the
de facto standard definer; and most compilers have optimisation
options that say "ignore the standard when it helps to go faster".
So the only feasible rule is to do your damnedest to defend yourself
against the aberrations, ambiguities and inconsistencies of C, and
hope for the best. I.e. what George recommends.

But will even that work reliably in the medium term? I wouldn't
bet on it :-(

Regards,
Nick Maclaren.