
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-10-19 14:09:20


Oy, yes, that is bad -- we cannot have overlapping ORTE and OMPI error codes. That seems like a very bad idea (in addition to the mixing of + and -).

For one thing, that breaks opal_strerror(). That, in itself, seems like a dealbreaker.
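
To illustrate the concern, here is a minimal sketch of a strerror-style lookup that dispatches on numeric range. It is not opal_strerror()'s actual implementation; every name and number below (demo_range_t, demo_strerror, the -200 range, and so on) is hypothetical. The point is only that once two layers claim the same integer, a range-based lookup has no way to pick the right message.

```c
#include <stddef.h>

/* Hypothetical model: each layer owns a range of negative error codes
 * and supplies a function that turns a code into a message. */
typedef const char *(*demo_err2str_fn)(int code);

typedef struct {
    const char     *layer;   /* e.g. "ORTE" or "OMPI" */
    int             base;    /* least negative code in the range */
    int             max;     /* most negative code in the range */
    demo_err2str_fn to_str;
} demo_range_t;

static const char *demo_orte_str(int code)
{
    (void)code;
    return "ORTE: receive less than posted";
}

static const char *demo_ompi_str(int code)
{
    (void)code;
    return "OMPI: invalid request";
}

/* Both layers claim -201 here, mimicking the overlap discussed above. */
static const demo_range_t demo_ranges[] = {
    { "ORTE", -200, -299, demo_orte_str },
    { "OMPI", -201, -250, demo_ompi_str },
};

#define DEMO_NUM_RANGES (sizeof(demo_ranges) / sizeof(demo_ranges[0]))

/* Count how many layers claim a code; anything above 1 is ambiguous. */
static int demo_owner_count(int code)
{
    int n = 0;
    for (size_t i = 0; i < DEMO_NUM_RANGES; i++) {
        if (code <= demo_ranges[i].base && code >= demo_ranges[i].max) {
            n++;
        }
    }
    return n;
}

/* First match wins, so the OMPI message for -201 can never be produced. */
static const char *demo_strerror(int code)
{
    for (size_t i = 0; i < DEMO_NUM_RANGES; i++) {
        if (code <= demo_ranges[i].base && code >= demo_ranges[i].max) {
            return demo_ranges[i].to_str(code);
        }
    }
    return "unknown error";
}
```

In this sketch, demo_owner_count(-201) reports two owners, and demo_strerror(-201) always yields the ORTE text regardless of which layer actually produced the code.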

On Oct 19, 2011, at 1:51 PM, Barrett, Brian W wrote:

> I actually think it's worse than that. An ORTE error code can now have
> the same error code as an OMPI error. OMPI_ERR_REQUEST and
> ORTE_ERR_RECV_LESS_THAN_POSTED now share the same integer return code.
> Or, they should, if George hadn't made a mistake (see below). The sharing
> of return codes seems... bad.
>
> Also, there's a bug in George's patch. Error codes are all negative, so
> OMPI_ERR_REQUEST should be OMPI_ERR_BASE - 1 and OMPI_ERR_MAX should be
> OMPI_ERR_BASE - 1, not plus 2.
>
> Brian
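
The sign issue raised above can be sketched with made-up numbers. The values below are hypothetical (the real bases live in the OPAL/ORTE/OMPI headers), and the sketch assumes the OMPI range starts where the ORTE range ends. With negative codes, subtracting from the base moves away from zero and out of the lower layer's range, while adding moves back into it.

```c
/* Hypothetical values, for illustration only. */
enum {
    DEMO_ORTE_ERR_BASE = -200,
    DEMO_ORTE_ERR_MAX  = -300,               /* treated as the last ORTE code */
    DEMO_OMPI_ERR_BASE = DEMO_ORTE_ERR_MAX,  /* OMPI codes start past ORTE's */

    /* Subtracting moves away from zero, outside ORTE's range. */
    DEMO_OMPI_ERR_REQUEST     = DEMO_OMPI_ERR_BASE - 1,  /* -301 */

    /* Adding moves toward zero, back inside ORTE's range. */
    DEMO_OMPI_ERR_REQUEST_BAD = DEMO_OMPI_ERR_BASE + 2   /* -298 */
};

/* Nonzero if the code falls inside the (hypothetical) ORTE range. */
static int demo_in_orte_range(int code)
{
    return code <= DEMO_ORTE_ERR_BASE && code >= DEMO_ORTE_ERR_MAX;
}
```

Under these assumptions, demo_in_orte_range() is false for the "base minus" constant and true for the "base plus" one, which is the collision being described.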
>
> On 10/19/11 1:32 PM, "Ralph Castain" <rhc_at_[hidden]> wrote:
>
>> I've been wrestling with something from this commit, and I'm unsure of
>> the right answer. So please consider this a general design question for
>> the community.
>>
>> This commit removes all the OMPI <-> ORTE equivalent constants - i.e., we
>> used to declare OMPI-prefixed equivalents to every ORTE-prefixed
>> constant. I understand the thinking (or at least, what I suspect was the
>> thought), but it creates an issue.
>>
>> Suppose I have an ompi-level function (A) that calls another ompi-level
>> function (B). Invisible to A is that B calls an orte-level function. B
>> dutifully checks the error return from the orte-level function against an
>> ORTE-prefixed constant.
>>
>> However, if that return isn't "success", what does B return up to A? It
>> cannot return the OMPI equivalent to the orte error constant because it
>> no longer exists. It could return the orte error code, but A has no way
>> of knowing it is going to get a non-OMPI constant, and therefore won't be
>> able to understand it - it will be an "unrecognized error".
>>
>> I guess one option is to require that B "translate" the return code and
>> pass some OMPI error up the chain, but this prevents anything upwards
>> from understanding the nature of the problem and potentially taking
>> corrective and/or alternative action. Seems awfully limiting, as most of
>> the time the only option will be the vanilla "OMPI_ERROR".
>>
>> Thoughts?
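
The "translate" option described above can be sketched as follows. All names and values are hypothetical stand-ins (demo_A, demo_B, DEMO_ORTE_ERR_TIMEOUT, and so on), not actual Open MPI symbols. The sketch assumes B has no OMPI-prefixed equivalent to hand back, so every ORTE failure collapses into one generic code and A can no longer tell the causes apart.

```c
/* Hypothetical codes, for illustration only. */
enum {
    DEMO_ORTE_SUCCESS     = 0,
    DEMO_ORTE_ERR_UNREACH = -212,
    DEMO_ORTE_ERR_TIMEOUT = -215
};

enum {
    DEMO_OMPI_SUCCESS = 0,
    DEMO_OMPI_ERROR   = -1
};

/* Stand-in for the orte-level function; returns whatever it is told to. */
static int demo_orte_call(int simulated_rc)
{
    return simulated_rc;
}

/* B: checks against the ORTE constant, then translates for its caller.
 * Without OMPI equivalents, the only choice is the generic error. */
static int demo_B(int simulated_rc)
{
    int rc = demo_orte_call(simulated_rc);
    if (DEMO_ORTE_SUCCESS == rc) {
        return DEMO_OMPI_SUCCESS;
    }
    return DEMO_OMPI_ERROR;
}

/* A: sees only success or generic failure, so it cannot choose a
 * corrective action (e.g. retry on timeout but not on unreachable). */
static int demo_A(int simulated_rc)
{
    return demo_B(simulated_rc);
}
```

Here demo_A(DEMO_ORTE_ERR_TIMEOUT) and demo_A(DEMO_ORTE_ERR_UNREACH) return the same value, which is the loss of information the paragraph above is worried about.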
> --
> Brian W. Barrett
> Dept. 1423: Scalable System Software
> Sandia National Laboratories

-- 
Jeff Squyres
jsquyres_at_[hidden]