Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323
From: Barrett, Brian W (bwbarre_at_[hidden])
Date: 2011-10-19 13:51:13


I actually think it's worse than that. An ORTE error and an OMPI error
can now have the same integer value: OMPI_ERR_REQUEST and
ORTE_ERR_RECV_LESS_THAN_POSTED now share a return code. Or, they
would, if George hadn't made a mistake (see below). The sharing
of return codes seems... bad.

Also, there's a bug in George's patch. Error codes are all negative, so
OMPI_ERR_REQUEST should be OMPI_ERR_BASE - 1 and OMPI_ERR_MAX should be
OMPI_ERR_BASE - 1, not plus 2.
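
To make the sign issue concrete, here is a minimal sketch. The numeric values and the in_orte_range() helper are made up for illustration and are not the real definitions from the tree; the only thing assumed about the layout is that OMPI codes start where the ORTE range ends. With negative codes, adding to the base walks back into the ORTE range, while subtracting moves away from it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values, for illustration only (not the real definitions). */
#define ORTE_ERR_BASE  (-100)        /* assumed first ORTE error code      */
#define ORTE_ERR_MAX   (-200)        /* assumed lower bound of ORTE range  */
#define OMPI_ERR_BASE  ORTE_ERR_MAX  /* assumed: OMPI starts where ORTE ends */

enum {
    BUGGY_OMPI_ERR_REQUEST = OMPI_ERR_BASE + 1,  /* "plus": back into ORTE */
    FIXED_OMPI_ERR_REQUEST = OMPI_ERR_BASE - 1   /* "minus": below ORTE    */
};

/* Hypothetical helper: true if rc lies inside the assumed ORTE range. */
static bool in_orte_range(int rc)
{
    return rc <= ORTE_ERR_BASE && rc > ORTE_ERR_MAX;
}

/* Sanity check of the layout described above. */
static void check_layout(void)
{
    assert(in_orte_range(BUGGY_OMPI_ERR_REQUEST));   /* collides with ORTE */
    assert(!in_orte_range(FIXED_OMPI_ERR_REQUEST));  /* stays distinct     */
}
```

Under these assumed values the "plus" variant is indistinguishable from an ORTE code, and the "minus" variant is not.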

Brian

On 10/19/11 1:32 PM, "Ralph Castain" <rhc_at_[hidden]> wrote:

>I've been wrestling with something from this commit, and I'm unsure of
>the right answer. So please consider this a general design question for
>the community.
>
>This commit removes all the OMPI <-> ORTE equivalent constants - i.e., we
>used to declare OMPI-prefixed equivalents to every ORTE-prefixed
>constant. I understand the thinking (or at least, what I suspect was the
>thought), but it creates an issue.
>
>Suppose I have an ompi-level function (A) that calls another ompi-level
>function (B). Invisible to A is that B calls an orte-level function. B
>dutifully checks the error return from the orte-level function against an
>ORTE-prefixed constant.
>
>However, if that return isn't "success", what does B return up to A? It
>cannot return the OMPI equivalent to the orte error constant because it
>no longer exists. It could return the orte error code, but A has no way
>of knowing it is going to get a non-OMPI constant, and therefore won't be
>able to understand it - it will be an "unrecognized error".
>
>I guess one option is to require that B "translate" the return code and
>pass some OMPI error up the chain, but this prevents anything upwards
>from understanding the nature of the problem and potentially taking
>corrective and/or alternative action. Seems awfully limiting, as most of
>the time the only option will be the vanilla "OMPI_ERROR".
>
>Thoughts?
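
The "translate" option Ralph describes above might look roughly like the sketch below. Every name with a DEMO_ or demo_ prefix is hypothetical, as are the numeric values; none of this is taken from the actual code. It shows the limitation he mentions: any ORTE code without an explicit mapping collapses to the generic error, so the caller A loses the detail.

```c
#include <assert.h>

/* Hypothetical codes and values, for illustration only. */
enum { DEMO_ORTE_SUCCESS = 0, DEMO_ORTE_ERR_NOT_FOUND = -13,
       DEMO_ORTE_ERR_TIMEOUT = -15 };
enum { DEMO_OMPI_SUCCESS = 0, DEMO_OMPI_ERROR = -1,
       DEMO_OMPI_ERR_NOT_FOUND = -313 };

/* Map an ORTE-level code to an OMPI-level one. Codes without an OMPI
 * equivalent fall through to the vanilla generic error. */
static int demo_orte_to_ompi(int orte_rc)
{
    switch (orte_rc) {
    case DEMO_ORTE_SUCCESS:       return DEMO_OMPI_SUCCESS;
    case DEMO_ORTE_ERR_NOT_FOUND: return DEMO_OMPI_ERR_NOT_FOUND;
    default:                      return DEMO_OMPI_ERROR;  /* detail lost */
    }
}

/* Stand-in for the orte-level call; simply returns the code it is given. */
static int demo_orte_call(int simulated_rc)
{
    return simulated_rc;
}

/* Function "B": calls the orte level and translates before returning to A. */
static int demo_ompi_b(int simulated_rc)
{
    int rc = demo_orte_call(simulated_rc);
    if (DEMO_ORTE_SUCCESS != rc) {
        return demo_orte_to_ompi(rc);
    }
    return DEMO_OMPI_SUCCESS;
}

/* Self-check: a mapped code keeps its meaning, an unmapped one does not. */
static void demo_check(void)
{
    assert(demo_ompi_b(DEMO_ORTE_ERR_NOT_FOUND) == DEMO_OMPI_ERR_NOT_FOUND);
    assert(demo_ompi_b(DEMO_ORTE_ERR_TIMEOUT) == DEMO_OMPI_ERROR);
}
```

In this sketch a timeout from the orte level reaches A as the generic error, which is the loss of information raised in the question.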

-- 
  Brian W. Barrett
  Dept. 1423: Scalable System Software
  Sandia National Laboratories