Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26172
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2012-03-22 07:51:31


Fair enough. Upon further inspection of the request_invoke() handler,
you are correct that it is not required here if we do not modify the
default value for req_status.MPI_ERROR.

I'll work on a revised patch this morning and commit. One that does
not use this field.

Per your comment from your first message:
  "The usage we're doing in the default_wait_all of MPI_ERR_PENDING is
incorrect as well."
I'm glad that you no longer think the patch is "incorrect", but just
not implemented as well as it could be.

Thanks,
Josh

On Thu, Mar 22, 2012 at 1:40 AM, George Bosilca <bosilca_at_[hidden]> wrote:
>
> On Mar 21, 2012, at 15:13 , Josh Hursey wrote:
>
>> I see your point about setting MPI_ERR_PENDING on the internal status
>> versus the status returned by MPI_Waitall. As I mentioned, the reason
>> I choose to do that is to support the ompi_errhandler_request_invoke()
>> function. I could not think of an better way to fix this, so I'm open
>> to ideas.
>
> If MPI_ERR_PENDING is not set on the request status then I don't see any issues with the ompi_errhandler_request_invoke. Do I miss something obvious?
>
>>
>> MPI_Waitall returns MPI_ERR_PENDING for 'not completed' requests (or
>> sets the exposed status.MPI_ERROR field appropriately, depending on
>> the completion function used). The user must still call a completion
>> function to complete the request at which point the true error value
>> is reported (MPI_SUCCESS, or something else). So I think we are on the
>> same page here.
>>
>> Setting the request->req_status.MPI_ERROR to MPI_ERR_PENDING in
>> MPI_Waitall does not prevent the OMPI system from overwriting that
>> value with the correct error (e.g., MPI_ERR_TRUNCATE). When the OMPI
>> system wants to set an error on the request it sets the value (without
>> checking the previous value in req_status.MPI_ERROR) before calling
>> the completion operation. We reset the value from MPI_ERR_PENDING to
>> MPI_SUCCESS at the start of the test/wait operations so that the rest
>> of the checks for (MPI_SUCCESS != req->req_status.MPI_ERROR) complete
>> appropriately in those functions.
>>
>> So I am still failing to see why the patch is incorrect. Is it
>> 'incorrect' or just 'not implemented the way you think it should be'?
>> If the latter, then I'd love to see a patch. If the former, then I
>> would really like to understand why.
>
> I didn't say it is incorrect, just that it is an overkill that is (1) not required by the MPI standard; (2) potentially not thread safe; (3) sub-optimal in terms of memory writes; (4) way to complex for what MPI_ERR_PENDING is supposed to achieve. Moreover, I guess your patch is indeed correct if the MPICH test you were using pass.
>
>> Do we both agree that before this patch Open MPI did not support
>> MPI_ERR_PENDING?
>
> Who dares claim the opposite?
>
>  george.
>
>
>>
>> -- Josh
>>
>>
>> On Wed, Mar 21, 2012 at 2:45 PM, George Bosilca <bosilca_at_[hidden]> wrote:
>>> My point was that MPI_ERR_PENDING should never be set on a specific request. MPI_ERR_PENDING should only be returned in the array of statuses attached to MPI_Waitall. Thus, there is no need to remove it from any request.
>>>
>>> In addition, there is another reason why this is unnecessary (and I was too lazy to describe it in my previous email). The MPI_Waitall is only allowed to return MPI_ERR_PENDING for "not completed" MPI request. Such requests will therefore be completed later by the MPI library, and at that moment the error in the status will be set to the correct value, or MPI_SUCCESS.
>>>
>>>  george.
>>>
>>> On Mar 21, 2012, at 14:32 , Josh Hursey wrote:
>>>
>>>> In the patch for errhandler_invoke.c, you can see that we need to
>>>> check for MPI_ERR_PENDING to make sure that we do not free the request
>>>> when we are trying to decide if we should invoke the error handler. So
>>>> setting the internal req->req_status.MPI_ERROR to MPI_ERR_PENDING made
>>>> it possible to check for this state in this code. Maybe you have a
>>>> better way around this, but that is why I chose to implement it this
>>>> way.
>>>>
>>>> I agree that MPI_ERR_PENDING is only to be set in MPI_Waitall. And, in
>>>> this patch, MPI_ERR_PENDING is only set in that operation. The other
>>>> operations must look for the MPI_ERR_PENDING and reset the value -
>>>> since we are using the internal status object to carry it around per
>>>> above.
>>>>
>>>> You said that the implementation in ompi_request_default_wait_all was
>>>> incorrect. Would you care to elaborate as to what specifically is
>>>> incorrect?
>>>>
>>>> -- Josh
>>>>
>>>> On Wed, Mar 21, 2012 at 2:17 PM, George Bosilca <bosilca_at_[hidden]> wrote:
>>>>> Josh,
>>>>>
>>>>> I don't agree that these changes are required. In the current standard (2.2), MPI_ERR_PENDING is only allowed to be returned by MPI_WAITALL, in some very specific conditions. Here is the snippet from the MPI standard clarifying this behavior.
>>>>>
>>>>>> When one or more of the communications completed by a call to MPI_WAITALL fail, it is desireable to return specific information on each communication. The function MPI_WAITALL will return in such case the error code MPI_ERR_IN_STATUS and will set the error field of each status to a specific error code. This code will be MPI_SUCCESS, if the specific communication completed; it will be another specific error code, if it failed; or it can be MPI_ERR_PENDING if it has neither failed nor completed.
>>>>>
>>>>> As you can notice, the MPI_ERR_PENDING is only for the error in the status array and not for the error field in the status attached to the MPI request. Moreover, we don't use this inside Open MPI at all.
>>>>>
>>>>> The usage we're doing in the default_wait_all of MPI_ERR_PENDING is incorrect as well. I will fix it, once we clarify the issue with your patch.
>>>>>
>>>>> Thanks,
>>>>>  george.
>>>>>
>>>>> On Mar 21, 2012, at 13:46 , jjhursey_at_[hidden] wrote:
>>>>>
>>>>>> Author: jjhursey
>>>>>> Date: 2012-03-21 13:46:15 EDT (Wed, 21 Mar 2012)
>>>>>> New Revision: 26172
>>>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/26172
>>>>>>
>>>>>> Log:
>>>>>> Add support for MPI_ERR_PENDING - Per MPI 2.2 p 60
>>>>>>
>>>>>> Tested with:
>>>>>>  ompi-tests/mpich_tester/mpich_pt2pt/truncmult.c
>>>>>>
>>>>>>
>>>>>> Text files modified:
>>>>>>   trunk/ompi/errhandler/errhandler_invoke.c |     9 ++
>>>>>>   trunk/ompi/request/req_test.c             |    32 ++++++++++
>>>>>>   trunk/ompi/request/req_wait.c             |   120 +++++++++++++++++++++++++++++++++++++--
>>>>>>   trunk/ompi/request/request.c              |     2
>>>>>>   trunk/ompi/request/request.h              |     5 +
>>>>>>   5 files changed, 158 insertions(+), 10 deletions(-)
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> devel_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Joshua Hursey
>>>> Postdoctoral Research Associate
>>>> Oak Ridge National Laboratory
>>>> http://users.nccs.gov/~jjhursey
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>>
>> --
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>

-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey