Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI 1.2.5 race condition / core dump with MPI_Reduce and MPI_Gather
From: Gleb Natapov (glebn_at_[hidden])
Date: 2008-02-29 14:19:08

On Thu, Feb 28, 2008 at 04:53:11PM -0500, George Bosilca wrote:
> In this particular case, I don't think the solution is that obvious. If
> you look at the stack in the original email, you will notice how we get
> into this. The problem here is that FREE_LIST_WAIT is used to get a
> fragment to store an unexpected message. If this macro returns NULL (in
> other words, the PML is unable to store the unexpected message), what do
> you expect to happen? Drop the message? Ask the BTL to hold it for a
> while? How about ordering?
In every case where we call FREE_LIST_WAIT from a callback today, the fix
is not simple; otherwise it would already have been implemented. In this
particular case, if we wait until memory allocation fails it is too
late to do anything useful, so printing a helpful message and aborting is
good enough. To avoid reaching the point where all memory is
occupied by unexpected messages, we will either have to implement some
kind of flow control in OB1 (and become more spec compliant in the
process) or declare all programs that exhibit that kind of
behaviour "unrealistic", as we do now.
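
To make the "print a helpful message and abort" option concrete, here is a
minimal sketch of a callback that takes a fragment with a non-blocking get
and checks for NULL instead of waiting. All names in it (frag_t,
free_list_t, free_list_get, on_unexpected_message) are hypothetical
stand-ins written for this example, not the real Open MPI free-list code,
and the handler returns -1 where a real callback would abort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the Open MPI free-list types. */
typedef struct frag {
    struct frag *next;
    char payload[64];
} frag_t;

typedef struct {
    frag_t *head;   /* singly linked list of free fragments */
} free_list_t;

/* Build a list of n fragments; returns 0 on success, -1 if malloc fails. */
static int free_list_init(free_list_t *fl, size_t n)
{
    fl->head = NULL;
    for (size_t i = 0; i < n; i++) {
        frag_t *f = malloc(sizeof *f);
        if (f == NULL) {
            return -1;
        }
        f->next = fl->head;
        fl->head = f;
    }
    return 0;
}

/* Non-blocking get: returns NULL when the list is empty.
 * It never waits and never calls progress. */
static frag_t *free_list_get(free_list_t *fl)
{
    frag_t *f = fl->head;
    if (f != NULL) {
        fl->head = f->next;
    }
    return f;
}

/* Put a fragment back on the list. */
static void free_list_return(free_list_t *fl, frag_t *f)
{
    assert(f != NULL);
    f->next = fl->head;
    fl->head = f;
}

/* Callback-side handler for an unexpected message.
 * Returns 0 and sets *out on success; on exhaustion it prints a
 * diagnostic and returns -1 (a real callback would abort here). */
static int on_unexpected_message(free_list_t *fl, frag_t **out)
{
    frag_t *f = free_list_get(fl);
    if (f == NULL) {
        fprintf(stderr, "no fragments left for unexpected messages; "
                        "the receiver is not keeping up with its senders\n");
        return -1;
    }
    *out = f;
    return 0;
}
```

This only covers the failure path. It does nothing to prevent the
exhaustion in the first place, which is what flow control would be for.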

> It is unfortunate to say it, only a few days after we had the discussion
> about flow control, but the only correct solution here is to add PML
> level flow control ...
> george.
> On Feb 28, 2008, at 2:55 PM, Christian Bell wrote:
>> On Thu, 28 Feb 2008, Gleb Natapov wrote:
>>> The trick is to call progress only from functions that are called
>>> directly by a user process. Never call progress from a callback
>>> function.
>>> The main offenders against this rule are calls to OMPI_FREE_LIST_WAIT().
>>> They should be changed to OMPI_FREE_LIST_GET() and deal with a NULL
>>> return value.
>> Right -- and it should be easy to find more offenders by having an
>> assert statement soak in the builds for a while (or by default in
>> debug mode).
>> Was it ever part of the (or a) design to allow re-entrant
>> calls to progress from the same calling thread? It can be done, but
>> callers have to have a holistic view of how other components require
>> and make the progress happen -- this isn't compatible with the Open
>> MPI model of independent dynamically loadable components.
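
The assert idea can be sketched with a depth counter around the progress
call. This is only an illustration under assumptions: progress(),
progress_depth, reentry_count and the PROGRESS_STRICT switch are made-up
names for this example, the counter is a plain global (a threaded build
would need it per thread), and nothing here is taken from the Open MPI
source.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*callback_fn)(void);

static int progress_depth = 0;  /* > 0 while we are inside progress() */
static int reentry_count = 0;   /* number of re-entrant calls observed */

/* Hypothetical progress function. Runs one callback, and refuses to run
 * if it is entered again from inside that callback.
 * Returns 0 on a normal call, -1 on a rejected re-entrant call. */
static int progress(callback_fn cb)
{
    if (progress_depth > 0) {
        reentry_count++;
#ifdef PROGRESS_STRICT
        /* Debug builds: stop right at the offending call site. */
        assert(!"progress() re-entered from a callback");
#endif
        return -1;
    }
    progress_depth++;
    if (cb != NULL) {
        cb();
    }
    progress_depth--;
    return 0;
}

/* A well-behaved callback: does its work without calling progress(). */
static void good_callback(void)
{
}

/* An offender: calls progress() from inside a callback. */
static void bad_callback(void)
{
    (void)progress(NULL);
}
```

Built without PROGRESS_STRICT the guard only counts offenders, so it can
soak in normal builds; with it defined, the first offender trips the
assert.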
>> --
>> christian.bell_at_[hidden]
>> (QLogic Host Solutions Group, formerly Pathscale)
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
