Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [EXTERNAL] RFC: Enable thread support by default
From: Ralph Castain (rhc_at_[hidden])
Date: 2012-12-10 13:25:47

On Dec 10, 2012, at 10:15 AM, "Barrett, Brian W" <bwbarre_at_[hidden]> wrote:

> On 12/8/12 7:59 PM, "Ralph Castain" <rhc_at_[hidden]> wrote:
>> WHAT: Enable both OPAL and libevent thread support by default
>> WHY: We need to support threaded operations for MPI-3, and for
>> Enabling thread support by default is the only way to
>> ensure we fix all the problems.
>> WHEN: COB, Thurs Dec 13
>> This was a decision reached at the OMPI Developers meeting, so the RFC is
>> mostly just a "heads up" to everyone that this will happen. We spent some
>> time recently profiling the impact on performance and found it to be
>> significant: 100ns in shared memory latency, and a similar amount in TCP
>> message latency. However, without setting the support "on" by default, we
>> will never address those problems. Thus, the group decided that we would
>> enable support by default and begin a concerted effort to reduce and/or
>> remove the performance impact.
> Thinking about this on the way home Friday, I'm not sure we need to go
> quite that far. I think we do want to enable MPI_THREAD_MULTIPLE by
> default to cause all the locks to be "on" by default. I'm not sure we
> need to enable progress threads at this point. The question is whether we
> take a top-down approach, where we turn on the locks all the time for
> everything (expensive) and pare down to what actually needs locking for
> async BTL callbacks, or leave off all the locking by default (when thread
> count == 1) and only turn on always-lock locks for the code paths that
> will deal with async callbacks from the BTLs. I'm split on the issue.

I viewed this in a different light. The question of thread_multiple is a separate one. From my perspective, if we say we are going to support MPI-3's async progress, then I don't see how we avoid the OPAL thread support being "on" all the time.

Likewise, if the ORTE wireup methods have to support async behavior, then we have to build the event lib with thread support.

So it seems to me that the best path forward is to turn both "on" by default, then learn how to live with that situation.
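
To make the two locking strategies in this thread concrete, here is a minimal sketch in C using plain pthreads. Every name in it (`using_threads`, `cond_lock`, `always_lock`, `app_only_update`, `async_completion_callback`) is hypothetical and chosen for illustration; none of it is taken from the Open MPI source. It assumes a process-wide flag that is set once at init time, and it counts lock acquisitions only so the cost difference is visible.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; these are not Open MPI symbols. */

static bool using_threads = false;   /* assumed to be set once at init time */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

static long lock_acquisitions = 0;   /* instrumentation only */
static long app_counter = 0;         /* touched only by application threads */
static long completions = 0;         /* also touched by async callbacks */

static void set_using_threads(bool on) { using_threads = on; }

/* Conditional lock: a no-op while the process is single-threaded. */
static void cond_lock(pthread_mutex_t *m) {
    assert(m != NULL);
    if (using_threads) {
        pthread_mutex_lock(m);
        lock_acquisitions++;         /* updated while holding the lock */
    }
}

static void cond_unlock(pthread_mutex_t *m) {
    assert(m != NULL);
    if (using_threads) {
        pthread_mutex_unlock(m);
    }
}

/* Always-lock: taken regardless of the flag, for state that an
 * asynchronous callback may touch even in a "single-threaded" run. */
static void always_lock(pthread_mutex_t *m) {
    assert(m != NULL);
    pthread_mutex_lock(m);
    lock_acquisitions++;             /* updated while holding the lock */
}

static void always_unlock(pthread_mutex_t *m) {
    assert(m != NULL);
    pthread_mutex_unlock(m);
}

/* Path reached only from application threads: the conditional lock is
 * enough, so it costs nothing when using_threads is false. */
static void app_only_update(void) {
    cond_lock(&state_lock);
    app_counter++;
    cond_unlock(&state_lock);
}

/* Path that an async callback could enter: it must always lock. */
static void async_completion_callback(void) {
    always_lock(&state_lock);
    completions++;
    always_unlock(&state_lock);
}
```

With the flag off, `app_only_update()` takes no lock at all while `async_completion_callback()` still does, which is the bottom-up option. Setting the flag unconditionally at startup makes every path pay for the lock, which is the top-down option.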

> Brian
> --
> Brian W. Barrett
> Scalable System Software Group
> Sandia National Laboratories