Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Deprecate rankfile?
From: Eugene Loh (eugene.loh_at_[hidden])
Date: 2010-04-15 19:54:45

Jeff Squyres wrote:

>WHAT: Deprecate the "rankfile" component in the v1.5 series; remove it in the 1.7 series
>WHY: It's old, creaky, and difficult to maintain. It'll require maintenance someday soon, when we support hardware / hyperthreads in OMPI.
>WHERE: svn rm orte/mca/rmaps/rank_file
>WHEN: Deprecate in 1.5, remove in 1.7.
>TIMEOUT: Teleconf on Tue, 27 April 2010
>Now that we have nice paffinity binding options, can we deprecate rankfile in the 1.5 series and remove it in 1.7?
>I only ask because it's kind of a pain to maintain. It's not a huge deal, but Ralph and I were talking about another paffinity issue today and we both groaned at the prospect of extending rank file to support hyperthreads (and/or boards). Perhaps it should just go away...?
>Pro: less maintenance, especially since the original developers no longer maintain it
>Con: it's the only way to have completely custom affinity bindings (i.e., outside of our pre-constructed "bind to socket", "bind to core", etc. options).
Arguably, someone could always hack something together him/herself with
numactl, processor_binding(), scripts, etc.
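
To illustrate the do-it-yourself route, here is a minimal sketch of a wrapper that builds a numactl command line from a hand-written rank-to-CPU table. It is only an illustration under stated assumptions, not anything OMPI ships: the table CUSTOM_MAP and the function numactl_command are hypothetical names, the CPU ids are made up, and it assumes the launcher exports the node-local rank in OMPI_COMM_WORLD_LOCAL_RANK and that numactl is installed.

```python
import os

# Hypothetical hand-written table: node-local rank -> CPU id to bind to.
# The CPU ids here are invented for illustration; real ids depend on the node.
CUSTOM_MAP = {0: 0, 1: 2, 2: 1, 3: 3}


def numactl_command(local_rank, app_argv, cpu_map=CUSTOM_MAP):
    """Return the argv that runs app_argv under numactl, bound to one CPU."""
    cpu = cpu_map[local_rank]  # KeyError if the table has no entry for this rank
    return ["numactl", f"--physcpubind={cpu}"] + list(app_argv)


def local_rank_from_env(environ=os.environ):
    """Read the node-local rank; assumes the launcher sets this variable."""
    return int(environ.get("OMPI_COMM_WORLD_LOCAL_RANK", "0"))
```

A wrapper script would call numactl_command(local_rank_from_env(), sys.argv[1:]) and hand the result to os.execvp, and the wrapper itself would be what mpirun launches. The point is that all of the policy lives in one user-editable table.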

> any other MPI's offer completely custom binding options?
Yes, "they all do." Okay, I haven't actually checked exhaustively, but
from what I can tell every MPI (now that OMPI has joined their ranks)
supports specification of process-bind mappings both via policy (a la
"bysocket") and via customized list. Specifically, I've checked:

*) MVAPICH2 (which uses PLPA! God help them)
*) Scali/Platform
*) Intel MPI (the full-blown set of options is very extensive)

>I.e., do any real users care?
I think so. Users end up wanting fine controls on all kinds of stuff.
I think I first ran into users wanting to bind processes about 15 years
ago. Issues include spreading/gathering processes on multicore nodes,
squelching migration, avoiding or targeting particular processes,
dealing with asymmetries within a node, etc. It just seems to keep
coming up over and over and over again. There's been a history of
problems (like excessive migration, poor initial placement, etc.) and
clever developers fixing these problems, but the constant has been the
existence of some subset of users who just want a stable workaround to
carry them through the broken/fixed cycles of history.

Note that various MPI implementations (also OMP, more on that in a
second) have spent a lot of time on process binding. Intel has a very
rich set of options. I don't know if that's because of demanding users
or because of developers drinking too much Red Bull, but their options
are much richer than ours. For example, imagine a socket that has two caches,
with two cores per cache. Intel allows you to control placement on this
architecture via policy. We don't. So, what do our users do? Allowing
custom mappings remains the hammer that hammers every nail.
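
To make the two-caches-per-socket example concrete, here is a rough sketch of the kind of explicit list a user might generate when no built-in policy fits. It assumes a hypothetical core numbering in which cores sharing a cache have consecutive ids; real numbering varies by machine, and the function name spread_across_caches is invented for illustration.

```python
def spread_across_caches(n_ranks, sockets=1, caches_per_socket=2, cores_per_cache=2):
    """Return a list whose entry r is the core id for rank r.

    Ranks are placed one per cache first, and caches are only shared once
    every cache already holds a rank. Assumes (hypothetically) that core ids
    are consecutive within a cache and caches are consecutive within a socket.
    """
    order = []
    for core in range(cores_per_cache):
        for socket in range(sockets):
            for cache in range(caches_per_socket):
                cache_index = socket * caches_per_socket + cache
                order.append(cache_index * cores_per_cache + core)
    if n_ranks > len(order):
        raise ValueError("more ranks than cores")
    return order[:n_ranks]


# One socket, two caches, two cores per cache:
print(spread_across_caches(2))  # [0, 2]  (one rank per cache)
print(spread_across_caches(4))  # [0, 2, 1, 3]
```

The resulting list is exactly the sort of thing a customized mapping lets the user hand to the launcher, whatever the topology looks like.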

With regard to your two questions (what do others do? do users care?),
I believe (FWIW) that OpenMP implementations tend to support customized
mappings. E.g., Oracle Solaris Studio (formerly Sun Studio), GNU
libgomp, PGI OMP, and Intel OMP. I think the OMP ARB is considering a
standardized interface for binding threads, but I don't remember if it
was for policies, customized mappings, or both. Anyhow, custom mappings
go even beyond just MPI.

I'm sympathetic to the maintenance-cost issue. And the current
functionality is kind of awkward. (Even just the name "rank file" is a
puzzler.) And it's astounding that there is such extensive
diversification of interfaces out there among MPI and OMP
implementations (with only Intel making any attempt to have MPI and OMP
play well with each other). But there is no question that customized
binding mappings are commonplace.