Open MPI Development Mailing List Archives

From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2005-11-29 13:57:54


Jeff, et al.,

   My own "research" into processor affinity for the GASNet runtime
began by "borrowing" the related autoconf code from Open MPI. My
experience matches Jeff's when it comes to looking for a correlation
between the API and any system parameter such as the libc or kernel
version: not an exhaustive search on my part, but enough to see that
there is no simple mapping.
   While far from "ideal", one option might be to perform an
installation-time probe with a dumbed-down version of the autoconf
probes used at build time. The probe would then record the proper
processor-affinity setting in a config file, in an environment
variable set by the ISV's wrapper around mpirun, or in a similar
place. Open MPI could then disable processor affinity when no setting
is found, and use the one selected at install time when it is.
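
   For illustration, the probe might try to compile and run one tiny
test program per candidate API. Here is what the check for the
3-argument cpu_set_t variant could look like (a hypothetical sketch,
not the actual autoconf test):

   #define _GNU_SOURCE
   #include <sched.h>

   /* Probe for the glibc 2.3.4-style prototype:
    *   int sched_setaffinity(pid_t, size_t, const cpu_set_t *);
    * If this compiles, links, and exits 0 on the target system, the
    * installer records this variant in its config file.           */
   int main(void)
   {
       cpu_set_t mask;
       CPU_ZERO(&mask);
       CPU_SET(0, &mask);          /* bind to CPU 0 (pid 0 = self) */
       return sched_setaffinity(0, sizeof(cpu_set_t), &mask);
   }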

-Paul

Jeff Squyres wrote:
> Greetings all. I'm writing this to ask for help from the general
> development community. We've run into a problem with Linux processor
> affinity, and although I've individually talked to a lot of people
> about this, no one has been able to come up with a solution. So I
> thought I'd open this to a wider audience.
>
> This is a long-ish e-mail; bear with me.
>
> As you may or may not know, Open MPI includes support for processor and
> memory affinity. There are a number of benefits, but I'll skip that
> discussion for now. For more information, see the following:
>
> http://www.open-mpi.org/faq/?category=building#build-paffinity
> http://www.open-mpi.org/faq/?category=building#build-maffinity
> http://www.open-mpi.org/faq/?category=tuning#paffinity-defs
> http://www.open-mpi.org/faq/?category=tuning#maffinity-defs
> http://www.open-mpi.org/faq/?category=tuning#using-paffinity
>
> Here's the problem: there are three different APIs for processor
> affinity in Linux. I have not done exhaustive research on this, but
> which API you have seems to depend on your kernel version, glibc
> version, and/or Linux vendor (i.e., some vendors appear to port
> different versions of the API to their particular kernel/glibc). The
> issue is that all three versions of the API use the same function
> names (sched_setaffinity() and sched_getaffinity()) but change the
> number and types of the parameters to those functions.
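
To make the three variants concrete, they look roughly like the
prototypes below; the glibc version boundaries are my reconstruction
from the library's history and should be treated as approximate:

   /* Variant 1 -- early glibc 2.3.x: length in bytes plus a raw
    * unsigned long bitmask                                        */
   int sched_setaffinity(pid_t pid, unsigned int len,
                         unsigned long *mask);

   /* Variant 2 -- glibc 2.3.3: the length argument is dropped     */
   int sched_setaffinity(pid_t pid, cpu_set_t *mask);

   /* Variant 3 -- glibc 2.3.4 and later: length restored as a
    * size_t, mask typed as cpu_set_t                              */
   int sched_setaffinity(pid_t pid, size_t cpusetsize,
                         const cpu_set_t *mask);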
>
> This is not a big problem for source distributions of Open MPI -- our
> configure script figures out which one you have and uses preprocessor
> directives to select the Right stuff in our code base for your
> platform.
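
As a sketch of what that build-time selection amounts to (the HAVE_*
macro names here are hypothetical stand-ins, not the symbols that
configure actually defines):

   #define _GNU_SOURCE
   #include <sched.h>

   /* One variant is chosen when configure runs and is then baked
    * into the binary for good.                                    */
   static int bind_to_cpu0(void)
   {
   #if defined(HAVE_AFFINITY_3ARG_CPU_SET_T)
       cpu_set_t cpuset;
       CPU_ZERO(&cpuset);
       CPU_SET(0, &cpuset);
       return sched_setaffinity(0, sizeof(cpu_set_t), &cpuset);
   #elif defined(HAVE_AFFINITY_2ARG_CPU_SET_T)
       cpu_set_t cpuset;
       CPU_ZERO(&cpuset);
       CPU_SET(0, &cpuset);
       return sched_setaffinity(0, &cpuset);
   #elif defined(HAVE_AFFINITY_3ARG_ULONG)
       unsigned long mask = 1UL;   /* bit 0 == CPU 0 */
       return sched_setaffinity(0, sizeof(mask), &mask);
   #else
       return -1;                  /* no affinity support built in */
   #endif
   }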
>
> What *is* a big problem, however, is that ISVs therefore cannot ship
> a binary Open MPI installation and reasonably expect its processor
> affinity support to work across multiple Linux platforms. That is,
> if the ISV compiles for API #X and ships the binary to a system that
> has API #Y, there are two options:
>
> 1. Processor affinity is disabled. This means that the benefits of
> processor affinity won't be visible (not hugely important on 2-way
> SMPs, but increasingly important as the number of processors/cores
> grows), and Open MPI's NUMA-aware collectives can't be used (because
> memory affinity may not be useful without processor affinity
> guarantees).
>
> 2. Processor affinity is enabled, but the code invokes API #X on a
> system with API #Y. This has unpredictable results: the best case is
> that processor affinity is simply [effectively] ignored; the worst
> case is that the application fails (e.g., seg faults).
>
> Clearly, neither of these options is attractive.
>
> My question to the developer crowd out there -- can you think of a
> way around this? More specifically, is there a way to know -- at run
> time -- which API to use? We can do some compiler trickery to
> compile all three APIs into a single Open MPI installation and then
> dispatch at run time to the Right one, but this is contingent upon
> being able to determine which API to dispatch to. A bunch of us have
> poked around the system (e.g., in /proc and /sys) but have found
> nothing that indicates which API you have.
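
One half-formed thought on the run-time dispatch question: the
kernel-level system call has a single, stable signature -- (pid,
length in bytes, pointer to bitmask) -- regardless of which glibc
wrapper sits on top, so invoking it through syscall(2) sidesteps the
wrapper entirely. A minimal sketch, with the caveat that I have not
tried this on all the kernels in question:

   #define _GNU_SOURCE
   #include <stdio.h>
   #include <unistd.h>
   #include <sys/syscall.h>
   #include <sys/types.h>

   int main(void)
   {
       unsigned long mask = 0;

       /* Raw kernel interface: (pid, len in bytes, mask pointer);
        * pid 0 means the calling process.  May fail with EINVAL on
        * kernels whose CPU mask is wider than this buffer.        */
       long rc = syscall(SYS_sched_getaffinity, (pid_t) 0,
                         (unsigned int) sizeof(mask), &mask);
       if (rc < 0) {
           perror("sched_getaffinity syscall");
           return 1;
       }
       printf("affinity mask: 0x%lx\n", mask);
       return 0;
   }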
>
> Does anyone have any suggestions here?
>
> Many thanks for your time.
>

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900