
Subject: Re: [OMPI devel] processor affinity -- OpenMPI/batch system integration
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-01-11 09:38:49

carto is intended more as a discovery mechanism and provider of
topology information. How various parts of the OMPI code base use
that information is a separate issue.

With regards to processor affinity, there are two general ways of
doing it:

1. The resource manager tells us which processors have been allocated
to us, e.g., via some environment variables saying which
processors/cores/whatever have been allocated to us on a per-host
basis (set in the environment of the launched applications, and
therefore possibly different on every host). Then Open MPI decides
how to split up the allocated host processors amongst all the Open
MPI processes on that host.

It would be great if SGE could provide some environment variables to us.
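
To sketch what #1 could look like on the MPI side -- with the caveat
that the variable name and the round-robin policy below are invented
for illustration; neither SGE nor OMPI defines anything like this
today:

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical sketch: the RM exports a comma-separated list of
       processor IDs allocated on this host (the same value in every
       local process), and each local rank picks one round-robin. */
    static int pick_processor(int local_rank)
    {
        const char *list = getenv("SGE_ALLOCATED_PROCS"); /* e.g. "0,1,4,5" */
        int procs[256];
        int n = 0;

        if (NULL == list) {
            return -1;   /* no RM hint; fall back to existing behavior */
        }

        char *copy = strdup(list);
        for (char *tok = strtok(copy, ","); NULL != tok && n < 256;
             tok = strtok(NULL, ",")) {
            procs[n++] = atoi(tok);
        }
        free(copy);
        return (n > 0) ? procs[local_rank % n] : -1;
    }

The chosen ID would then be handed off to whatever paffinity
mechanism does the actual binding.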

2. The resource manager does all the processor affinity itself.
SLURM, for example, has a nice command line syntax for all kinds of
processor affinity options in its "srun" command. A traditional
roadblock to this has been that OMPI currently uses the resource
manager to launch a single "orted" process on each node, and that
orted, in turn, launches all the MPI processes locally -- so the
resource manager never sees the individual MPI processes in order to
pin them. However, there is work progressing to remove this
roadblock. If I try to describe it, I'm sure I'll get it wrong :-)
-- Ralph / IU?


Open MPI will need to be able to tell the difference between #1 and
#2. So it might be good if the RM always provides the environment
variables, but indicates in them whether the RM did the affinity
pinning or not. I.e., in #1, you'll get information about all the
processors that are available -- all the processes on a single host
will get the same information. In #2, each process will get
individualized information about where it has been pinned.
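
For illustration only -- both variable names below are made up, not
anything an RM actually exports today -- the check on our side could
be as simple as:

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical: classify what the resource manager told us about
       affinity.  The RM_AFFINITY_* names are invented. */
    enum rm_affinity { RM_NONE, RM_ALLOCATED_ONLY, RM_ALREADY_PINNED };

    static enum rm_affinity classify_rm_affinity(void)
    {
        const char *cpus = getenv("RM_AFFINITY_CPUS");
        const char *pinned = getenv("RM_AFFINITY_PINNED");

        if (NULL == cpus) {
            return RM_NONE;           /* no hints; existing defaults */
        }
        if (NULL != pinned && 0 == strcmp(pinned, "1")) {
            return RM_ALREADY_PINNED; /* case #2: per-process value */
        }
        return RM_ALLOCATED_ONLY;     /* case #1: same value host-wide */
    }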

Make sense?

On Jan 11, 2008, at 6:22 AM, Pak Lui wrote:

> Hi Rayson,
> I guess this is an issue only for SGE. I believe there is a
> framework called 'carto' being developed to represent the
> node-socket relationship in order to address the multicore issue. I
> think there are other folks on the team who are actively working on
> it, so they can probably address it better than I can. Here are
> some descriptions on the wiki for it:
> Rayson Ho wrote:
>> Hello,
>> I'm from the Sun Grid Engine (SGE) project (
>> ). I am working on processor affinity support for SGE.
>> In 2005, we had some discussions on the SGE mailing list with Jeff
>> on this topic. As quad-core processors are available from AMD and
>> Intel, and higher core counts per socket are coming soon, I would
>> like to see what we can do to come up with a simple interface for
>> the SGE 6.2 release, which will be available in Q2 this year (or
>> at least in an "update" release of SGE 6.2 if we can't get the
>> changes in on time).
>> The discussions we had before:
>> I looked at the SGE code; the simplest thing we can do is to set
>> an environment variable to tell the task group the processor mask
>> of the node before we start each task group. Is that good enough
>> for Open MPI?
>> After reading the Open MPI code, I believe what we need to do is
>> to add an else case in ompi/runtime/ompi_mpi_init.c:
>> if (ompi_mpi_paffinity_alone) {
>>     ...
>> } else {
>>     /* get processor affinity information from the batch system
>>        via the env var */
>>     ...
>> }
>> Thanks,
>> Rayson
> --
> - Pak Lui
> pak.lui_at_[hidden]

Jeff Squyres
Cisco Systems