PLPA Users' Mailing List Archives

Subject: Re: [PLPA users] [OMPI devel] OpenMPI, PLPA and Linux cpuset/cgroup support
From: Sylvain Jeaugey (sylvain.jeaugey_at_[hidden])
Date: 2009-07-24 08:38:06


On Fri, 24 Jul 2009, Jeff Squyres wrote:

> Is there any way for a process to tell the difference between "I can bind to
> completely different processors than I'm already bound to" (i.e., someone
> just happened to bind me to cores X, Y, and Z, but I'm free to bind to cores
> A, B, and C if I want to) and "I can only bind to cores within the set that
> I'm already bound to" (i.e., cpuset)?
It seems to me that no one just "happens" to bind a process to cores X, Y,
or Z. If there is an explicit binding, there must be some reason for it,
and getting "out" of this binding looks like doing something wrong to me.

If there is another mechanism that does binding, either disable it or take
it into account. In my [somewhat utopian] view, placement can't be done by
two entities at once (in general, it is either the launcher or the MPI
library).

> Admittedly, I have not looked at libcpuset yet -- is this something that
> libcpuset does? If so, we could get that kind of functionality by linking
> into libcpuset (if it's available).
I don't really like creating a dependency on libcpuset, because having
cpuset functionality and using it doesn't mean having libcpuset
installed. The majority of resource managers use the /dev/cpuset interface
directly and we are fine with it. In the past, we also had our own
libcpuset; it was a bit different from the one from SGI (which must be a
lot better than ours), but we never really released it because people (i.e.
RMS, SLURM, and, as it looks like, Torque also) were just fine with the
/dev/cpuset pseudo-filesystem.
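
To illustrate what using the pseudo-filesystem directly amounts to, here is a
minimal sketch that parses the list format found in a cpuset's "cpus" file
(for example "0-3,8"). The helper names parse_cpu_list and read_cpuset_cpus
are hypothetical, and the path in the usage note is an assumption: it depends
on where the cpuset filesystem is mounted and how the resource manager names
its cpusets. The less common stride syntax is not handled.

```python
def parse_cpu_list(text):
    """Parse a cpuset list-format string such as "0-3,8" into a set of ints.

    Hypothetical helper for illustration; stride forms are not supported.
    """
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty file or a trailing comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_cpuset_cpus(path):
    """Read a cpuset 'cpus' file and return the CPUs it lists.

    The caller supplies the path; nothing here assumes a mount point.
    """
    with open(path) as f:
        return parse_cpu_list(f.read())
```

With an assumed layout, a call might look like
read_cpuset_cpus("/dev/cpuset/myjob/cpus"), where "myjob" is a made-up cpuset
name.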

Since cgroups seem to work differently (I've not worked on cpusets for
quite a long time now, so I'm discovering things a bit here), relying on
/dev/cpuset also seems broken. Which leaves us with just getting the
current affinity - a simple and universal rule.
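
The rule above can be sketched as follows: query the current affinity mask
and only ever bind to a CPU inside it. This is an illustration in Python
using os.sched_getaffinity and os.sched_setaffinity (Linux only), not what
PLPA or Open MPI actually do. The names pick_cpu and
bind_within_current_affinity, the local_rank parameter, and the round-robin
choice are all assumptions made for the example.

```python
import os


def pick_cpu(allowed, local_rank):
    """Choose one CPU from the allowed set, round-robin by local rank.

    The round-robin policy is an assumption; any choice that stays
    inside `allowed` respects the rule.
    """
    ordered = sorted(allowed)
    return ordered[local_rank % len(ordered)]


def bind_within_current_affinity(local_rank):
    """Bind the calling process to one CPU from its current affinity mask."""
    allowed = os.sched_getaffinity(0)    # 0 means the calling process
    target = pick_cpu(allowed, local_rank)
    os.sched_setaffinity(0, {target})    # never leaves the original set
    return target
```

Because the target is always drawn from the mask the process already has, it
does not matter whether that mask came from a cpuset, a cgroup, or an
explicit binding by the launcher.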

> The reason that I suggest this is that (I *assume* that) rather than have
> callers have to link to both PLPA and libcpuset manually and attempt to merge
> the data between the two different data structures, there *may* be some value
> (i.e., simplicity) in having a single interface and set of data structures
> that can address both issues. Thoughts?
Done :-)

Sylvain