Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] Some practical hwloc API feedback
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2011-09-22 17:05:07

On 22/09/2011 22:42, Ralph Castain wrote:
> I guess I didn't get that from your documentation. Since caches sit
> between socket and core, they appear to affect the depth of the core
> in a given socket. Thus, if there are different numbers of caches in
> the different sockets on a node, then the core/pu level would change
> across the sockets.

No, a level always contains all elements of the same type (+depth for
caches), even if they are not at the same "distance" to the root (not

Let's say you have two single-core sockets. One with no cache. One with
an L1 cache.
What happens is:
* level 1 is socket, contains two sockets, covers all cores.
* level 2 is the L1 cache, single element, *does not cover all cores*
* level 3 is core, two elements.

The funky thing here is that the parent/child link between the first
socket and its core skips over level 2 because nothing matches there. In
the first socket, you have Socket(depth1)->Core(depth3) while in the
second socket you have Socket(depth1)->Cache(depth2)->Core(depth3).

So what we call "depth" in hwloc is not the number of parent/child
links between you and the root. It's really the number of levels between
you and the root, even if you don't have any parent in some of these levels.
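The difference between depth and link count can be sketched with a toy model of the two-socket example. This is not hwloc code: the `Obj` class, `links_to_root` and `levels` are hypothetical helpers made up for illustration, and the root "Machine" object at depth 0 is an assumption added so that the sockets land at depth 1 as in the example.

```python
class Obj:
    """Toy stand-in for a topology object (hypothetical, not the hwloc struct)."""

    def __init__(self, kind, depth, parent=None):
        self.kind = kind
        self.depth = depth          # index of the level this object belongs to
        self.parent = parent
        self.children = []

    def add(self, kind, depth):
        """Attach a child object at the given level and return it."""
        child = Obj(kind, depth, parent=self)
        self.children.append(child)
        return child


def links_to_root(obj):
    """Count the parent/child links between obj and the root."""
    n = 0
    while obj.parent is not None:
        obj = obj.parent
        n += 1
    return n


def levels(root):
    """Group every object in the tree by its depth, i.e. by level."""
    by_depth = {}
    stack = [root]
    while stack:
        obj = stack.pop()
        by_depth.setdefault(obj.depth, []).append(obj)
        stack.extend(obj.children)
    return by_depth


# Two single-core sockets: the first has no cache, the second has one.
machine = Obj("Machine", 0)
socket0 = machine.add("Socket", 1)
core0 = socket0.add("Core", 3)      # this link skips level 2
socket1 = machine.add("Socket", 1)
cache1 = socket1.add("Cache", 2)
core1 = cache1.add("Core", 3)

topo_levels = levels(machine)
for depth in sorted(topo_levels):
    print(depth, [o.kind for o in topo_levels[depth]])

for core in (core0, core1):
    print(core.kind, "depth:", core.depth, "links to root:", links_to_root(core))
```

In this model both cores sit at depth 3, but the first core is two links away from the root and the second is three. The cache level holds a single object, so walking that level alone does not reach every core.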

Looks like we need to clarify this :)

>> People would walk the list of PUs, Cores, Sockets, NUMA nodes.
>> But when talking about Caches, I would rather see them ask "which cache
>> do I have above these cores?".
> But that isn't exactly how people use that info. Instead, they ask us to "map N processes on each L2 cache across the node", or to "bind all procs to their local L3 cache".

It seems to me that people asking for this already know a lot about the
topology. Random users don't know if there are L2 or L3 caches, if they
should bind to one or the other, ...

So these advanced users should be able to say "I know there's one L3 per
socket, so bind to the local socket" instead of "bind to the local L3".

Or say "I know there are 5 cores per L2, so map N processes per set of
5 cores" instead of "map N procs on each L2". But yeah, we're back to the
possibly-non-uniform hierarchy problem then. I see the mess.
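To make the non-uniform case concrete, here is a minimal sketch comparing "map N procs on each L2" with "map N procs per fixed set of cores". Everything in it is hypothetical: the node layout (one L2 shared by 2 cores, another by 4), the function names, and the round-robin placement are assumptions for illustration, not how any launcher actually maps processes.

```python
def map_per_cache(cache_to_cores, n_per_cache):
    """Place n_per_cache ranks on each cache, round-robin over its cores."""
    placement = []
    rank = 0
    for cache, cores in cache_to_cores.items():
        for i in range(n_per_cache):
            placement.append((rank, cores[i % len(cores)]))
            rank += 1
    return placement


def map_per_core_chunk(all_cores, chunk, n_per_chunk):
    """Place n_per_chunk ranks on each run of `chunk` consecutive cores."""
    placement = []
    rank = 0
    for start in range(0, len(all_cores), chunk):
        group = all_cores[start:start + chunk]
        for i in range(n_per_chunk):
            placement.append((rank, group[i % len(group)]))
            rank += 1
    return placement


def ranks_per_cache(placement, cache_to_cores):
    """Count how many ranks ended up under each cache."""
    core_to_cache = {c: cache for cache, cores in cache_to_cores.items() for c in cores}
    counts = {cache: 0 for cache in cache_to_cores}
    for _rank, core in placement:
        counts[core_to_cache[core]] += 1
    return counts


# Hypothetical non-uniform node: L2#0 covers 2 cores, L2#1 covers 4.
cache_to_cores = {"L2#0": [0, 1], "L2#1": [2, 3, 4, 5]}
all_cores = [0, 1, 2, 3, 4, 5]

by_cache = map_per_cache(cache_to_cores, 2)
by_chunk = map_per_core_chunk(all_cores, chunk=2, n_per_chunk=2)

print("per cache:", ranks_per_cache(by_cache, cache_to_cores))
print("per chunk:", ranks_per_cache(by_chunk, cache_to_cores))
```

With this layout the per-cache mapping puts 2 ranks under each L2, while the "2 ranks per 2 cores" rule puts 2 ranks under the first L2 and 4 under the second. The two requests only coincide when every cache covers the same number of cores.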

> When dealing with large scale systems, it is much faster and easier to check these things -before- launching the job. Remember, on these systems, it can take minutes to launch a full-scale job! Nobody wants to sit there for that much time, only to find that the system doesn't support the requested operation.

Ok. Adding binding/support info to topology attributes should be easy