Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] hwloc-bind syntax
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2009-12-03 12:26:39


Jeff Squyres wrote:
> I was trying to use hwloc-bind this morning, and I was a bit confused by the syntax. I see that the help message says:
>
> -----
> Usage: topobind [options] <location> -- command ...
> <location> may be a space-separated list of cpusets or objects
> as supported by the hwloc-mask utility.
> -----
>
> (shouldn't that say hwloc-bind, not topobind?)
>

Right :)

> I assume the <string> here in hwloc-mask is the same as the <location> in hwloc-bind.
>

Yes.

> 1. Is the index syntax "X,Y[,Z[...]]" supported? I don't see it on the list, but was curious if it is supported anyway. E.g., "proc:0,1,4".

No, I don't think it's supported right now.

> That would seem useful (slightly shorter than "proc:0.proc:1.proc:4"). I can file a feature request if it's not already supported.
>

Actually, it would be proc:0 proc:1 proc:4 (space-separated).
hwloc-bind/mask do a logical/cpuset OR of all objects/masks given on the
command-line.
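
To make the OR concrete, here is a minimal Python sketch (not hwloc code). The
function name `combine_locations` is made up for illustration, and the example
masks assume that proc:N maps to bit N, which only holds on a machine where
logical and OS numbering happen to coincide.

```python
def combine_locations(masks):
    """OR together the cpuset masks of every location on the command line."""
    result = 0
    for mask in masks:
        result |= mask
    return result


# Hypothetical masks for "proc:0 proc:1 proc:4" (assumes proc:N is bit N).
proc_masks = [1 << 0, 1 << 1, 1 << 4]
combined = combine_locations(proc_masks)
print(hex(combined))  # 0x13
```

The order of the locations does not matter, since OR is commutative.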

> 2. What does it mean to "hwloc-bind core:0 ..."? (I asked Samuel this in IM as well, but I didn't understand his answer). *Which* "core 0" does that refer to? For example, an abbreviated version of my lstopo output is as follows (it's a pre-production EX machine -- I can't share all the details -- I 'x'ed out some of the numerical values):
>
> -----
> System(xxxGB)
> Node#0(xxxGB) + Socket#0 + L3(xxxMB)
> L2(xxxKB) + L1(xxxKB) + Core#0 + P#0
> ...
> Node#1(xxxGB) + Socket#2 + L3(xxxMB)
> L2(xxxKB) + L1(xxxKB) + Core#0 + P#1
> ...
> -----
>
> The processors have unique numbers, but the cores do not. Is that a bug?
>

These are physical/OS indexes, not logical indexes.

hwloc-bind/mask take logical indexes, so it has nothing to do with the
above #N. core:1 means "the second Core object" when you read the above
output from top to bottom.
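
A small Python sketch of that lookup (not hwloc code). It assumes a
hypothetical machine containing only the two cores visible in the abbreviated
output; on the real machine the elided lines hold more cores, so core:1 would
land somewhere else. The names below are made up for illustration.

```python
# Cores in the order they appear in the lstopo output, top to bottom.
# Each entry: (OS index of the core, OS index of its processor).
# Assumption: only the two cores shown in the abbreviated output exist.
cores_in_topology_order = [
    (0, 0),  # Node#0 + Socket#0 ... Core#0 + P#0
    (0, 1),  # Node#1 + Socket#2 ... Core#0 + P#1
]


def core_by_logical_index(logical_index):
    """Return the core at the given position in topology order."""
    return cores_in_topology_order[logical_index]


os_core_index, os_proc_index = core_by_logical_index(1)
print(os_core_index, os_proc_index)  # 0 1
```

Both cores carry the OS index 0, yet the logical indexes 0 and 1 tell them
apart.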

> 3. What is the difference between "system" and "machine"?
>

Machine is a physical machine. System may be different in the case of a
Single System Image like Kerrighed, vSMP, ... (only Kerrighed is
supported so far).

> 4. What exactly does "index" refer to -- is it a virtual index (e.g., hwloc's numbering of 0-N) or is it the OS's index? I thought we used OS index numbering, but #2 confuses me -- if #2 is just a bug, then perhaps this question is moot. :-)
>

We use virtual/logical indexes everywhere, except in the lstopo output
and in the functions that contain os_index in their prototype, which
use OS indexes.

> 5. What exactly is a "cpuset string"? Can some examples be provided?
>

It's 0 for nothing, ffffffff for 32 procs, 1,,,,,,,,1 for the first
and the 257th processors. It's a comma-separated list of 32-bit bitmasks.
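
The three examples can be decoded with a short Python sketch (not hwloc's
parser). The function name is hypothetical, and the word order is an
assumption: fields are read most significant first, as in Linux cpumask
strings. The examples above give the same result under either order.

```python
def parse_cpuset_string(text):
    """Return the set of bit numbers set in a cpuset string.

    The string is a comma-separated list of 32-bit hexadecimal words.
    Assumption: the leftmost word is the most significant one.
    """
    value = 0
    for word in text.split(","):
        # An empty field stands for a word that is entirely zero.
        value = (value << 32) | (int(word, 16) if word else 0)
    return {bit for bit in range(value.bit_length()) if (value >> bit) & 1}


print(sorted(parse_cpuset_string("0")))           # []
print(len(parse_cpuset_string("ffffffff")))       # 32
print(sorted(parse_cpuset_string("1,,,,,,,,1")))  # [0, 256]
```

Bits 0 and 256 are the first and the 257th processors.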

> --> Sidenote: I actually find hwloc's use of the word "cpuset" to be quite confusing because it is *NOT* the same as an OS cpuset.

The structure might be a bit different, but it is conceptually the same
as the OS cpuset. When bit N is set in a hwloc cpuset, it means we are
talking about the processor whose *OS-index* is N.

> 6. "several <depth:index> may be concatenated with `.'..." Does that mean that this is legal:
>
> core:0.node:2.system:4
>
> If so, what exactly does it mean when they overlap? Is it simply the union of those 3 specifications?

It means the 5th logical system below the 3rd logical node below the
first core. So it means nothing when there are no node objects below
cores or no systems below nodes.

> Also, I'm curious -- why was a period chosen as the delimiter instead of a comma? Is this a Europe-vs-US thing? (i.e., in the US, we typically use commas for lists -- is it different in Europe?)
>

We use commas for lists in Europe too. But the above is not a list, it's
an inclusion. See it as core[0].node[2].system[4] in C language.
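
The inclusion reading can be sketched in Python (not hwloc code). The `Obj`
class, the helper names and the toy topology are all hypothetical; the sketch
only illustrates "the Nth object of a type below the previous match".

```python
class Obj:
    """A hypothetical topology object: a type name plus child objects."""

    def __init__(self, kind, children=()):
        self.kind = kind
        self.children = list(children)


def descendants_of_kind(obj, kind):
    """Objects of the given kind strictly below obj, in top-to-bottom order."""
    found = []
    for child in obj.children:
        if child.kind == kind:
            found.append(child)
        found.extend(descendants_of_kind(child, kind))
    return found


def resolve(root, spec):
    """Resolve e.g. 'node:1.core:0'; return None when nothing matches."""
    current = root
    for step in spec.split("."):
        kind, index = step.split(":")
        matches = descendants_of_kind(current, kind)
        if int(index) >= len(matches):
            return None
        current = matches[int(index)]
    return current


# Toy topology: one system, two nodes, cores below the nodes.
lone_core = Obj("core")
root = Obj("system", [
    Obj("node", [Obj("core"), Obj("core")]),
    Obj("node", [lone_core]),
])

print(resolve(root, "node:1.core:0") is lone_core)  # True
print(resolve(root, "core:0.node:2"))               # None
```

The second lookup returns nothing because no node sits below a core in this
toy topology.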

Brice