Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] hwloc-bind syntax
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2009-12-03 16:55:02


Jeff Squyres wrote:
> I haven't looked at the argv parsing -- does it just strcmp each of the argv's and look for a recognized prefix, and if so, assume that it is a specification? If it doesn't find a recognized prefix, it assumes that it's the first argv of the tokens to exec (and therefore stop examining argv)? FWIW, this is pretty much what mpirun does.
>
> Is "--" recognized, too?
>

Yes, -- is recognized. And a couple of days ago I changed the code so that
the first non-recognized argument is treated as the beginning of the
exec command line.
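
A minimal sketch of that splitting rule, for illustration only. This is not the actual hwloc-bind code: the function names and the "name:index" test are assumptions, and hex masks and options are left out.

```python
# Scope names as listed by hwloc-mask in this thread ("proc[essor]").
SCOPE_NAMES = ("system", "machine", "node", "socket", "core", "proc", "processor")

def looks_like_location(arg):
    """Hypothetical test: the argument has the form <scope name>:<something>."""
    name, sep, _ = arg.partition(":")
    return bool(sep) and name in SCOPE_NAMES

def split_command_line(argv):
    """Split argv into (location arguments, command to exec)."""
    locations = []
    for i, arg in enumerate(argv):
        if arg == "--":
            # Explicit separator: everything after it is the command.
            return locations, argv[i + 1:]
        if not looks_like_location(arg):
            # First non-recognized argument starts the command.
            return locations, argv[i:]
        locations.append(arg)
    return locations, []
```

With this rule, "--" is only needed when the command itself would otherwise be mistaken for a location.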

>>> 3. What is the difference between "system" and "machine"?
>>>
>> Machine is a physical machine. System may be different in the case of a
>> Single System Image like Kerrighed, vSMP, ... (only Kerrighed is
>> supported so far).
>>
>
> Do we have good descriptions for each of the scope names that can be put in the docs?

They should be in the hwloc_obj_type enum in hwloc.h.

> hwloc-mask shows the following names:
>
> system, machine, node, socket, core, proc[essor]
>
> Has anyone contacted Penguin and/or XHPC (and/or any other SSI projects) to see if they care about being supported by hwloc?
>

Your friend Joshua from Penguin is supposed to get back to me soon, and
we're supposed to talk about hwloc (and OMX).

I don't think we've had any contact with other SSI projects.

>> We use virtual/logical/OS index everywhere, except in the lstopo output
>> and in the functions that contain os_index in their prototype.
>>
>
> Hmm - I can't parse that. You seem to be equating logical == virtual == OS indexing in that statement, but you distinctly called OS and logical indexing different in text higher up in this reply...
>

Ah sorry, "OS" wasn't supposed to appear in the above. logical ==
virtual != OS.

> Regardless, I find this confusing -- I'm quite sure that newbies will also find it confusing. All of hwloc should default to one form of indexing (regardless of whether it's physical/OS or some form of logical/hwloc-imposed indexing) -- and/or be explicit about which kind of indexing is used in every case.
>
> To be clear: it's strange to me that you can't use the numbers in the output from lstopo as arguments to hwloc-bind. I think that this will be quite a common / useful usage pattern: look up your machine's topology with lstopo and then hwloc-bind a command to something that you see in the lstopo output.
>
> At a minimum, I would think that all the CLI commands should default to the same kind of indexing to prevent confusion.
>
> Perhaps hwloc CLI tools should be able to show/accept *both* kinds of indexing...? E.g.:
>
> lstopo --physical
> lstopo --logical
>

Agreed.

> hwloc-bind --physical ...
> hwloc-bind --logical ...
>

Maybe too, yeah.

> Ah, ok. To be clear, is it accurate to say that it is one of the following forms:
>
> - a hex number (without leading "0x" -- would "0x" be ignored if it is supplied?)
>

We never used 0x there.

> - a comma-delimited set of 32bit bitmasks where MSB 0's do not have to be listed
>

Leading zeros are not needed, either for the whole cpuset or inside
each 32-bit bitmask. And if a bitmask is empty, it's not needed either,
except if it's the last one.
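
To make those rules concrete, here is a rough parsing sketch. It is not hwloc's own parser. It assumes the most significant 32-bit bitmask comes first in the string, and parse_cpuset is a made-up name.

```python
import string

def parse_cpuset(text):
    """Return the set of CPU indexes named by a comma-separated hex string."""
    chunks = text.split(",")
    if chunks[-1] == "":
        raise ValueError("the last bitmask must be written out, even if it is 0")
    value = 0
    for chunk in chunks:
        # Each chunk is at most 8 hex digits (32 bits), with no "0x" prefix.
        if len(chunk) > 8 or any(c not in string.hexdigits for c in chunk):
            raise ValueError(f"not a 32-bit hex bitmask: {chunk!r}")
        # An empty chunk stands for an all-zero bitmask.
        word = int(chunk, 16) if chunk else 0
        # Assumption: chunks are listed most significant first.
        value = (value << 32) | word
    return {bit for bit in range(value.bit_length()) if (value >> bit) & 1}
```

Under these assumptions "f" and "0000000f" both mean CPUs 0-3, "1,0" means CPU 32, and "1,,3" means CPUs 0, 1 and 64.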

> I guess what I find confusing is that Linux's concept of a "cpuset" is a binding term (e.g., it's the set of cpu's assigned to a process and you can't break out of that set). The hwloc docs glossary says:
>
> ----
> CPU set The set of logical processors logically included in an object (if any). This term does *not* have any relation to an operating system “CPU set.”
> -----
>
> So we're specifically stating in the docs that they're different. And it seems like they *are* different -- yes, they're both "sets of CPUs", but at least the Linux definition of "cpuset" has additional connotations / meaning (I don't know if other OS's define the term "cpuset").
>

We might want to drop the Linux "cpuset" word and use "cgroup" instead.
Both are supported by Linux, but the latter now contains the former and
more, so people are supposed to use cgroup now. hwloc supports both.

> Ahh... now I see. So it's meant to be a logical delve into the topology -- the leftmost item is meant to be the highest item in the topology, and each "." item must be a child of the item to its left.
>

Yes, or a grandchild, or a great-grandchild, ...

> Does it always need to start with system?

It doesn't matter whether you start with system or something else. You
can skip the system level, just as you can skip the socket level between
nodes and cores.

If you have 1 system with 2 nodes, each with 2 sockets of 2 cores each,
you get:
node:1 core:2 is equivalent to system:0 node:1 socket:1 core:0 and
equivalent to system:0 core:6
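
The index arithmetic for that machine can be sketched as follows. This is only an illustration for a perfectly regular topology, not how hwloc walks its tree; LEVELS, ARITY and resolve are hypothetical names.

```python
# Levels from top to bottom, and how many of each sit under one parent.
LEVELS = ["system", "node", "socket", "core"]
ARITY = {"system": 1, "node": 2, "socket": 2, "core": 2}

def resolve(spec):
    """Turn e.g. "node:1 core:2" into the logical index of the last object."""
    depth = -1   # depth of the object selected so far (-1: nothing yet)
    index = 0    # its logical index among all objects at that depth
    for item in spec.split():
        level, _, number = item.partition(":")
        new_depth = LEVELS.index(level)
        if new_depth <= depth:
            raise ValueError(f"{level} is not below the previous item")
        # Objects of this level under the current object, counting the
        # levels that were skipped in between.
        per_parent = 1
        for name in LEVELS[depth + 1:new_depth + 1]:
            per_parent *= ARITY[name]
        relative = int(number)
        if not 0 <= relative < per_parent:
            raise ValueError(f"{item} is out of range")
        index = index * per_parent + relative
        depth = new_depth
    return index

print(resolve("node:1 core:2"))                    # 1*4 + 2 = 6
print(resolve("system:0 node:1 socket:1 core:0"))  # ((0*2 + 1)*2 + 1)*2 + 0 = 6
print(resolve("system:0 core:6"))                  # 0*8 + 6 = 6
```

Skipping a level only changes how many objects are counted per parent, which is why all three forms land on the same core.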

But you cannot be that flexible with OS/physical indexes since multiple
cores/sockets may have the same index.
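
A toy illustration of that ambiguity, with made-up numbers: assume a machine with 2 sockets of 2 cores each, where the OS restarts core numbering at 0 inside each socket.

```python
# Hypothetical data. Each tuple is (logical core index, socket OS index,
# core OS index).
CORES = [
    (0, 0, 0),
    (1, 0, 1),
    (2, 1, 0),
    (3, 1, 1),
]

def by_logical(index):
    """Cores whose logical index matches: always at most one."""
    return [core for core in CORES if core[0] == index]

def by_os_core(index):
    """Cores whose OS core index matches: possibly one per socket."""
    return [core for core in CORES if core[2] == index]

print(by_logical(1))   # one core
print(by_os_core(1))   # two cores, one in each socket
```

With OS indexes you would have to name the socket as well to pick a single core, so levels cannot be skipped freely.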

Brice