Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] hwloc-bind syntax
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-12-04 08:44:10


On Dec 4, 2009, at 5:36 AM, Brice Goglin wrote:

> > It might be good to safely ignore 0x if it's present, but that's a small feature enhancement that can be done at any time (I filed a future ticket).
>
> It seems to work actually :)

Hmm -- I don't think so...? "0x1" can't pass this test in hwloc_mask_process_arg():

  } else if (strlen(arg) == strspn(arg, "0123456789abcdefABCDEF,")) {
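
To spell out why that check rejects "0x1": strspn() returns the length of the leading run of characters drawn from the accepted set, and 'x' is not in that set, so the span stops at 1 while strlen() returns 3. Here is a minimal standalone sketch of the check, plus one possible way to tolerate a leading "0x". The helper names looks_like_hex_mask() and skip_0x_prefix() are hypothetical and are not hwloc functions; the sketch only handles a single prefix at the start of the argument.

```c
#include <assert.h>
#include <string.h>

/* Same character set as the check quoted above. */
static const char HEX_MASK_CHARS[] = "0123456789abcdefABCDEF,";

/* Hypothetical helper: returns 1 if every character of arg is in the set. */
static int looks_like_hex_mask(const char *arg)
{
    assert(arg != NULL);
    return strlen(arg) == strspn(arg, HEX_MASK_CHARS);
}

/* Hypothetical helper: step over one leading "0x" or "0X", if present. */
static const char *skip_0x_prefix(const char *arg)
{
    assert(arg != NULL);
    if (arg[0] == '0' && (arg[1] == 'x' || arg[1] == 'X'))
        return arg + 2;
    return arg;
}
```

With these, looks_like_hex_mask("0x1") is 0, while looks_like_hex_mask(skip_0x_prefix("0x1")) is 1.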

In my tests, it's falling through to the "err = -1" case, but just not printing out an error. Even more fun -- note the lack of error shown, and the lack of "ls" output, except for when we specify -v:

----
[8:33] rtp-jsquyres-8711:~/svn/hwloc % ./utils/hwloc-bind 0x1 ls
[8:33] rtp-jsquyres-8711:~/svn/hwloc % ./utils/hwloc-bind -v 0x1 ls
assuming the command starts at 0x1
execvp: No such file or directory
----

I think that if execvp() fails, we should *always* print an error, not just when -v is specified.  I'll fix.

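As a rough illustration of the behavior I have in mind (not the actual hwloc-bind code; exec_or_report() is a hypothetical name), the point is that execvp() only returns when it has failed, so the error report does not need to be conditional on verbosity:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical wrapper: exec argv[0]; if that fails, always say why. */
static int exec_or_report(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    execvp(argv[0], argv);

    /* execvp() only returns on failure, so reaching this line is an error. */
    int saved_errno = errno;
    fprintf(stderr, "execvp(%s) failed: %s\n", argv[0], strerror(saved_errno));
    return saved_errno;
}
```

A caller would typically pass the returned errno value (or a fixed non-zero code) to exit(), so that a bad command line is visible both on stderr and in the exit status.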
> > Linux is likely to be among the most popular target for hwloc -- so can you explain in good words definitions for the following:

[snipped]

Thanks.

> > Additionally -- the word "father" is used in the docs.  Should we use the gender-neutral "parent" instead?
> 
> I am not sure. The object structure contains a father pointer. We use
> parent in the API, but it might refer to different things, like father,
> grandfather, ...

FWIW, the English word "parent" definitely refers to the immediate ancestor.  It does *not* refer to grandparents or great-grandparents, etc.

> > What I meant by my question was -- aren't the 3 diagrams above equivalent to "core:6"? If so, what's the value of the foo.bar.baz notation?
> 
> If you have a 96 core machine like we do, the hierarchical notation
> (foo.bar.baz) is really nice. If I want to bind on
> node:2.socket:3.core:4, it's much easier than looking at the topology
> and finding that it's core:70.

Ah, ok.  Fair enough.

> Using physical or logical indexes doesn't
> change anything here. I agree that we don't do that often in real
> applications, but I actually use that quite a lot for my own debugging :)

Another good reason.  :-)

> I actually don't see why people would like to use physical numbers in
> such a hierarchical notation since physical socket/core numbers are
> often strange/illogical and nobody remembers them. However, I agree that
> the physical indexes are useful when *not* using a hierarchical
> notation, ie I want to bind on thread OS index #46.

As a server vendor, using physical/OS indexes is actually quite useful to me (e.g., to ensure that the hardware and OS are playing nicely).

My point is that everyone has a different view here -- we should just support both.  IMHO, the common case is logical indexes -- so let's make those the default.  But there are definitely cases where physical indexes are useful as well.

-- 
Jeff Squyres
jsquyres_at_[hidden]