Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] specifying I/O devices on the command-line
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-04-12 09:14:00


On Apr 12, 2011, at 8:10 AM, Brice Goglin wrote:

> I am looking for a good way to specify PCI and OS devices on the
> command-line (for hwloc-calc and hwloc-bind).
>
> The trunk currently supports:
> * os:foobar for the OS device named foobar (eth0, mlx4_0, ...)
> * pci:0000:00:00.0 or pci:00:00.0 for a given PCI device
> * pci:aaaa:bbbb:c for the c-th PCI device with vendor ID aaaa and device
> ID bbbb
>
> The idea is basically to make it easy to bind processes near some
> high-performance devices:
> hwloc-bind os:mlx4_0 <mympibenchmark>
> hwloc-bind pci:nvidia:tesla:0 <mycudabenchmark>

Nifty.

Can you list multiple devices? E.g.:

  hwloc-bind os:mlx4_0 os:mlx4_1 my_mpi_benchmark
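
If several device locations can appear on one command line, each one would have to be classified on its own. Here is a minimal sketch of how the three forms quoted above could be told apart. It is not hwloc's actual parser: the function name `classify` and the returned tuples are made up for illustration, and it only handles hexadecimal IDs, not names such as `pci:nvidia:tesla:0`.

```python
import re

# [domain:]bus:dev.func, e.g. 0000:00:00.0 or 00:00.0 (hexadecimal fields)
BUSID_RE = re.compile(
    r"^(?:(?P<domain>[0-9a-fA-F]{4}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):(?P<dev>[0-9a-fA-F]{2})\.(?P<func>[0-7])$"
)
# vendor:device:index, e.g. aaaa:bbbb:c
VENDOR_RE = re.compile(
    r"^(?P<vendor>[0-9a-fA-F]{4}):(?P<device>[0-9a-fA-F]{4}):(?P<index>\d+)$"
)


def classify(spec):
    """Classify one location string (hypothetical helper, not an hwloc API)."""
    kind, sep, rest = spec.partition(":")
    if not sep or not rest:
        raise ValueError("not an I/O device location: %r" % spec)
    if kind == "os":
        # OS device by name, e.g. os:eth0
        return ("os", rest)
    if kind == "pci":
        m = BUSID_RE.match(rest)
        if m:
            # A missing domain is treated as domain 0 here (assumption).
            return ("pci-busid", (int(m["domain"] or "0", 16),
                                  int(m["bus"], 16),
                                  int(m["dev"], 16),
                                  int(m["func"])))
        m = VENDOR_RE.match(rest)
        if m:
            return ("pci-id", (int(m["vendor"], 16),
                               int(m["device"], 16),
                               int(m["index"])))
    raise ValueError("unrecognized location: %r" % spec)
```

The bus ID form is tried first because the `.` before the function number makes it unambiguous; a vendor:device:index string never contains one.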

Also, is there a CLI way to retrieve which NUMA nodes / OS processors are near such devices? I can imagine wanting to script up something like:

- retrieve a mask / list of processors near OS device <foo>
- bind N processes, one per processor, to the processors near that device
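
To make the second step concrete, here is a rough sketch of what such a script might do once it has a mask. It assumes the first step has already produced a single hexadecimal processor mask such as 0x0000000f; how to obtain that mask for a device is exactly the open question. The helper names are invented, and the sketch only builds the command lines, it does not run them.

```python
def processors_in_mask(mask):
    """Return the indexes of the bits set in an integer mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def per_process_bindings(mask_text, command, nprocs):
    """Build one 'hwloc-bind <single-processor mask> -- <command>' per process.

    mask_text is assumed to be one hex mask (e.g. "0x0000000f") covering the
    processors near the device; command is a list of argument strings.
    """
    cpus = processors_in_mask(int(mask_text, 16))
    if nprocs > len(cpus):
        raise ValueError("only %d processors in mask, %d requested"
                         % (len(cpus), nprocs))
    return [["hwloc-bind", "0x%08x" % (1 << cpu), "--"] + list(command)
            for cpu in cpus[:nprocs]]


if __name__ == "__main__":
    for argv in per_process_bindings("0x0000000f", ["./my_mpi_benchmark"], 2):
        print(" ".join(argv))
```

Each returned list could be handed to something like subprocess.Popen to launch the bound processes.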

> Ideally, the os:foobar notation would be enough. But as long as we don't
> have any OS name associated with (proprietary) GPUs, people will have to
> identify GPUs by their PCI ids.
>
> Other ideas that we may want to support:
> * PCI devices by name: something like the 2nd PCI device whose name
> contains "tesla C2070" so that people don't have to dig into lspci
> manually to find out the vendor/device IDs or busids (mostly useful for
> GPUs that have no OS names)

I immediately had that question when I read your 2nd example, above (i.e., where did you get the names from?). Are these names in the lstopo output?

> * OS devices by class: something like os:net:2 for the 2nd network
> interface (not sure it's useful)

I'm not sure it is -- isn't the ordering of PCI devices non-deterministic between cold boots?

> If we want to make all the above even more unreadable, we could also
> support specifying a range of indexes (pcifoo:start:amount or
> pcifoo:start-end) like we already do for non-I/O objects. But I am not
> sure this is actually useful. People may still specify pcifoo:0 pcifoo:1
> pcifoo:2 whenever needed.

For consistency, it doesn't sound like a bad idea.

> I/O devices will not be supported through the generic hierarchical
> notation "socket:1.core:2..." anyway. So we could make their
> command-line specification totally different from the usual one.
>
>
> It's actually the first time we select objects on something different
> than just a type or a depth and some indexes. So we could introduce a
> new syntax here. For instance:
> <type>[attributename=attributevalue,...]:index
> <type>[attributename=attributevalue,...]:firstindex:lastindex
> <type>[attributename=attributevalue,...]:firstindex:amount
> Not sure it's worth doing this.

It might be better to just put out basic functionality in 1.3 and *not* do advanced syntax like this (i.e., only do basic syntax). And then see what people ask for.
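
For the sake of discussion only, the bracketed form quoted above could be parsed along these lines. This is a sketch of the proposal, not of anything in hwloc; the attribute names in the usage note (`vendor`, `name`, `class`) are hypothetical. Note that `firstindex:lastindex` and `firstindex:amount` look identical on the command line, so the sketch returns the second number without interpreting it.

```python
import re

# <type>[name=value,...]:first[:second]  (the bracketed part is optional here)
SPEC_RE = re.compile(
    r"^(?P<type>[A-Za-z]+)"
    r"(?:\[(?P<attrs>[^\]]*)\])?"
    r":(?P<first>\d+)(?::(?P<second>\d+))?$"
)


def parse_spec(text):
    """Parse one proposed-syntax location into a dict (hypothetical helper)."""
    m = SPEC_RE.match(text)
    if m is None:
        raise ValueError("cannot parse %r" % text)
    attrs = {}
    if m.group("attrs"):
        for pair in m.group("attrs").split(","):
            name, sep, value = pair.partition("=")
            if not sep or not name.strip():
                raise ValueError("bad attribute %r in %r" % (pair, text))
            attrs[name.strip()] = value.strip()
    second = m.group("second")
    return {
        "type": m.group("type"),
        "attrs": attrs,
        "first": int(m.group("first")),
        # Either a last index or an amount; the syntax alone cannot tell.
        "second": int(second) if second is not None else None,
    }
```

For example, `parse_spec("pci[name=tesla C2070]:1")` yields type `pci`, one attribute, and first index 1, while `parse_spec("os[class=net]:0:2")` additionally carries a second number of 2.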

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/