Subject: Re: [hwloc-devel] understanding PCI device to NUMA node connection
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2011-11-28 16:45:46


On 28/11/2011 22:34, Guy Streeter wrote:
> This question may be more about understanding NUMA (which I barely do) than
> about hwloc, but perhaps you can help anyway.
>
> I have a customer with some HP Proliant DL580 G7 servers. HP supplied them
> with a block diagram of their system, and it shows two of the NUMA nodes
> connected to the PCI devices through an I/O Hub. The customer thinks hwloc
> ought to show the PCI devices associated with both of the NUMA nodes. I'm not
> sure how that's possible. hwloc shows them all under the first node.
>
> Is the association of the devices with the nodes correct? Can the devices
> actually be equally "close" to both of them?

Current Intel platforms have two QPI links going to I/O hubs. Most
servers with many sockets (4 or more) therefore have each I/O hub
directly connected to only 2 processors, so those two processors are at
"equal" distance from the devices behind that hub, as you say.

However, some BIOSes report invalid I/O locality information. I have
actually never seen correct information on any server like the one
above. If that's important to you, hwloc can be told to work around
these BIOS bugs with environment variables. For example, to force the
PCI hostbridge 0000:00 near the socket containing PU #0-#7, set
HWLOC_PCI_0000_00_LOCALCPUS to "0xff" in your environment.
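
As a rough illustration of the override above, here is a minimal Python
sketch that builds the variable name and the "0xff" mask for PU #0-#7.
The helper names (pu_mask, pci_localcpus_var, env_with_override) are
hypothetical, and the generalization of the variable name to other
domain/bus numbers is only an assumption extrapolated from the single
example in this mail.

```python
import os


def pu_mask(pu_indexes):
    """Return a hex CPU mask string with one bit set per PU index.

    Limited to PU indexes below 32 here: the string format expected for
    wider masks is not covered by this sketch.
    """
    mask = 0
    for index in pu_indexes:
        if not 0 <= index < 32:
            raise ValueError("this sketch only handles PU indexes 0-31")
        mask |= 1 << index
    return hex(mask)


def pci_localcpus_var(domain, bus):
    """Build the variable name for a hostbridge, e.g. 0000:00.

    Assumption: the name follows the pattern of the one example given
    in the mail (4 hex digits of domain, 2 hex digits of bus).
    """
    return "HWLOC_PCI_%04x_%02x_LOCALCPUS" % (domain, bus)


def env_with_override(domain, bus, pu_indexes, base_env=None):
    """Return a copy of the environment with the locality override set."""
    env = dict(os.environ if base_env is None else base_env)
    env[pci_localcpus_var(domain, bus)] = pu_mask(pu_indexes)
    return env


if __name__ == "__main__":
    # Hostbridge 0000:00 forced near the socket containing PU #0-#7.
    env = env_with_override(0x0000, 0x00, range(8), base_env={})
    for name, value in env.items():
        print("%s=%s" % (name, value))
```

The resulting dictionary could then be passed as the env argument of
subprocess.run() when launching an hwloc tool such as lstopo, assuming
it is installed.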

> On a side note, hwloc-gather-topology apparently doesn't gather device
> information? I got the output from their system but can't see any devices when
> I use it as input to hwloc-info etc.

Yes, unfortunately PCI detection isn't based on reading files, so
there's no easy way to "dump" it during gather-topology.sh.

If we ever reimplement PCI detection by reading sysfs files only on
Linux (which means we drop the dependency on libpci), it might be
possible to dump it. But that's a lot of work.
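
To give an idea of what a sysfs-only approach would read, here is a
small Python sketch, not hwloc code, that walks the PCI device
directories exposed by the Linux kernel and collects their local_cpus
and numa_node attributes. The function name read_pci_locality and its
sysfs_root parameter are hypothetical. Note that these values come from
the same firmware tables, so they may be just as wrong as what the BIOS
reports.

```python
from pathlib import Path


def read_pci_locality(sysfs_root="/sys/bus/pci/devices"):
    """Map each PCI address to its local_cpus and numa_node sysfs values.

    Attributes that are missing or unreadable are reported as None.
    """
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        info = {}
        for name in ("local_cpus", "numa_node"):
            try:
                info[name] = (dev / name).read_text().strip()
            except OSError:
                info[name] = None
        result[dev.name] = info
    return result


if __name__ == "__main__":
    for address, info in read_pci_locality().items():
        print(address, info["local_cpus"], info["numa_node"])
```

Because everything here is plain file reads, a copy of those files
taken on another machine could be fed back through the sysfs_root
argument, which is the property a "dump" would need.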

Brice