
Hardware Locality Development Mailing List Archives


Subject: Re: [hwloc-devel] comments on API changes
From: Fawzi Mohamed (fawzi_at_[hidden])
Date: 2010-04-02 09:03:05


On 2-apr-10, at 14:49, Brice Goglin wrote:

> Fawzi Mohamed wrote:
>> I would take advantage of more info about the possible NUMA node
>> connectivity (to know where to steal tasks), but I don't have access
>> to machines that would really benefit from that, and probably
>> even then using the HW structure as topology would not be bad.
>
> NUMA connectivity is a big problem. Most x86 machines show nothing
> interesting in cat /sys/devices/system/node/node*/distance. On AMD
> HyperTransport, there is some routing information in lspci, but I think
> you have to be root to see it, and I am not even sure it's enough to
> discover the actual HT graph.
>
> We have a "measurement-based backend" in the TODO-list. It could be the
> only way to find out the NUMA connectivity.

nice

> That said, it's not clear that this will be a big problem. AMD
> Magny-Cours machines and Nehalem-EX machines with 2-4 sockets have a
> fully-interconnected NUMA graph. No problem if we don't have NUMA
> topology information there. For larger machines, SGI is already doing a
> good job at reporting NUMA distances in sysfs. What remains is other
> large machines (several vendors announced 8-socket Nehalem-EX machines).
> We'll see.

Indeed, as I said, the largest machines I have access to are 2-socket AMD
or Nehalem-EX, so it doesn't matter.
For now I just have these levels:
PU core socket/numa_node machine system
and for the foreseeable future that is probably good enough
(all nodes in the same machine are at the same distance).
I don't aim at using this system between different machines...
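
To make the level list concrete, here is a minimal sketch (not the actual
scheduler code) of how a purely hierarchical "distance" could order
task-stealing victims when no NUMA distance matrix is available. The names
Location, steal_distance and steal_order are hypothetical, and socket and
NUMA node are assumed to be the same level, as in the list above.

```python
from typing import List, NamedTuple


class Location(NamedTuple):
    """Hypothetical coordinates of one PU, from outermost to innermost level."""
    system: int
    machine: int
    socket: int  # assumption: one NUMA node per socket, so a single level
    core: int
    pu: int


def steal_distance(a: Location, b: Location) -> int:
    """Number of levels one has to climb before a and b share an ancestor.

    0 = same PU, 1 = same core, 2 = same socket/NUMA node,
    3 = same machine, 4 = same system, 5 = nothing in common.
    """
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return len(a) - shared


def steal_order(me: Location, others: List[Location]) -> List[Location]:
    """Sort candidate victims from closest to farthest in the hierarchy."""
    return sorted(others, key=lambda other: steal_distance(me, other))


if __name__ == "__main__":
    me = Location(0, 0, 0, 0, 0)
    victims = [Location(0, 0, 1, 0, 0), Location(0, 0, 0, 1, 0)]
    for v in steal_order(me, victims):
        print(steal_distance(me, v), v)
```

With this scheme all sockets in one machine end up at the same distance,
which matches the assumption stated above; a real distance matrix would
only be needed to break ties between them.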

Fawzi