On 02/12/2010 22:25, Bernd Kallies wrote:
>> Do you have any feel for if there are particular bottlenecks in hwloc / lstopo that make it take so long? I wonder if we should just attack those (if possible)...? Samuel and Brice have done all the work in the guts of the API, so they might know offhand if there are places that can be optimized or not...
> Hmm. I did no profiling. The machines in question have 64 NUMA nodes
> with 16 logical CPUs, each. The topology depth is 10. So parsing
> of /sys/devices/system/node/* and evaluating the distance matrix to
> fiddle out the topology tree should be quite expensive. But I guess this
> statement is trivial and does not help very much.
We should really encourage people to use XML in such cases. Setting
HWLOC_XMLFILE=/path/to/exported/file.xml in the environment should just
work (as long as you regenerate the XML file after major hwloc releases
or OS upgrades).
Maybe we should add a dedicated section about this in the documentation?
Something like "Speeding up hwloc on large nodes"? And maybe even
encourage distro packagers to create an XML export file under /var/lib,
with advice to add HWLOC_XMLFILE to /etc/environment if they care.

Anyway, Bernd, can you export an XML file on this nice machine, reload
it, and see how long it takes? I hope all the bottlenecks are in the
Linux backend parsing /sys and /proc, not in the actual hwloc core.
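
In case it helps, here is a rough Python sketch of the comparison I have
in mind. It assumes that lstopo (or lstopo-no-graphics) is in the PATH
and that it writes XML when the output file name ends in .xml. The
helper names time_command and compare_discovery and the temporary file
names are made up for illustration. Both runs write an XML file, so the
export cost is paid in both and the difference should mostly reflect
native discovery versus XML loading.

```python
import os
import shutil
import subprocess
import tempfile
import time


def time_command(argv, extra_env=None):
    """Run argv once and return the elapsed wall-clock time in seconds.

    extra_env maps variable names to values; a value of None removes the
    variable from the child's environment.
    """
    env = dict(os.environ)
    for key, value in (extra_env or {}).items():
        if value is None:
            env.pop(key, None)
        else:
            env[key] = value
    start = time.perf_counter()
    subprocess.run(argv, env=env, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def compare_discovery():
    """Time native discovery against reloading the exported XML.

    Returns (native_seconds, xml_seconds), or None if lstopo is not found.
    """
    # Assumption: one of these two binaries is installed and in the PATH.
    lstopo = shutil.which("lstopo") or shutil.which("lstopo-no-graphics")
    if lstopo is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        exported = os.path.join(tmp, "exported.xml")
        reloaded = os.path.join(tmp, "reloaded.xml")
        # Native discovery (HWLOC_XMLFILE removed), result exported as XML.
        native = time_command([lstopo, exported], {"HWLOC_XMLFILE": None})
        # Same export, but the topology is loaded from the XML file.
        from_xml = time_command([lstopo, reloaded],
                                {"HWLOC_XMLFILE": exported})
    return native, from_xml
```

Calling compare_discovery() a few times and keeping the smallest values
should smooth out file cache effects.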

By the way, we're not the only project with small scalability problems
on very large machines: https://lkml.org/lkml/2010/12/3/19 :)