
Hardware Locality Development Mailing List Archives


Subject: Re: [hwloc-devel] SWIG bindings
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2010-12-03 15:51:26

On 03/12/2010 at 21:42, Bernd Kallies wrote:
>> We should really encourage people to use XML in such cases. Setting
>> HWLOC_XMLFILE=/path/to/exported/file.xml in the environment should just
>> work (as long as you update the XML file across major hwloc releases or
>> OS upgrades). Maybe we should add a dedicated section about this in the
>> documentation? Something like "Speeding up hwloc on large nodes"? And
>> maybe even encourage distro packagers to create an XML export file under
>> /var/lib, with advice to add HWLOC_XMLFILE to /etc/environment if they
>> care about hwloc/HPC?
>> Anyway Bernd, can you export an XML on this nice machine, reload it,
>> and see how long it takes? I hope all the bottlenecks are in the Linux
>> backend parsing /sys and /proc, not in the actual hwloc core.
> I'm not sure I understood. From my point of view it makes no sense to
> create an XML representation of the topology with hwloc, and then read
> that XML back in just to be able to dive into it and figure something
> out. When there is an API that provides direct access to parts of the
> topology once it is constructed, then the XML step is useless
> additional work.

Don't think of the XML as a way to represent the topology and traverse
it. Think of it as a cache that can be read much faster than /proc and
/sys: once you load the XML, you get the usual hwloc API.

> But this would not help us in many
> of our use cases. We have to analyze topologies that do not represent a
> whole machine. We analyze topologies that are bound to cpusets. We do
> this e.g. to construct pinning schemes for MPI applications that run
> inside of batch jobs, which get their cpusets created on the fly
> depending on their resource requests and current load of the machine.

Right, if the cpuset changes, caching in XML is useless (except if we
implement a way to restrict a given topology in the future).

> The
> problem here is rather whether every task running on a node should read
> the topology and figure out on which CPU it should pin itself, or
> whether one master task per node should do this and communicate the
> result to the others. But this is outside of hwloc.

Well, having hundreds of processes read /proc and /sys at the same time
is another reason to use XML. The master can read the topology once and
pass it to all other processes through XML files or XML buffers over a
socket. I assume that's what Open MPI will do in the near future.