Open MPI logo

Hardware Locality Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Hardware Locality Development mailing list

Subject: Re: [hwloc-devel] hwloc with Xen system support - some initial code
From: Andrew Cooper (andrew.cooper3_at_[hidden])
Date: 2014-01-01 15:55:11

On 01/01/2014 20:17, Brice Goglin wrote:
> Le 01/01/2014 19:55, Andrew Cooper a écrit :
>>> There's a little fix attached to improve error management in instantiate().
>> I considered that, but hwloc_debug() disappears in a non-debug build
>> from what I can tell, so a regular user would just see that they have
>> explicitly asked for Xen and it has been skipped over.
>> What would be useful would be hwloc_warning()/error() which is
>> independent of --{en,dis}able-debug.
> Usually we just fprintf(stderr) when there are major errors, and
> warnings are only hwloc_debug() :)
> Given that hwloc is used by other software as a library, doing assert()
> in the discovery code isn't good. We could try to make
> hwloc_topology_load() fail entirely if Xen discovery failed. I don't
> think we ever do that anywhere currently. Other components fallback to
> the default backend (which just creates PU based on the number of
> processors returned by sysconf()).
> We could also have different behavior based HWLOC_XEN value. 1 means
> "try Xen or fallback", 2 means "try Xen and abort() if failed". This
> also comes back to what people will do with the Xen support. If only
> admins use it to check the host topology in lstopo, calling assert()
> maybe ok. If some libraries use hwloc and set HWLOC_XEN, we can't assert().
> By the way, if we want to avoid yet another hwloc env variable, we could
> drop HWLOC_XEN, change the Xen priority to 0, and just set
> HWLOC_COMPONENTS=xen to enable it.

I will see whether I can make this work.

>>>> First of all, I have hacked at the m4, by copying surrounding code, and
>>>> it now appears to work sanely for me, including --{en,dis}able-xen
>>>> configure options. I have no idea whether what I have done is appropriate.
>>> The attach patch should extend your code to also support building the
>>> xen component as a plugin so we don't enforce the libxen hard dependency
>>> inside the main hwloc library.
>> I was under the impression that the m4 so far would silently drop
>> support for Xen if the library was no available and the user didn't
>> explicitly --enable-xen, or have I misunderstood something?
> The problem is for distribution packages, not for custom builds.
> Packagers have to choose if Xen support is enabled or not. If they
> enable it, the package will require xen libs to be installed, and some
> users may complain when installing those if they don't care about Xen.
> If they don't enable it, people that care about Xen will need a custom
> build or a hwloc-xen package. With plugins, you can enable everything at
> compile time, but Xen libs are required at run-time only if you want to
> load the Xen plugin.

Oh yes - very good point. In which case the Xen part of hwloc really
should be a plugin. (Attempting to install libxenctrl into a regular
linux distro will almost certainly pull in Xen itself as well)

>>> Regarding component flags, the usual way to exclude ALL other components
>>> if to put ~0 instead of
>> I didn't want to exclude PCI devices, as dom0 does have an accurate view
>> of them. (Only at the moment, cant associate them with numa nodes).
> Putting PCI in that list of flags means that you exclude any other
> component that discovers PCI devices. But it doesn't prevent you from
> adding PCI discovery in the Xen backend itself.

But Xen itself has none of this information directly. In all cases,
hwloc should defer to the native PCI method, which is why I explciticly
didn't exclude it.

>>>> What is the canonical way of stating that a certain PU is offline? Xen
>>>> doesn't really do offline cpus of its own accord at the moment but does
>>>> have a hypercalls to explicitly online and offline cpus. In the case
>>>> that one is offline, I suspect my current code will cause the PU to fall
>>>> out of the rest of the topology, rather than stay within it and marked
>>>> differently.
>>> What we do in topology-linux.c is we clear the corresponding bit from
>>> the top-level online_cpuset hwloc_get_root_obj(topology)->cpuset
>>> Then the core will propagate that everywhere at the end of the discovery.
>> Does that mean that hwloc_insert_object_by_cpuset() will work correctly
>> inserting an object with a bit set in cpuset which is clear in the root
>> online_cpuset
> I think that's what Linux does (even if it's hard to check in the
> way-too-long code).
>> or will I have to explicitly construct each
>> sock/node/core object with correct cpuset and online_cpuset information?
> I don't think we ever do that in any backend (fortunately). The core
> takes care of propagating things up and down at the end of the discovery.
> Brice

Cool - I shall also have a play with this.