I can't check the code right now, but I have a couple of questions below.

One of the issues I had was that the core IDs (as reported by Xen) are enumerated per socket rather than across the entire system. The purpose of "HACK - patch up cpu_to_core." in hwloc_get_xen_info() is to change the per-socket enumeration into a system-wide one.
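
To make the renumbering concrete, here is a minimal sketch in C of what such a patch-up could look like. It is not the actual code from hwloc_get_xen_info(): the function name make_core_ids_global, the cpu_to_socket array and the choice of stride are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the real hwloc code): turn per-socket core IDs
 * into system-wide unique ones. cpu_to_socket[i] and cpu_to_core[i] are
 * the socket ID and the per-socket core ID of PU i.
 */
static void make_core_ids_global(const unsigned *cpu_to_socket,
                                 unsigned *cpu_to_core, size_t nr_cpus)
{
    unsigned stride = 0;
    size_t i;

    assert(cpu_to_socket != NULL && cpu_to_core != NULL);

    /* stride = largest per-socket core ID + 1, so sockets cannot collide. */
    for (i = 0; i < nr_cpus; i++)
        if (cpu_to_core[i] + 1 > stride)
            stride = cpu_to_core[i] + 1;

    /* Offset each core ID by its socket, making it unique system-wide. */
    for (i = 0; i < nr_cpus; i++)
        cpu_to_core[i] += cpu_to_socket[i] * stride;
}
```

With two sockets whose cores are both numbered 0 and 1, the cores of the second socket would become 2 and 3.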

Samuel believes that hwloc should be able to cope with duplicate core IDs with different cpusets, but if I attempt to do that, I get the following error:
* hwloc has encountered what looks like an error from the operating system.
* object (Core P#0 cpuset 0x30000003) intersection without inclusion!
* Error occurred in topology.c line 853
* Please report this error message to the hwloc user's mailing list,
* along with the output from the hwloc-gather-topology.sh script.

I don't understand what's going on here. Can you post the list of PU/core/socket IDs that Xen reports so that I can see what is unique and what is not?
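
For reference while comparing those IDs, this is how I read the condition named in the message: two cpusets share some PUs, but neither one contains the other. The sketch below is only an illustration with plain 64-bit masks; the enum and the function name are made up, it is not the code in topology.c, and the second mask in the usage note is a hypothetical socket cpuset, not something Xen reported.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum cpuset_relation {
    CPUSET_DISJOINT,    /* no PU in common */
    CPUSET_EQUAL,       /* same PUs */
    CPUSET_A_IN_B,      /* a is strictly included in b */
    CPUSET_B_IN_A,      /* b is strictly included in a */
    CPUSET_INTERSECTS   /* PUs in common, but no inclusion either way */
};

static enum cpuset_relation compare_cpusets(uint64_t a, uint64_t b)
{
    uint64_t common = a & b;

    if (common == 0)
        return CPUSET_DISJOINT;
    if (a == b)
        return CPUSET_EQUAL;
    if (common == a)
        return CPUSET_A_IN_B;
    if (common == b)
        return CPUSET_B_IN_A;
    return CPUSET_INTERSECTS;
}
```

For example, comparing 0x30000003 against a hypothetical socket cpuset of 0x0000ffff gives CPUSET_INTERSECTS, because bits 0-1 are shared while bits 28-29 fall outside the socket.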

I currently have a crazy idea for getting at the cache information. topology-x86.c has a lot of cpuid knowledge, and I have a proposed new hypercall which executes cpuid on a specific PU. Would it be possible (or indeed sensible) to parametrise the code in topology-x86.c to take a few function pointers for getting/setting the binding, and for the cpuid call itself?

I don't see why we couldn't do that. Can you post an example of what the Xen cpuid hypercall prototype would be, so that I can see how I need to change the x86 backend?

That way, the common x86 knowledge could be used correctly by the Xen component, while still keeping its current design.
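
As a rough sketch of the parametrisation being discussed, something along these lines might work. Everything here is an assumption: the struct x86_cpuid_ops, its members, query_leaf_on_pu and the fake_* callbacks are invented names, none of them exist in hwloc or Xen, and the fake backend merely stands in for the proposed hypercall, whose prototype has not been posted yet.

```c
#include <assert.h>
#include <stddef.h>

/* Registers for one cpuid invocation: eax/ecx carry leaf/subleaf on input. */
struct cpuid_regs { unsigned eax, ebx, ecx, edx; };

/* Hypothetical callback table that the x86 code could take as a parameter. */
struct x86_cpuid_ops {
    int (*get_binding)(void *ctx, unsigned *pu);
    int (*set_binding)(void *ctx, unsigned pu);
    int (*cpuid)(void *ctx, unsigned pu, struct cpuid_regs *regs);
    void *ctx; /* backend-private data, passed back to every callback */
};

/* Run one cpuid leaf on the given PU, restoring the old binding afterwards. */
static int query_leaf_on_pu(const struct x86_cpuid_ops *ops, unsigned pu,
                            unsigned leaf, unsigned subleaf,
                            struct cpuid_regs *out)
{
    unsigned saved;
    int err;

    assert(ops != NULL && out != NULL);

    if (ops->get_binding(ops->ctx, &saved) != 0)
        return -1;
    if (ops->set_binding(ops->ctx, pu) != 0)
        return -1;

    out->eax = leaf;
    out->ebx = 0;
    out->ecx = subleaf;
    out->edx = 0;
    err = ops->cpuid(ops->ctx, pu, out);

    if (ops->set_binding(ops->ctx, saved) != 0)
        return -1;
    return err;
}

/* Fake backend: binding is a no-op, as it could be for a per-PU hypercall. */
static int fake_get_binding(void *ctx, unsigned *pu)
{
    (void)ctx;
    *pu = 0;
    return 0;
}

static int fake_set_binding(void *ctx, unsigned pu)
{
    (void)ctx;
    (void)pu;
    return 0;
}

/* Fake cpuid: echoes the PU index in ebx instead of querying hardware. */
static int fake_cpuid(void *ctx, unsigned pu, struct cpuid_regs *regs)
{
    (void)ctx;
    regs->ebx = pu;
    return 0;
}
```

A native backend would fill the table with real binding calls and the cpuid instruction, while a Xen backend could leave binding as a no-op and forward cpuid to the hypercall with the PU number.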

Is there anything that the current Xen backend supports that wouldn't be feasible through x86 cpuid? The x86 component can already detect a lot of topology information, including cores/caches/sockets/NUMA. Maybe the NUMA node sizes?

By the way, which architectures are supported by Xen aside from x86? Does Xen have topology information for them?