We are seeing a new architecture appear in the very near future, and I'm not sure how hwloc will handle it. Consider the following case:
* I have a rack that contains multiple "hosts"
* each host consists of a box/shelf with common support infrastructure: some kind of controller, possibly some networking support, and maybe a pool of memory that can be allocated across the occupants.
* each host contains one or more "boards". Each board again has a controller with some common infrastructure supporting its local sockets - this might include networking that would look like NICs (though not necessarily on a PCIe interface), a board-level memory pool, etc.
* each socket contains one or more dies. Each die runs its own instance of an OS - probably a lightweight kernel - that can vary between dies (e.g., it might have a tweaked configuration), and has its own associated memory that will physically reside outside the socket. You can think of each die as constituting a "shared memory locus" - i.e., processes running on that die can share memory among themselves, since they sit under the same OS instance.
* each die has some number of cores/hwthreads/caches etc.
Note that the sockets are not sitting on some PCIe bus - they appear to be directly connected to the overall network, just like a "node" would appear today. However, there is a definite need for higher layers (RMs and MPIs) to understand this overall hierarchy and the "distances" between the individual elements.
Any thoughts on how we can support this?