On 30-Sep-09, at 09:29, Samuel Thibault wrote:


Hi Samuel,

Fawzi Mohamed, on Wed 30 Sep 2009 09:16:36 +0200, wrote:
1) a fully hierarchical representation of the machine/hardware where each level
is a partition, and each level fully covers the previous one (from any node you
can go through all levels using father/children links; father and child are
just one level away from each other).
This is basically what is there now.

2) outside the hierarchy 1 (but built using its objects, probably the NUMA
nodes) there will be
2.1) maybe the full connection graph
2.2) a hierarchical view of it, like the lgroups, where the levels are not
necessarily a partition, and where a level could also refer not to the next
sublevel but directly to lower levels. Going up the hierarchy you get the next
neighbors.

Err, no, in our plans 2.2 was in 1) already, and levels are thus still
partitions, but somewhat arbitrary ones, chosen by heuristics based on
the graph. Isn't that the case with lgroups? (I have never had access
to a Solaris NUMA machine)

If you look at the example described in the document that I had linked


you see that for a ring topology some levels (those you always get by adding the next neighbors) do not form a partition (i.e. they overlap). Such an overlap is unavoidable if you build the next higher level of the hierarchy simply by adding the next neighbors.
Having a partition is very useful when, for instance, instead of looking for a resource you want to restrict/pin a thread. For this reason there are psets and lpls (lgroup partition loads, the intersection of lgroups and processor partitions, which again form a partition), and both are used on OpenSolaris.
Well, you don't have to mirror what Solaris does, but I found that quite nice, so I thought you wanted to go in that direction.
For the ring topology a-b-c-d-a it is difficult to find a good partition... and having both partition and non-partition views (one used for resource allocation/distribution, the other for resource finding/stealing) is quite clean imho.
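
To make the ring case concrete, here is a small sketch (plain Python, not hwloc or lgroup code; the helper names ring_neighbors, neighbor_groups and is_partition are made up for illustration). It builds the "node plus next neighbors" groups for the ring a-b-c-d-a and checks whether they form a partition. I am assuming here that the next level up is obtained simply by adding the direct neighbors of each node.

```python
from itertools import combinations

def ring_neighbors(nodes):
    """Map each node of a ring to its two direct neighbors."""
    n = len(nodes)
    return {nodes[i]: {nodes[(i - 1) % n], nodes[(i + 1) % n]} for i in range(n)}

def neighbor_groups(nodes):
    """For each node, the group made of the node plus its next neighbors."""
    adj = ring_neighbors(nodes)
    return {node: frozenset({node} | adj[node]) for node in nodes}

def is_partition(groups, nodes):
    """True if the distinct groups cover all nodes and never overlap."""
    distinct = set(groups)
    covers = set().union(*distinct) == set(nodes)
    disjoint = all(a.isdisjoint(b) for a, b in combinations(distinct, 2))
    return covers and disjoint

nodes = ["a", "b", "c", "d"]          # ring a-b-c-d-a
groups = neighbor_groups(nodes)       # e.g. groups["a"] contains a, b and d
level1 = list(groups.values())

# The neighbor-based level overlaps, so it is not a partition.
overlapping = not is_partition(level1, nodes)

# A hand-picked partition exists, but it is arbitrary: it separates b from c
# although they are exactly as close to each other as a and b are.
arbitrary = [frozenset("ab"), frozenset("cd")]
```

The neighbor groups are what you would walk when looking for (or stealing) a nearby resource, while something like the arbitrary split is what you would need for pinning, which is why keeping both views looks attractive to me.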