Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] structure assumptions, duplication
From: Samuel Thibault (samuel.thibault_at_[hidden])
Date: 2009-09-29 10:59:33

Fawzi Mohamed, on Tue 29 Sep 2009 16:39:47 +0200, wrote:
> Maybe I worry too much, but with machines with 1'000s of processors
> coming, and maybe wanting local restricted copies to know the topology
> of the whole machine (to communicate with others) I worry also about
> a few pointers here and there.

Actually, all the pointers together take little space compared to the
cpuset (by default 1024/8 = 128 bytes, worth 16 pointers on a 64-bit
machine).
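
To make the arithmetic concrete, here is a minimal sketch. The helper
names are hypothetical (they are not hwloc functions), and the 8-byte
pointer size in the usage note assumes a 64-bit machine:

```c
#include <stddef.h>

/* Hypothetical helper, not part of hwloc: bytes needed for a
 * fixed-size bitmask holding one bit per processor. */
size_t cpuset_bytes(size_t nbits)
{
    return (nbits + 7) / 8;   /* round up to whole bytes */
}

/* Hypothetical helper: how many pointers of the given size would
 * occupy the same space as `bytes` bytes. */
size_t pointers_equivalent(size_t bytes, size_t pointer_size)
{
    return bytes / pointer_size;
}
```

With the default of 1024 bits, cpuset_bytes(1024) gives 128, and
pointers_equivalent(128, 8) gives 16, so a handful of extra pointers
per object is small next to the cpuset itself.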

> 2) assumption on the structure
> Also a ring like topology cannot be cleanly represented with a
> partition if one wants to have objects for groups with uniform latency.

Our plan was to provide not only a hierarchical view but also the precise
graph. For instance, for an Altix machine with a
2D-mesh of 3*4 NUMA nodes, the hierarchy would be a system containing
12 nodes, each containing a socket, etc. And the fact that the
12 nodes are organized as a 2D-mesh would be expressed by a graph
structure, independently of the hierarchy.
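
As a rough illustration of keeping the two views separate, here is a
sketch with made-up types (struct machine_view and its functions are
hypothetical, not the hwloc API). It assumes a plain 3*4 mesh without
wrap-around links, which may not match a real Altix:

```c
#define MESH_ROWS  3
#define MESH_COLS  4
#define MESH_NODES (MESH_ROWS * MESH_COLS)

/* Hypothetical structure, for illustration only. */
struct machine_view {
    /* Hierarchy: index of each node's parent object.
     * 0 stands for the single system object. */
    int parent[MESH_NODES];
    /* Graph: linked[a][b] is 1 when nodes a and b are direct
     * neighbours in the mesh, independently of the hierarchy. */
    unsigned char linked[MESH_NODES][MESH_NODES];
};

void machine_view_init(struct machine_view *v)
{
    int a, b, r, c;

    for (a = 0; a < MESH_NODES; a++) {
        v->parent[a] = 0;              /* all 12 nodes sit under the system */
        for (b = 0; b < MESH_NODES; b++)
            v->linked[a][b] = 0;
    }

    for (r = 0; r < MESH_ROWS; r++) {
        for (c = 0; c < MESH_COLS; c++) {
            int n = r * MESH_COLS + c;
            if (c + 1 < MESH_COLS) {   /* link to the right neighbour */
                v->linked[n][n + 1] = 1;
                v->linked[n + 1][n] = 1;
            }
            if (r + 1 < MESH_ROWS) {   /* link to the neighbour below */
                v->linked[n][n + MESH_COLS] = 1;
                v->linked[n + MESH_COLS][n] = 1;
            }
        }
    }
}

/* Number of direct mesh neighbours of a node. */
int machine_view_degree(const struct machine_view *v, int node)
{
    int b, degree = 0;

    for (b = 0; b < MESH_NODES; b++)
        degree += v->linked[node][b];
    return degree;
}
```

A programmer who only wants the hierarchy reads parent[] and sees 12
sibling nodes; one who wants the precise topology reads linked[] and
sees that a corner node has 2 neighbours while an inner node has 4.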

> At least the Misc object would seem to not fit in this clean
> hierarchical picture.

The Misc object is meant to not be interpreted any other way than
just grouping, so it is still hierarchical. For instance AIX provides
a hierarchical view of the machine, and for some levels I don't
know what they correspond to (new levels, not documented or unclear
documentation), but hwloc still exposes them. Windows has a particular
notion of processor groups due to bad design and it's a good idea to
take that into account.

The idea is that some programmers will only want to cope with a
hierarchical machine, while others will want to fine-tune according to
the precise topology (much harder, and at the very least not polynomial). And thus we
should provide both. For the first kind of programmers, to tackle the
3*4 2D-mesh case above, we could provide a flag "make it hierarchical",
which would heuristically group nodes recursively according to locality.
For now we only do such grouping from the NUMA distances when there are
clear NUMA subgroups.
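
To give an idea of what grouping from NUMA distances can look like,
here is a simplified sketch. It is not hwloc's actual heuristic: the
function name, the row-major distance matrix and the fixed threshold
are all assumptions made for illustration. Nodes end up in the same
group when a chain of "close enough" distances connects them:

```c
/* Hypothetical sketch, not hwloc code.
 * dist is an n*n row-major matrix of NUMA distances.
 * On return, group[i] holds the group index of node i.
 * Returns the number of groups found. */
int group_by_distance(int n, const unsigned *dist,
                      unsigned threshold, int *group)
{
    int i, a, b, changed, ngroups = 0;

    for (i = 0; i < n; i++)
        group[i] = -1;                 /* -1 means not grouped yet */

    for (i = 0; i < n; i++) {
        if (group[i] != -1)
            continue;
        group[i] = ngroups;            /* start a new group at node i */

        /* Keep pulling in ungrouped nodes that are close to any
         * current member, until the group stops growing. */
        do {
            changed = 0;
            for (a = 0; a < n; a++) {
                if (group[a] != ngroups)
                    continue;
                for (b = 0; b < n; b++) {
                    if (group[b] == -1 && dist[a * n + b] <= threshold) {
                        group[b] = ngroups;
                        changed = 1;
                    }
                }
            }
        } while (changed);

        ngroups++;
    }
    return ngroups;
}
```

For a 4-node matrix where nodes 0 and 1 are at distance 20 from each
other, nodes 2 and 3 likewise, and everything else is at 40, a
threshold of 20 yields two groups. Applying the same idea again on the
resulting groups would give the recursive grouping mentioned above.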

> So I wanted to know how you cope with those things, and also if
> something will probably change in the future, as some assumptions will
> inevitably creep in my code... and I would prefer to make the good
> ones :)

We should probably (agree on and) state what we want to always provide