Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] Bug report: topology strange on SGI UltraViolet
From: Bernd Kallies (kallies_at_[hidden])
Date: 2010-07-28 14:59:19


On Wed, 2010-07-28 at 18:53 +0200, Brice Goglin wrote:
> On 28/07/2010 18:09, Bernd Kallies wrote:
> > Is attached. I also checked for cpusets. I ran lstopo and
> > gather_topology from the root cpuset, which is the only cpuset and
> > contains cpus 0-767 and mems 0-47, that is - the whole machine.
> >
> > Background info: The UltraViolet architecture is new. There exists a
> > white paper about this at http://www.sgi.com/pdfs/4192.pdf
> > We have one UV rack, which is filled with 3/4 of the max. number of
> > blades. According to the specs, two NUMA nodes form one "blade". This
> > level corresponds to "Group0" in the hwloc topology. Two blades are
> > cross-linked via the NUMAlink, forming "paired nodes" = "Group1". What
> > "Group2" might correspond to - I don't know.
>
> We group by distance, so it looks like something is tagging these
> nodes as closer, and hwloc makes them a new group level.
>
> > "Group3" corresponds to one
> > "chassis" or IRU. "Group4" may be an "enclosure", and "Machine" is the
> > "rack".
> >
> > In my opinion the hwloc topology for our machine should contain 2x
> > Group4. The 1st should contain 2x Group3, the 2nd one 1x Group3. lstopo
> > shows 1x Group4 containing 3x Group3, instead.
> >
>
> Actually no, but it's very hard to see :)
> lstopo - | egrep "(NUMA|Group)"
> shows that Group4#0 only contains Group3#0 and #1.
> Group3#2 is directly a child of the machine (the indentation is smaller).

Ah, I see.

> Open a *big* terminal window and look at the distance matrix:
> $ cat /sys/devices/system/node/node{?,??}/distance
> (I am not copy/pasting it here, it's too big :))
>
> hwloc groups objects that have smaller distances and then computes
> distances between groups (the average of the distances between the
> objects in each group). We get:
>
> Distance matrix between Group0 objects:
> 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66
> 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64
> 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62
> 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60
> 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58
> 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56
> 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54
> 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52
> 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50
> 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48
> 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46
> 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44
> 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42
> 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40
> 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38
> 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36
> 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34
> 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32
> 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30
> 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28
> 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26
> 62 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24
> 64 62 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22
> 66 64 62 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13
>
> Between Group1:
> 17 24 28 32 36 40 44 48 52 56 60 64
> 24 17 24 28 32 36 40 44 48 52 56 60
> 28 24 17 24 28 32 36 40 44 48 52 56
> 32 28 24 17 24 28 32 36 40 44 48 52
> 36 32 28 24 17 24 28 32 36 40 44 48
> 40 36 32 28 24 17 24 28 32 36 40 44
> 44 40 36 32 28 24 17 24 28 32 36 40
> 48 44 40 36 32 28 24 17 24 28 32 36
> 52 48 44 40 36 32 28 24 17 24 28 32
> 56 52 48 44 40 36 32 28 24 17 24 28
> 60 56 52 48 44 40 36 32 28 24 17 24
> 64 60 56 52 48 44 40 36 32 28 24 17
>
> Group2:
> 20 28 36 44 52 60
> 28 20 28 36 44 52
> 36 28 20 28 36 44
> 44 36 28 20 28 36
> 52 44 36 28 20 28
> 60 52 44 36 28 20
>
> Group3:
> 24 36 52
> 36 24 36
> 52 36 24
>
> The way I am reading this is:
> IRU#1 is close to IRU#0 and #2, but #0 and #2 are far away from each other.
> So I don't think we can group 2 IRUs and keep a third one on the side
> as you said.
> How would you group these?
>
> That said, something is going wrong with the grouping code. Right now,
> it should keep 3 Group3 under the machine. I am looking at it.
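
As a quick check of the averaging described above, here is a minimal sketch
that rebuilds the Group3 matrix from the quoted Group2 matrix. It assumes
that consecutive pairs of Group2 objects form one Group3 each and that the
average is truncated to an integer; both are my assumptions for illustration,
and this is not hwloc's code.

```python
# Group2 distance matrix as quoted above.
GROUP2 = [
    [20, 28, 36, 44, 52, 60],
    [28, 20, 28, 36, 44, 52],
    [36, 28, 20, 28, 36, 44],
    [44, 36, 28, 20, 28, 36],
    [52, 44, 36, 28, 20, 28],
    [60, 52, 44, 36, 28, 20],
]

def average_blocks(dist, groups):
    """Average the distances between all members of each pair of groups.

    Self-distances are included, so the diagonal of the result is the
    average over the whole diagonal block (integer division, an assumption).
    """
    return [
        [sum(dist[i][j] for i in a for j in b) // (len(a) * len(b))
         for b in groups]
        for a in groups
    ]

# Assumed membership: Group2 objects (0,1), (2,3), (4,5) form the three Group3.
PAIRS = [(0, 1), (2, 3), (4, 5)]
group3 = average_blocks(GROUP2, PAIRS)

for row in group3:
    print(" ".join(str(d) for d in row))
```

Under these assumptions the result is the same 3x3 matrix as the one quoted
for Group3.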

So it seems to me that you basically get a distance matrix of PU objects
from the system (the machine vendor), and probably you do agglomerative
average linkage cluster analysis on it to determine the number and
hierarchy of HWLOC_OBJ_GROUP objects (beyond what can be named by some
hardware building block like core or cache etc). Is this right?
I'm wondering if this is the right approach. Did you try other distance
functions (e.g. single linkage)?
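
To make the question concrete, here is a small sketch of generic
agglomerative clustering applied to the Group3 matrix quoted above, once with
average linkage and once with single linkage. The function names and the
tie-breaking rule (lowest indices first) are my own choices for illustration;
I do not claim that hwloc works this way.

```python
from itertools import combinations

# Group3 distance matrix as quoted above.
GROUP3 = [
    [24, 36, 52],
    [36, 24, 36],
    [52, 36, 24],
]

def linkage_distance(dist, a, b, mode):
    """Distance between clusters a and b, given as tuples of indices."""
    values = [dist[i][j] for i in a for j in b]
    if mode == "single":
        return min(values)
    if mode == "average":
        return sum(values) / len(values)
    raise ValueError(f"unknown linkage mode: {mode}")

def agglomerate(dist, mode):
    """Merge the two closest clusters until one remains.

    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        # min() returns the first minimal pair, so ties go to the
        # pair enumerated first (lowest indices).
        a, b = min(combinations(clusters, 2),
                   key=lambda p: linkage_distance(dist, p[0], p[1], mode))
        merges.append((a, b, linkage_distance(dist, a, b, mode)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

merges_average = agglomerate(GROUP3, "average")
merges_single = agglomerate(GROUP3, "single")
print("average:", merges_average)
print("single: ", merges_single)
```

In both cases the first merge is a tie between (0,1) and (1,2) at distance 36,
which is broken arbitrarily. With average linkage the remaining object then
joins at 44, which suggests an extra level; with single linkage it joins at
36, the same distance as the first merge, so there is no reason to separate
two of the three objects from the third.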

Besides that, and from the viewpoint of a tree representation of the
result of clustering, I would expect that any two objects of the same
type have common ancestors of the same types. For the given UV
topology I would not expect that there are two Group3 that have a Group4
ancestor, while the 3rd Group3 is a direct child of Machine. I would
expect EITHER that the 3rd Group3 is also a child of a Group4 (maybe a
second one), OR that there is no Group4.
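
The property I have in mind can be sketched as a check on a toy tree. The
(type, children) tuples below are a hypothetical representation for
illustration only, not the hwloc API, and the NUMA nodes and lower levels are
left out.

```python
def ancestor_chains(node, chain=()):
    """Yield (type, tuple of ancestor types) for every node in the tree."""
    kind, children = node
    yield kind, chain
    for child in children:
        yield from ancestor_chains(child, chain + (kind,))

def inconsistent_types(root):
    """Return the types whose objects do not all share one ancestor chain."""
    first_chain = {}
    bad = set()
    for kind, chain in ancestor_chains(root):
        if first_chain.setdefault(kind, chain) != chain:
            bad.add(kind)
    return bad

# Topology as reported by lstopo: two Group3 below a Group4, one below Machine.
reported = ("Machine", [
    ("Group4", [("Group3", []), ("Group3", [])]),
    ("Group3", []),
])

# The two alternatives I would expect instead.
two_group4 = ("Machine", [
    ("Group4", [("Group3", []), ("Group3", [])]),
    ("Group4", [("Group3", [])]),
])
no_group4 = ("Machine", [("Group3", []), ("Group3", []), ("Group3", [])])

for name, tree in [("reported", reported),
                   ("two_group4", two_group4),
                   ("no_group4", no_group4)]:
    print(name, sorted(inconsistent_types(tree)))
```

Only the reported tree flags Group3 as inconsistent; both alternatives pass
the check.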

Sincerely BK

> Brice
>

-- 
Dr. Bernd Kallies
Konrad-Zuse-Zentrum für Informationstechnik Berlin
Takustr. 7
14195 Berlin
Tel: +49-30-84185-270
Fax: +49-30-84185-311
e-mail: kallies_at_[hidden]