Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] Bug report: topology strange on SGI UltraViolet
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2010-07-28 14:36:02


On 28/07/2010 18:53, Brice Goglin wrote:
> Distance matrix between Group0 objects:
> 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66
> 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64
> 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62
> 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60
> 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58
> 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56
> 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54
> 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52
> 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50
> 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48
> 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46
> 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44
> 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42
> 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40
> 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38
> 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36
> 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34
> 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32
> 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30
> 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28
> 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26
> 62 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24
> 64 62 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22
> 66 64 62 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13
>
> Between Group1:
> 17 24 28 32 36 40 44 48 52 56 60 64
> 24 17 24 28 32 36 40 44 48 52 56 60
> 28 24 17 24 28 32 36 40 44 48 52 56
> 32 28 24 17 24 28 32 36 40 44 48 52
> 36 32 28 24 17 24 28 32 36 40 44 48
> 40 36 32 28 24 17 24 28 32 36 40 44
> 44 40 36 32 28 24 17 24 28 32 36 40
> 48 44 40 36 32 28 24 17 24 28 32 36
> 52 48 44 40 36 32 28 24 17 24 28 32
> 56 52 48 44 40 36 32 28 24 17 24 28
> 60 56 52 48 44 40 36 32 28 24 17 24
> 64 60 56 52 48 44 40 36 32 28 24 17
>
> Group2:
> 20 28 36 44 52 60
> 28 20 28 36 44 52
> 36 28 20 28 36 44
> 44 36 28 20 28 36
> 52 44 36 28 20 28
> 60 52 44 36 28 20
>
> Group3:
> 24 36 52
> 36 24 36
> 52 36 24
>

Actually, all these distance matrices (except the one for NUMA nodes,
which is not included above) show a ring topology without the link
between the first and the last object, so grouping makes no sense there.
hwloc 1.0.x groups object #2N with object #2N+1 because its grouping
algorithm isn't very clever. It could just as well group #2N-1 with #2N;
that would be no worse. The grouping algorithm in svn trunk is smarter:
it detects this ring properly and does not group anything (except pairs
of NUMA nodes).
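
To make the "ring without the closing link" observation concrete, here
is a rough Python sketch. It is not hwloc's grouping code, only an
illustration: the helper names (is_open_chain, neighbour_distances) and
the CLOSED_RING matrix are made up for this example, while GROUP2 and
GROUP3 are copied from the matrices quoted above. The check assumes that
an open chain shows up as a matrix whose entries depend only on the
index separation |i - j| and grow strictly with it.

```python
def is_open_chain(dist):
    """True if dist[i][j] depends only on |i - j| and grows strictly with it."""
    n = len(dist)
    # profile[k] is the distance at index separation k, read from the first row.
    profile = dist[0]
    for i in range(n):
        for j in range(n):
            if dist[i][j] != profile[abs(i - j)]:
                return False
    # Strict growth means the last object is the farthest from the first,
    # i.e. there is no link closing the ring.
    return all(profile[k] < profile[k + 1] for k in range(n - 1))


def neighbour_distances(dist, offset):
    """Distinct distances inside pairs (offset, offset+1), (offset+2, offset+3), ..."""
    return {dist[i][i + 1] for i in range(offset, len(dist) - 1, 2)}


# Copied from the Group2 and Group3 matrices quoted above.
GROUP2 = [
    [20, 28, 36, 44, 52, 60],
    [28, 20, 28, 36, 44, 52],
    [36, 28, 20, 28, 36, 44],
    [44, 36, 28, 20, 28, 36],
    [52, 44, 36, 28, 20, 28],
    [60, 52, 44, 36, 28, 20],
]
GROUP3 = [
    [24, 36, 52],
    [36, 24, 36],
    [52, 36, 24],
]

# Hypothetical closed ring of 6 objects, for contrast only (not from the machine).
CLOSED_RING = [
    [20 + 8 * min(abs(i - j), 6 - abs(i - j)) for j in range(6)]
    for i in range(6)
]

if __name__ == "__main__":
    print("Group2 open chain:", is_open_chain(GROUP2))
    print("Group3 open chain:", is_open_chain(GROUP3))
    print("Closed ring open chain:", is_open_chain(CLOSED_RING))
    # Pairing #2N with #2N+1 versus #2N-1 with #2N: same neighbour distance.
    print(neighbour_distances(GROUP2, 0), neighbour_distances(GROUP2, 1))
```

Comparing neighbour_distances with offset 0 and offset 1 is the point
about #2N/#2N+1 versus #2N-1/#2N: in such a chain both pairings join
objects at the same distance, so neither is a better grouping than the
other.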

It's actually surprising that this machine doesn't show a better
distance matrix. I guess SGI still has a hypercube or some other nice
topology interconnecting IRUs and blades. Older Altix machines had very
nice distance matrices where we would detect multiple levels of groups
that really showed the physical hierarchy of blades/IRUs/... I wonder if
your SGI BIOS is buggy :)

Michael Raymond, anything to say about this?

Brice