Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] Bug report: topology strange on SGI UltraViolet
From: Samuel Thibault (samuel.thibault_at_[hidden])
Date: 2010-07-28 12:58:42

Bernd Kallies, on Wed 28 Jul 2010 18:09:28 +0200, wrote:
> > > topology is understandable. I'm wondering about "Group4", which
> > > contains the three "Group3" objects. lstopo should print "1534GB"
> > > instead of "1022GB". There is only one "Group4" object, and there are no
> > > other direct children of the root object.
> >
> > Indeed, there's something wrong.
> > Can you send the output of tests/linux/ so that I can try
> > to debug this from here?
> Is attached.

Actually the Group4 object doesn't contain the three Group3 objects:

$ grep 'Group[34]' gather-topology-uv.tar.gz.output
  Group4 #0 (total=1071374336KB)
    Group3 #0 (total=534634496KB)
    Group3 #1 (total=536739840KB)
  Group3 #2 (total=536739840KB)

You can also see it using
lstopo --gridsize 2 --fontsize 5
for instance.

So it seems all good to me.
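
As a quick sanity check on the numbers, here is a minimal sketch that adds up the totals from the grep output above. It assumes that the "GB" printed by lstopo means 2**20 KB and that the value is rounded to the nearest whole GB; the names GROUP3_KB, GROUP4_KB and to_gb are only for illustration.

```python
# Totals in KB, copied from the grep output above.
GROUP3_KB = [534634496, 536739840, 536739840]
GROUP4_KB = 1071374336

# Assumption: the "GB" shown by lstopo is 2**20 KB, rounded to nearest.
KB_PER_GB = 1024 * 1024


def to_gb(kb):
    """Convert a KB count to whole GB, rounding to the nearest integer."""
    return round(kb / KB_PER_GB)


# Group4 #0 only contains Group3 #0 and Group3 #1.
group4_children_kb = GROUP3_KB[0] + GROUP3_KB[1]
# The whole machine contains all three Group3 objects.
machine_kb = sum(GROUP3_KB)

print(group4_children_kb == GROUP4_KB)  # True
print(to_gb(GROUP4_KB))                 # 1022
print(to_gb(machine_kb))                # 1534
```

So "1022GB" matches a Group4 holding two Group3 objects, and "1534GB" is what all three Group3 objects add up to.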

> We have one UV rack, which is filled with 3/4 of the max. number of
> blades. According to the specs, two NUMA nodes form one "blade". This
> level corresponds to "Group0" in the hwloc topology. Two blades are
> cross-linked via the NUMAlink, forming "paired nodes" = "Group1". What
> "Group2" might correspond to - I don't know. "Group3" corresponds to one
> "chassis" or IRU. "Group4" may be an "enclosure", and "Machine" is the
> "rack".

Wow, it's impressive that hwloc actually finds out all this just from
the distance matrix :)

> In my opinion the hwloc topology for our machine should contain 2x
> Group4.

hwloc cannot find a second Group4: it infers groups from the distance
matrix. Since there is no other Group3 object to group with Group3 #2,
it has no way of knowing that some notion of Group4 exists there.
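
To illustrate the idea, here is a toy sketch of grouping from a distance matrix. It is not hwloc's actual code: the function group_by_min_distance and the matrix values are hypothetical, and the rule (put together the objects that sit at the smallest off-diagonal distance from each other) is a simplification chosen only to show why a lone object gets no enclosing group.

```python
def group_by_min_distance(dist):
    """Toy grouping: cluster objects linked by the smallest distance.

    dist is a symmetric square matrix (list of lists).
    Returns (groups, singles): groups is a list of index lists with at
    least two members, singles lists the indexes left ungrouped.
    """
    n = len(dist)
    off_diag = [dist[i][j] for i in range(n) for j in range(n) if i != j]
    # Nothing to group if there is one object or all distances are equal.
    if not off_diag or min(off_diag) == max(off_diag):
        return [], list(range(n))
    d_min = min(off_diag)

    # Union-find to merge objects connected at the minimal distance.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] == d_min:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    groups = [m for m in clusters.values() if len(m) > 1]
    singles = [m[0] for m in clusters.values() if len(m) == 1]
    return groups, singles


# Hypothetical distances between three Group3-like objects:
# objects 0 and 1 are close to each other, object 2 is far from both.
three = [[1, 2, 3],
         [2, 1, 3],
         [3, 3, 1]]
print(group_by_min_distance(three))  # ([[0, 1]], [2])

# With a fourth object close to object 2, a second group appears.
four = [[1, 2, 3, 3],
        [2, 1, 3, 3],
        [3, 3, 1, 2],
        [3, 3, 2, 1]]
print(group_by_min_distance(four))   # ([[0, 1], [2, 3]], [])
```

In the three-object case the last object has no partner at the minimal distance, so it stays a direct child of the parent, much like Group3 #2 above.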

> However, when walking the topology tree via the API, then it seems to
> contain correct details.

Yep :)