Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] 1.3.2rc1 has escaped
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2012-02-20 16:26:09


On 08/02/2012 22:33, Paul H. Hargrove wrote:
> Tests on the virtual node I have access to where that problem report
> originated is still not quite right.
> There is now a different assertion failing than I had seen before:
>> lt-linux-libnuma:
>> /users/phh1/OMPI/hwloc-1.3.2rc1-linux-ppc64-gcc//hwloc-1.3.2rc1/tests/linux-libnuma.c:83:
>> main: Assertion `!memcmp(&nodemask, &numa_all_nodes,
>> sizeof(nodemask_t))' failed.
>> /bin/sh: line 5: 19416 Aborted ${dir}$tst
>> FAIL: linux-libnuma
>
> I don't have any clue if that represents forward or backward progress.

Can you try the attached patch?

It removes nodemask checks (this deprecated interface is too
buggy/strange in libnuma, no way to assert its behavior reliably).
Then, it fixes the libnuma helpers to properly use os_index instead of
logical_index (important on your machine because node ids are sparse).
And finally it makes sure the test actually checks what we want
(shouldn't matter in your case).
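
To illustrate why the os_index/logical_index mix-up only shows up when
node ids are sparse, here is a minimal, self-contained sketch. It does
not use the hwloc or libnuma APIs: the struct numa_node type and both
helper functions are hypothetical stand-ins, and a plain 64-bit word
plays the role of a node mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a NUMA node object; NOT the hwloc struct. */
struct numa_node {
    unsigned logical_index; /* dense position in the topology: 0, 1, 2, ... */
    unsigned os_index;      /* id the operating system uses for this node */
};

/* Buggy variant: sets the bit numbered by logical_index. */
static uint64_t mask_from_logical(const struct numa_node *nodes, size_t n)
{
    uint64_t mask = 0;
    for (size_t i = 0; i < n; i++) {
        assert(nodes[i].logical_index < 64); /* sketch only handles ids 0..63 */
        mask |= UINT64_C(1) << nodes[i].logical_index;
    }
    return mask;
}

/* Fixed variant: sets the bit numbered by os_index. */
static uint64_t mask_from_os(const struct numa_node *nodes, size_t n)
{
    uint64_t mask = 0;
    for (size_t i = 0; i < n; i++) {
        assert(nodes[i].os_index < 64); /* sketch only handles ids 0..63 */
        mask |= UINT64_C(1) << nodes[i].os_index;
    }
    return mask;
}
```

With two nodes whose OS ids are 0 and 16, the first function sets bits
0 and 1 while the second sets bits 0 and 16. On a machine with dense
ids (0, 1, 2, ...) both give the same mask, which is why the bug can
go unnoticed there.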

I've tested this on your topology, an 8-node machine with out-of-order
numa node ids, and some basic nodes, with both a recent and an older
libnuma release.

My current plan is to apply all of these changes to all branches and
then remove the nodemask conversion helpers from trunk.

Brice