
Hardware Locality Development Mailing List Archives


Subject: Re: [hwloc-devel] hwloc-1.4 assertion failures on Linux/POWER7
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2012-02-02 07:47:15


On 02/02/2012 10:53, Brice Goglin wrote:
> On 01/02/2012 04:12, Paul H. Hargrove wrote:
>> The problem I reported below also exists in hwloc-1.4.1.
>> Additionally, I can reproduce the SEGVs with xlc which Chris Samuel
>> reported in
>> http://www.open-mpi.org/community/lists/hwloc-devel/2012/01/2738.php
>>
>> -Paul
>>
>> On 1/31/2012 5:56 PM, Paul H. Hargrove wrote:
>>> When running "make check" in hwloc-1.3.1 on a Linux/POWER7 system I see:
>>>> lt-linux-libnuma:
>>>> /users/phh1/OMPI/hwloc-1.3.1-linux-ppc64-gcc//hwloc-1.3.1/tests/linux-libnuma.c:53:
>>>> main: Assertion `hwloc_bitmap_isequal(set, set2)' failed.
>>>> /bin/sh: line 5: 21415 Aborted ${dir}$tst
>>>> FAIL: linux-libnuma
> I don't think I will be able to reproduce this one here unfortunately.
> This machine has three NUMA nodes: #0 has many CPUs. #1 doesn't exist.
> #2 and #3 have memory with CPUs. I can't emulate libnuma in such an
> environment. So debugging the linux-libnuma tests is hard.
>
> Can you add the following code just above this assert in
> tests/linux-libnuma.c:53 and report what it says?
>
> { char *a, *b;
> /* hwloc_bitmap_asprintf() allocates the strings */
> hwloc_bitmap_asprintf(&a, set);
> hwloc_bitmap_asprintf(&b, set2);
> printf("got %s instead of %s\n", b, a);
> free(a); free(b);
> }
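
For readers following the thread, here is a minimal plain-C sketch of the kind of mismatch that debug printf would expose on a machine whose NUMA node IDs are sparse (#0, #2, #3, with no #1). It is only an illustration under stated assumptions: it does not use hwloc or libnuma, the type toy_nodeset and all three helper functions are hypothetical names invented for this sketch, and it is not a diagnosis of the actual failure reported above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a node set (NOT hwloc_bitmap_t):
 * bit i set means NUMA node #i is present. */
typedef uint64_t toy_nodeset;

/* Build the set from the real, possibly sparse, node IDs. */
static toy_nodeset set_from_ids(const unsigned *ids, size_t n)
{
    toy_nodeset s = 0;
    for (size_t i = 0; i < n; i++)
        if (ids[i] < 64)
            s |= (toy_nodeset)1 << ids[i];
    return s;
}

/* Build the set by (wrongly) assuming nodes are numbered 0..n-1. */
static toy_nodeset set_assuming_contiguous(size_t n)
{
    return n >= 64 ? ~(toy_nodeset)0 : (((toy_nodeset)1 << n) - 1);
}

/* Write a "got X instead of Y" message, mirroring the debug printf
 * in the quoted snippet. Returns 1 if the two sets are equal, else 0. */
static int report_sets(toy_nodeset expected, toy_nodeset got,
                       char *buf, size_t len)
{
    snprintf(buf, len, "got 0x%08llx instead of 0x%08llx",
             (unsigned long long)got, (unsigned long long)expected);
    return expected == got;
}
```

With ids = {0, 2, 3}, set_from_ids() yields 0xd while set_assuming_contiguous(3) yields 0x7, so report_sets() returns 0 and the buffer holds "got 0x00000007 instead of 0x0000000d". Seeing both sets printed side by side is what makes this sort of off-by-one-node discrepancy easy to spot.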

I just pushed 3 fixes to trunk. Please try again with the next nightly
build and report any failures. I am not very confident about the result
because libnuma may have yet another crazy behavior when it sees your
missing NUMA node #1 (in my setup the node exists but is empty). Let's
hope I have at least fixed everything except this test.

Brice