Hardware Locality Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [hwloc-devel] hwloc-1.4 assertion failures on Linux/POWER7
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2012-02-02 04:53:14


On 01/02/2012 04:12, Paul H. Hargrove wrote:
> The problem I reported below also exists in hwloc-1.4.1.
> Additionally, I can reproduce the SEGVs with xlc which Chris Samuel
> reported in
> http://www.open-mpi.org/community/lists/hwloc-devel/2012/01/2738.php
>
> -Paul
>
> On 1/31/2012 5:56 PM, Paul H. Hargrove wrote:
>> When running "make check" in hwloc-1.3.1 on a Linux/POWER7 system I see:
>>> lt-linux-libnuma:
>>> /users/phh1/OMPI/hwloc-1.3.1-linux-ppc64-gcc//hwloc-1.3.1/tests/linux-libnuma.c:53:
>>> main: Assertion `hwloc_bitmap_isequal(set, set2)' failed.
>>> /bin/sh: line 5: 21415 Aborted ${dir}$tst
>>> FAIL: linux-libnuma

Unfortunately, I don't think I will be able to reproduce this one here.
This machine has three NUMA nodes: #0 has many CPUs, #1 doesn't exist,
and #2 and #3 have memory with CPUs. I can't emulate libnuma in such an
environment, so debugging the linux-libnuma tests is hard.

Can you add the following code just above this assert in
tests/linux-libnuma.c:53 and report what it prints?

{ char *a, *b;
  /* hwloc_bitmap_asprintf() allocates the strings; free them afterwards */
  hwloc_bitmap_asprintf(&a, set);
  hwloc_bitmap_asprintf(&b, set2);
  printf("got %s instead of %s\n", b, a);
  free(a);
  free(b);
}

>>
>> The xlc-built hwloc-1.3.1 also fails an additional test:
>>> lt-glibc-sched:
>>> /users/phh1/OMPI/hwloc-1.3.1-linux-ppc64-xlc-11.1//hwloc-1.3.1/tests/glibc-sched.c:43:
>>> main: Assertion `!err' failed.
>>> /bin/sh: line 5: 7077 Aborted ${dir}$tst
>>> FAIL: glibc-sched

This one should go away once sched_setaffinity vs XLC problems are fixed.

Brice