Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] hwloc-1.3.1 assertion failures on Linux/POWER7
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2012-02-01 08:20:22


On 01/02/2012 03:49, Christopher Samuel wrote:
> With XLC and 1.3.1 and 1.4 I get plenty of warnings (compile logs for
> both attached) whilst compiling and then 4 failures in make check
> (accompanied with segmentation faults):
>
> samuel_at_tambo:~/HWLOC/hwloc-1.3.1> grep -B1 FAIL: log
> /bin/sh: line 1: 5267 Segmentation fault ${dir}$tst
> FAIL: hwloc_bind
> /bin/sh: line 1: 5285 Segmentation fault ${dir}$tst
> FAIL: hwloc_get_last_cpu_location
> /bin/sh: line 1: 5335 Segmentation fault ${dir}$tst
> FAIL: hwloc_is_thissystem
> /bin/sh: line 1: 5481 Segmentation fault ${dir}$tst
> FAIL: glibc-sched

All of these tests involve binding, which is likely broken (see below).

"/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28:
1506-1385 (W) The attribute "pure" is not a valid type attribute.
  CC traversal.lo

The pure attribute is placed before the function name. I'll move it
after; XLC doesn't seem to warn in that case.
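
As a rough illustration of the two placements (a sketch only, not the
actual hwloc declarations; HYP_ATTRIBUTE_PURE, hyp_sum_before and
hyp_sum_after are hypothetical names), the attribute can sit between the
return type and the function name, or after the parameter list of a
declaration. GCC accepts both; whether a given XLC release warns on
either one is an assumption based on the log above.

```c
#include <stddef.h>

/* Hypothetical macro for this sketch; hwloc uses its own attribute macros. */
#if defined(__GNUC__)
# define HYP_ATTRIBUTE_PURE __attribute__((pure))
#else
# define HYP_ATTRIBUTE_PURE
#endif

/* Attribute before the function name: roughly the form XLC warned about. */
static int HYP_ATTRIBUTE_PURE hyp_sum_before(const int *v, size_t n);

/* Attribute after the parameter list of the declaration. */
static int hyp_sum_after(const int *v, size_t n) HYP_ATTRIBUTE_PURE;

/* Definitions carry no attribute: GCC rejects an attribute placed after
 * the declarator of a function definition. */
static int hyp_sum_before(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

static int hyp_sum_after(const int *v, size_t n)
{
    return hyp_sum_before(v, n);
}
```

Keeping the attribute on a separate prototype means the definition
itself stays free of compiler-specific syntax.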

"distances.c", line 62.42: 1506-404 (W) restrict can only qualify a
pointer type.
"distances.c", line 84.50: 1506-404 (W) restrict can only qualify a
pointer type.
"distances.c", line 226.40: 1506-404 (W) restrict can only qualify a
pointer type.

XLC may be wrong here: topology_t is typedef'ed to a pointer type, so
restrict should be allowed to qualify it.
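
A minimal sketch of the construct in question, using hypothetical names
(hyp_topology, hyp_topology_t, hyp_count) rather than the real hwloc
types: in C99, restrict may qualify any pointer-to-object type,
including one reached through a typedef.

```c
/* Hypothetical stand-in for an opaque topology structure. */
struct hyp_topology {
    int nb_objs;
};

/* The typedef names a pointer type, as the topology handle does. */
typedef struct hyp_topology *hyp_topology_t;

/* restrict applied to a typedef'ed pointer and to a plain pointer.
 * Both parameters are pointer types, so both qualifications are valid C99. */
static int hyp_count(hyp_topology_t restrict topology,
                     const int * restrict extra)
{
    return topology->nb_objs + *extra;
}
```

A C99 compiler such as GCC should accept this without a diagnostic,
which would support the suspicion that the XLC warning is spurious.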

"topology-linux.c", line 303.33: 1506-280 (W) Function argument
assignment between types "unsigned int" and "struct {...}*" is not allowed.
"topology-linux.c", line 303.27: 1506-098 (E) Missing argument(s).
"topology-linux.c", line 391.32: 1506-280 (W) Function argument
assignment between types "unsigned int" and "struct {...}*" is not allowed.
"topology-linux.c", line 391.26: 1506-098 (E) Missing argument(s).
"topology-linux.c", line 715.40: 1506-280 (W) Function argument
assignment between types "unsigned int" and "struct {...}*" is not allowed.
"topology-linux.c", line 715.34: 1506-098 (E) Missing argument(s).
"topology-linux.c", line 807.40: 1506-280 (W) Function argument
assignment between types "unsigned int" and "struct {...}*" is not allowed.
"topology-linux.c", line 807.34: 1506-098 (E) Missing argument(s).

This looks very bad. It means something broke the already very complex
sched_setaffinity detection code.
Does XLC define its own sched_setaffinity functions? Can you find the
relevant header file and send it?
PGI had similar problems at some point. That's very annoying.
This explains why the binding tests broke.
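
For reference, here is a small sketch of the three-argument prototype
that current glibc declares: (pid, cpusetsize, mask). Older glibc
releases shipped variants with a different argument list, which is why
the prototype has to be detected at configure time. One reading of the
diagnostics above, not confirmed in this thread, is that a two-argument
call reached a prototype whose second parameter is the size.
hyp_rebind_to_current_mask is a hypothetical helper, not hwloc code.

```c
#define _GNU_SOURCE   /* must come before any system header */
#include <sched.h>

/* Read the calling thread's affinity mask and apply it again using the
 * three-argument prototype. Returns 0 on success, -1 on any failure. */
static int hyp_rebind_to_current_mask(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    if (CPU_COUNT(&set) < 1)
        return -1;
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return 0;
}
```

If the configure-time detection picks a different prototype than the
one the headers really declare, calls like these end up with mismatched
arguments and undefined behavior at run time.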

Brice