Obviously, I should have mentioned that you must pass --host=powerpc64-bgq-linux to configure. I will add a FAQ about this.


On 11/02/2013 01:52, Erik Schnetter wrote:

I tried using this tarball. Things didn't work. (This particular run used 2 MPI processes with 32 OpenMP threads each.)

In my application, I first output the topology in a tree structure. (I do this in my application instead of via one of hwloc's tools because I don't want to call out to shell code.) Then I output thread bindings, then modify the thread bindings, then output them again.
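The output, modify, output sequence for thread bindings can also be sketched without hwloc, as an independent cross-check. The sketch below is not the hwloc code path. It assumes a glibc-style Linux environment where `sched_getaffinity` and `sched_setaffinity` with pid 0 act on the calling thread, and the helper names `format_current_binding` and `bind_current_thread` are hypothetical. Whether the Blue Gene/Q compute node kernel honours these calls is an open assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: writes the calling thread's allowed CPUs into buf
 * as a comma-separated list (for example "0,1,5").
 * Returns the number of CPUs in the binding, or -1 on error. */
int format_current_binding(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    cpu_set_t set;
    CPU_ZERO(&set);
    /* pid 0 means "the calling thread" */
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;

    size_t used = 0;
    int count = 0;
    buf[0] = '\0';
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &set))
            continue;
        int n = snprintf(buf + used, len - used, "%s%d",
                         count ? "," : "", cpu);
        if (n < 0 || (size_t)n >= len - used)
            return -1; /* buffer too small */
        used += (size_t)n;
        count++;
    }
    return count;
}

/* Hypothetical helper: binds the calling thread to a single CPU.
 * Returns 0 on success, -1 on error (errno is set by the kernel). */
int bind_current_thread(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling `format_current_binding`, then `bind_current_thread`, then `format_current_binding` again from inside each OpenMP thread would give a second view of the bindings to compare against what hwloc reports.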

(1) The topology I find consists of 32 PUs and nothing else. I would have expected to find two cache levels, 16 cores, and 64 PUs.

(2) When outputting the thread bindings, the program aborted. The lightweight core file says this was signal 6 (SIGABRT) in a routine called ".raise".

I'd be happy to help debug this. How?


On Sat, Feb 9, 2013 at 5:46 PM, Brice Goglin <Brice.Goglin@inria.fr> wrote:
The new "bgq" branch now contains proper topology for BG/Q nodes (including cores and caches, except the prefetching cache) as well as support for set/get binding of the current thread or of another thread. There is no process-wide binding, since I don't know how to iterate over all threads of a process.

A tarball is available at:
(this is our new regression testing tool, I hope the tarball won't disappear too soon)

I don't expect a lot more features so this branch will likely go into trunk very soon. But if you can look at it, that'll be great.


On 08/01/2013 18:06, Erik Schnetter wrote:
I am trying to use hwloc on a Blue Gene/Q. Building and installing worked fine, and it reports the system configuration fine as well (i.e. it shows all PUs). However, when I try to query the thread/core bindings, hwloc crashes with an error in libc's free(). This happens with both 1.6 and 1.6.1rc1.

The error apparently occurs in CPU_FREE, called from hwloc_linux_find_kernel_nr_cpus.

Does this ring a bell with anyone? I know this is not enough information to debug things, but do you have any pointers for things to look at?

I remember reading somewhere that the last bit in a cpu_set_t cannot be used. A Blue Gene/Q has 64 PUs, and may be using 64-bit integers to hold cpu_set_t data. Could this be an issue?
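One way to probe the cpu_set_t question in isolation is a small test that allocates a dynamically sized set, sets only the highest bit, reads it back, and frees the set. The sketch below uses the glibc macros `CPU_ALLOC`, `CPU_ALLOC_SIZE`, `CPU_ZERO_S`, `CPU_SET_S`, `CPU_ISSET_S`, `CPU_COUNT_S` and `CPU_FREE`. The function name `last_cpu_bit_usable` is hypothetical, and this is not the code in hwloc_linux_find_kernel_nr_cpus, only a standalone check of the same allocate/free pattern.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the highest CPU index (ncpus - 1) can
 * be set and read back in a dynamically allocated cpu_set_t, 0 if it
 * cannot, and -1 if the allocation itself fails. */
int last_cpu_bit_usable(int ncpus)
{
    assert(ncpus > 0);

    cpu_set_t *set = CPU_ALLOC(ncpus);
    if (set == NULL)
        return -1;

    /* Size in bytes of a set able to hold CPUs 0 .. ncpus-1 */
    size_t size = CPU_ALLOC_SIZE(ncpus);

    CPU_ZERO_S(size, set);
    CPU_SET_S(ncpus - 1, size, set);

    /* The last bit must be set, and it must be the only bit set */
    int ok = CPU_ISSET_S(ncpus - 1, size, set)
             && CPU_COUNT_S(size, set) == 1;

    CPU_FREE(set); /* a crash here would mirror the reported free() error */
    return ok ? 1 : 0;
}
```

Running `last_cpu_bit_usable(64)` on a compute node would show whether the last bit or the CPU_FREE call misbehaves there. A result other than 1, or a crash inside the function, would point at the C library on that platform rather than at hwloc.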

My goal is to examine and experiment with thread/core bindings with OpenMP to improve performance.

hwloc-users mailing list

Erik Schnetter <schnetter@gmail.com> http://www.perimeterinstitute.ca/personal/eschnetter/