Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] hwloc on Blue Gene/Q?
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2013-02-11 02:51:16


Obviously, I should have mentioned that you must pass
--host=powerpc64-bgq-linux to configure. I will add a FAQ about this.

Brice

On 11/02/2013 01:52, Erik Schnetter wrote:
> Brice
>
> I tried using this tarball. Things didn't work. (This particular run
> used 2 MPI processes with 32 OpenMP threads each.)
>
> In my application, I first output the topology in a tree structure. (I
> do this in my application instead of via one of hwloc's tools because
> I don't want to call out to shell code.) Then I output thread
> bindings, then modify the thread bindings, then output them again.
>
> (1) The topology I find consists of 32 PUs and nothing else. I would
> have expected to find two cache levels, 16 cores, and 64 PUs.
>
> (2) When outputting the thread bindings, the program crashed. The
> lightweight core file says this was signal 6 (SIGABRT), i.e. an abort
> rather than a segfault, in a routine called ".raise".
>
> I'd be happy to help debug this. How?
>
> -erik
>
>
>
>
> On Sat, Feb 9, 2013 at 5:46 PM, Brice Goglin <Brice.Goglin_at_[hidden]>
> wrote:
>
> The new "bgq" branch now contains proper topology for BG/Q nodes
> (including cores and caches, except the prefetching cache) as well
> as support for set/get binding of the current thread or of another
> thread. No process-wide binding since I don't know how to iterate
> over all threads of a process.
>
> A tarball is available at:
>
> https://ci.inria.fr/hwloc/job/hwloc-zcustom-tarball/lastSuccessfulBuild/artifact/hwloc-1.7a1r5312.tar.gz
> (this is our new regression testing tool, I hope the tarball won't
> disappear too soon)
>
> I don't expect a lot more features so this branch will likely go
> into trunk very soon. But if you can look at it, that'll be great.
>
>
> Brice
>
>
>
> On 08/01/2013 18:06, Erik Schnetter wrote:
>> I am trying to use hwloc on a Blue Gene/Q. Building and
>> installing worked fine, and it reports the system configuration
>> fine as well (i.e. it shows all PUs). However, when I try to
>> query the thread/core bindings, hwloc crashes with an error in
>> libc's free(). This happens with both 1.6 and 1.6.1rc1.
>>
>> The error apparently occurs in CPU_FREE, called from
>> hwloc_linux_find_kernel_nr_cpus.
>>
>> Does this ring a bell with anyone? I know this is not enough
>> information to debug things, but do you have any pointers for
>> things to look at?
>>
>> I remember reading somewhere that the last bit in a cpu_set_t
>> cannot be used. A Blue Gene/Q has 64 PUs, and may be using 64-bit
>> integers to hold cpu_set_t data. Could this be an issue?
>>
>> My goal is to examine and experiment with thread/core bindings
>> with OpenMP to improve performance.
>>
>> -erik
>>
>> --
>> Erik Schnetter <schnetter_at_[hidden]>
>> http://www.perimeterinstitute.ca/personal/eschnetter/
>>
>>
>> _______________________________________________
>> hwloc-users mailing list
>> hwloc-users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
>
>
> --
> Erik Schnetter <schnetter_at_[hidden]>
> http://www.perimeterinstitute.ca/personal/eschnetter/
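
The original report above places the crash in CPU_FREE, called from
hwloc_linux_find_kernel_nr_cpus, and asks whether the size of cpu_set_t
could matter on a 64-PU node. One way to narrow this down is to exercise
glibc's dynamically sized CPU set macros in isolation, outside hwloc.
What follows is a minimal sketch, assuming Linux with glibc. The helper
names probe_affinity_mask_cpus and count_bound_cpus are hypothetical, and
this is not hwloc's implementation; it only illustrates the general
allocate, query, free, and retry-with-a-larger-mask pattern.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper, not hwloc code.
 * Starting from `start` CPUs, doubles the mask size until
 * sched_getaffinity() accepts it. Returns the accepted CPU count,
 * or -1 on allocation failure or an unexpected error. */
int probe_affinity_mask_cpus(int start)
{
    int nr_cpus;

    for (nr_cpus = (start > 0 ? start : 1); nr_cpus <= (1 << 20); nr_cpus *= 2) {
        cpu_set_t *set = CPU_ALLOC(nr_cpus);
        size_t setsize = CPU_ALLOC_SIZE(nr_cpus);
        int err, saved_errno;

        if (set == NULL)
            return -1;
        CPU_ZERO_S(setsize, set);
        err = sched_getaffinity(0, setsize, set); /* 0 = calling thread */
        saved_errno = errno;
        CPU_FREE(set); /* each allocation is freed exactly once */

        if (err == 0)
            return nr_cpus;
        if (saved_errno != EINVAL)
            return -1; /* EINVAL means "mask too small"; anything else is fatal */
    }
    return -1;
}

/* Hypothetical helper: number of CPUs the calling thread is bound to,
 * or -1 on error. */
int count_bound_cpus(void)
{
    int nr_cpus = probe_affinity_mask_cpus(64);
    cpu_set_t *set;
    size_t setsize;
    int count = -1;

    if (nr_cpus < 0)
        return -1;
    set = CPU_ALLOC(nr_cpus);
    if (set == NULL)
        return -1;
    setsize = CPU_ALLOC_SIZE(nr_cpus);
    CPU_ZERO_S(setsize, set);
    if (sched_getaffinity(0, setsize, set) == 0)
        count = CPU_COUNT_S(setsize, set);
    CPU_FREE(set);
    return count;
}
```

Calling count_bound_cpus() from each OpenMP thread and printing the
result gives a quick check of whether the C library's affinity calls
behave on the target system before hwloc is involved. Whether the
compute-node kernel on Blue Gene/Q supports these calls in the same way
as a regular Linux kernel is not something this sketch assumes.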