
Hardware Locality Users' Mailing List Archives


Subject: Re: [hwloc-users] hwloc on Blue Gene/Q?
From: Jeff Hammond (jhammond_at_[hidden])
Date: 2013-01-08 17:33:29


As a temporary, non-portable substitute for hwloc, you can use the SPI
calls that are described on my Wiki:
https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q#Node_topology.
I presume that this is the means by which hwloc will support BGQ when
it does.

Blue Gene/Q has 16+1 cores with 4 hardware threads each. Only 16 cores
are visible to applications, but because users can, in theory, run code
on the 17th core (see https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q#17th_Core_App_Agents
for how), it is important for these functions to return core IDs in the
range 0..16 and hardware thread IDs in the range 0..67, instead of 0..15
and 0..63. I include this information in case users are confused about
the additional range documented for these calls.
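
To make the two ranges concrete, here is a minimal sketch in plain C of
how a hardware thread ID in 0..67 could be split into a core ID in 0..16
and a thread-within-core index in 0..3. It does not call the SPI. The
names (bgq_core_of, bgq_thread_of, bgq_is_app_core, bgq_print_location)
are hypothetical, and the numbering hwthread = 4*core + thread is an
assumption for illustration, not something the SPI documentation quoted
here confirms.

```c
#include <assert.h>
#include <stdio.h>

/* Counts taken from the message: 16 application cores plus the 17th core,
 * 4 hardware threads per core, so 68 hardware threads in total. */
enum {
    BGQ_APP_CORES        = 16,
    BGQ_CORES            = 17,
    BGQ_THREADS_PER_CORE = 4,
    BGQ_HW_THREADS       = BGQ_CORES * BGQ_THREADS_PER_CORE
};

/* ASSUMPTION: hardware threads are numbered 4*core + thread.
 * Returns the core ID (0..16), or -1 if hwthread is outside 0..67. */
static int bgq_core_of(int hwthread)
{
    if (hwthread < 0 || hwthread >= BGQ_HW_THREADS)
        return -1;
    return hwthread / BGQ_THREADS_PER_CORE;
}

/* Returns the thread index within its core (0..3), or -1 if out of range. */
static int bgq_thread_of(int hwthread)
{
    if (hwthread < 0 || hwthread >= BGQ_HW_THREADS)
        return -1;
    return hwthread % BGQ_THREADS_PER_CORE;
}

/* Nonzero if the core is one of the 16 cores visible to applications. */
static int bgq_is_app_core(int core)
{
    return core >= 0 && core < BGQ_APP_CORES;
}

/* Prints one line describing where a valid hardware thread ID lives. */
static void bgq_print_location(int hwthread)
{
    assert(hwthread >= 0 && hwthread < BGQ_HW_THREADS);
    int core = bgq_core_of(hwthread);
    printf("hwthread %d -> core %d, thread %d (%s)\n",
           hwthread, core, bgq_thread_of(hwthread),
           bgq_is_app_core(core) ? "application core" : "17th core");
}
```

Under that assumed numbering, IDs 64..67 fall on the 17th core, which is
why code that sizes its tables for 0..63 would miss them.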

Best,

Jeff

On Tue, Jan 8, 2013 at 11:10 AM, Brice Goglin <Brice.Goglin_at_[hidden]> wrote:
> Hello Erik,
> We need specific BGQ binding support; the binding API is different. Also, we
> don't properly detect the 16 4-way cores; we only see 64 identical
> PUs.
> I am supposed to get a BGQ account in the near future so I hope I will have
> everything working in v1.7.
> Stay tuned
> Brice
>
>
>
>
> On 08/01/2013 18:06, Erik Schnetter wrote:
>
> I am trying to use hwloc on a Blue Gene/Q. Building and installing worked
> fine, and it reports the system configuration fine as well (i.e. it shows
> all PUs). However, when I try to inquire the thread/core bindings, hwloc
> crashes with an error in libc's free(). This is both with 1.6 and 1.6.1rc1.
>
> The error occurs apparently in CPU_FREE called from
> hwloc_linux_find_kernel_nr_cpus.
>
> Does this ring a bell with anyone? I know this is not enough information to
> debug things, but do you have any pointers for things to look at?
>
> I remember reading somewhere that the last bit in a cpu_set_t cannot be
> used. A Blue Gene/Q has 64 PUs, and may be using 64-bit integers to hold
> cpu_set_t data. Could this be an issue?
>
> My goal is to examine and experiment with thread/core bindings with OpenMP
> to improve performance.
>
> -erik
>
> --
> Erik Schnetter <schnetter_at_[hidden]>
> http://www.perimeterinstitute.ca/personal/eschnetter/
>
>
> _______________________________________________
> hwloc-users mailing list
> hwloc-users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>

-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond_at_[hidden] / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond