Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] hwloc on Blue Gene/Q?
From: Erik Schnetter (schnetter_at_[hidden])
Date: 2013-01-08 20:50:16


Jeff,

Thanks, this is helpful. I am mostly interested in finding out which
threads share the D1 cache. I guess that get_bgq_core returns this
information.

Is there a way to guarantee that this association doesn't change at run
time? I guess I could just check periodically...
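
A minimal sketch of the grouping I have in mind, using only the counts
from the quoted message (16+1 cores, 4 hardware threads each). It assumes
that hardware thread IDs are numbered consecutively within a core, so that
IDs 0..3 belong to core 0, 4..7 to core 1, and so on, and that the D1 cache
is per core. The names bgq_core_of_hwthread and bgq_share_core are
hypothetical helpers, not part of the SPI or of hwloc; the real IDs would
come from the SPI calls on the wiki.

```c
#include <stdbool.h>

/* Counts from the quoted message: 16 application cores plus the 17th
 * core, each with 4 hardware threads, giving IDs 0..16 and 0..67. */
enum {
    BGQ_CORES            = 17,
    BGQ_THREADS_PER_CORE = 4,
    BGQ_HW_THREADS       = BGQ_CORES * BGQ_THREADS_PER_CORE
};

/* ASSUMPTION: hardware thread IDs are consecutive within a core.
 * Returns the core index (0..16), or -1 if the ID is out of range. */
static int bgq_core_of_hwthread(int hwthread)
{
    if (hwthread < 0 || hwthread >= BGQ_HW_THREADS)
        return -1;
    return hwthread / BGQ_THREADS_PER_CORE;
}

/* True if both hardware threads map to the same core and would
 * therefore share that core's D1 cache under the assumption above. */
static bool bgq_share_core(int a, int b)
{
    int core_a = bgq_core_of_hwthread(a);
    int core_b = bgq_core_of_hwthread(b);
    return core_a >= 0 && core_a == core_b;
}
```

Re-running the lookup on a freshly queried thread ID and comparing it with
the value recorded at startup would be one way to notice a change.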

-erik

On Tue, Jan 8, 2013 at 5:33 PM, Jeff Hammond <jhammond_at_[hidden]> wrote:

> As a temporary, non-portable substitute for hwloc, you can use the SPI
> calls that are described on my Wiki:
> https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q#Node_topology.
> I presume that this is the means by which hwloc will support BGQ when
> it does.
>
> Blue Gene/Q has 16+1 cores with 4 hw threads each. Only 16 cores are
> visible to applications but as users can, in theory, run code on the
> 17th core (see
> https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q#17th_Core_App_Agents
> for how), it is important for these functions to return values in the
> range 0..16 and 0..67 instead of 0..15 and 0..63. I include this
> information in case users are confused about the additional range
> documented for these calls.
>
> Best,
>
> Jeff
>
> On Tue, Jan 8, 2013 at 11:10 AM, Brice Goglin <Brice.Goglin_at_[hidden]> wrote:
> > Hello Erik,
> > We need specific BGQ binding support; the binding API is different. Also,
> > we don't properly detect the 16 4-way cores; we only see 64 identical
> > PUs.
> > I am supposed to get a BGQ account in the near future, so I hope I will
> > have everything working in v1.7.
> > Stay tuned
> > Brice
> >
> >
> >
> >
> > Le 08/01/2013 18:06, Erik Schnetter a écrit :
> >
> > I am trying to use hwloc on a Blue Gene/Q. Building and installing worked
> > fine, and it reports the system configuration fine as well (i.e. it shows
> > all PUs). However, when I try to query the thread/core bindings, hwloc
> > crashes with an error in libc's free(). This happens with both 1.6 and
> > 1.6.1rc1.
> >
> > The error apparently occurs in CPU_FREE called from
> > hwloc_linux_find_kernel_nr_cpus.
> >
> > Does this ring a bell with anyone? I know this is not enough information
> > to debug things, but do you have any pointers for things to look at?
> >
> > I remember reading somewhere that the last bit in a cpu_set_t cannot be
> > used. A Blue Gene/Q has 64 PUs, and may be using 64-bit integers to hold
> > cpu_set_t data. Could this be an issue?
> >
> > My goal is to examine and experiment with thread/core bindings with
> > OpenMP to improve performance.
> >
> > -erik
> >
> > --
> > Erik Schnetter <schnetter_at_[hidden]>
> > http://www.perimeterinstitute.ca/personal/eschnetter/
> >
> >
> > _______________________________________________
> > hwloc-users mailing list
> > hwloc-users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
>
> --
> Jeff Hammond
> Argonne Leadership Computing Facility
> University of Chicago Computation Institute
> jhammond_at_[hidden] / (630) 252-5381
> http://www.linkedin.com/in/jeffhammond
> https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond

-- 
Erik Schnetter <schnetter_at_[hidden]>
http://www.perimeterinstitute.ca/personal/eschnetter/