Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] BGQ question.
From: Biddiscombe, John A. (biddisco_at_[hidden])
Date: 2014-03-25 04:56:02


Brice

Looking at /proc/cpuinfo on the io node itself, I see only 60 cores listed. I wonder if they’ve reserved one socket of 4 cores for IO purposes and in fact hwloc is seeing the correct information.

Attached is the foo zip of the run just now (assuming it doesn’t bounce)

JB
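
The arithmetic in the message above (60 entries in /proc/cpuinfo, grouped four per socket, giving 15 sockets) can be checked with a short script. This is a minimal Python sketch, not part of the original thread: the names count_processors and groups_of are made up for illustration, and it assumes each logical CPU appears as a line whose key is exactly "processor", which is the usual Linux layout but was not shown in the thread.

```python
import os


def count_processors(cpuinfo_text):
    """Count 'processor' entries in /proc/cpuinfo-style text.

    Assumption: one 'processor : N' line per logical CPU.
    """
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _value = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count


def groups_of(n_processors, per_group=4):
    """Return (full groups, leftover) when CPUs are grouped per_group at a time.

    per_group=4 mirrors the "socket of 4" grouping mentioned in the thread.
    """
    return divmod(n_processors, per_group)


if __name__ == "__main__":
    path = "/proc/cpuinfo"
    if os.path.exists(path):
        with open(path) as f:
            n = count_processors(f.read())
        full, leftover = groups_of(n)
        print("%d processor entries: %d groups of 4, %d left over"
              % (n, full, leftover))
    else:
        print("%s not available on this system" % path)
```

If the script reports 60 entries with nothing left over on the IO node, that would agree with the 15 sockets hwloc shows and suggest the kernel itself only exposes 60 CPUs.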

From: Brice Goglin [mailto:Brice.Goglin_at_[hidden]]
Sent: 25 March 2014 09:28
To: Biddiscombe, John A.; Hardware locality user list
Subject: Re: [hwloc-users] BGQ question.

Can you run hwloc-gather-topology foo and send the resulting foo.tar.bz2 ?
If the tarball is too big, feel free to send it to me in a private mail.

Brice



On 25/03/2014 08:55, Biddiscombe, John A. wrote:
Brice,

Correct: the IO nodes are running a full Linux install (RHEL 6.4) on the same hardware as the CNK nodes.

On vesta I do not have an account and I am not certain the IO nodes are available for direct login. I’m using the BGQ at CSCS which is an EPFL machine. The IO nodes are open for some special projects where we are trying to customise the IO.

JB

From: Brice Goglin [mailto:brice.goglin_at_[hidden]]
Sent: 25 March 2014 08:43
To: Hardware locality user list; Biddiscombe, John A.
Subject: Re: [hwloc-users] BGQ question.

Wait, I missed the "io node" part of your first mail. The BGQ support is for compute nodes running CNK. Are the IO nodes running Linux on the same hardware as the compute nodes?

I have an account on vesta. Where should I log on to have a look?
Brice


On 25 March 2014 08:12:58 UTC+01:00, "Biddiscombe, John A." <biddisco_at_[hidden]> wrote:
Brice,


lstopo --whole-system


gives the same output and setting env var BG_THREADMODEL=2 does not appear to make any visible difference.


my configure command for compiling hwloc had no special options,
./configure --prefix=/gpfs/bbp.cscs.ch/home/biddisco/apps/clang/hwloc-1.8.1


should I rerun with something set?


Thanks


JB




From: hwloc-users [mailto:hwloc-users-bounces_at_[hidden]] On Behalf Of Brice Goglin
Sent: 25 March 2014 08:04
To: Hardware locality user list
Subject: Re: [hwloc-users] BGQ question.


On 25/03/2014 07:51, Biddiscombe, John A. wrote:
I’m compiling hwloc using clang (bgclang++11 from ANL) to run on IO nodes of a BGQ. It seems to have compiled OK, and when I run lstopo, I get an output like this (below), which looks reasonable, but there are 15 sockets instead of 16. I’m a little worried because the first time I compiled, I had problems where apps would report an error from HWLOC on start and tell me to set HWLOC_FORCE_BGQ=1. When I did set this env var, it would then report that “topology became empty” and the app would segfault, presumably due to the unexpected return from hwloc.

Can you give a few more details on what you did there? I'd like to check whether that case should be better supported.


I wiped everything and recompiled (not sure what I did differently), and now it behaves more sensibly, but with 15 instead of 16 sockets.

Should I be worried?

The topology detection is hardwired, so you shouldn't be worried on the hardware side.
The problem could be related to how you reserved resources before running lstopo.
Does lstopo --whole-system see more sockets?
Does BG_THREADMODEL=2 help?

Brice
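
The two checks suggested above (rerunning lstopo with --whole-system, and with BG_THREADMODEL=2 set) can be scripted so the socket counts are compared side by side. This is a rough Python sketch under stated assumptions, not something from the thread: the helper names are hypothetical, lstopo is assumed to be on PATH and to print its text rendering to standard output, and sockets are assumed to appear as "Socket L#<n>" in that text, as in hwloc 1.x console output.

```python
import os
import subprocess

# The three configurations discussed in the thread:
# (label, extra lstopo arguments, extra environment variables).
RUNS = [
    ("default", [], {}),
    ("--whole-system", ["--whole-system"], {}),
    ("BG_THREADMODEL=2", [], {"BG_THREADMODEL": "2"}),
]


def count_sockets(lstopo_text):
    """Count lines that mention a socket object.

    Assumption: each socket is printed once, as 'Socket L#<n>'.
    """
    return sum(1 for line in lstopo_text.splitlines() if "Socket L#" in line)


def run_lstopo(extra_args, extra_env):
    """Run lstopo with extra arguments and environment, return its stdout."""
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(
        ["lstopo"] + list(extra_args),
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


def compare():
    """Print the socket count seen under each configuration."""
    for label, args, env in RUNS:
        print(label, count_sockets(run_lstopo(args, env)))
```

Calling compare() on the IO node would print one count per configuration. Identical counts across all three would match what JB reports further up the thread, namely that neither option changes the output.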

________________________________



hwloc-users mailing list

hwloc-users_at_[hidden]

http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users