Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups
From: Christopher Samuel (samuel_at_[hidden])
Date: 2012-11-05 21:00:59


On 06/11/12 08:57, Brock Palen wrote:

> OK, more information (I had to build a newer hwloc). In my job today,
> only 2 processes are running at half speed, and they are indeed
> sharing the same core:

We've seen the same occasionally using CentOS5/RHEL5 with jobs running
under Torque with cpusets enabled.

We've never been able to explain it. The most recent case was someone
using a home-compiled version of NAMD; the problem disappeared when
they started using our provided builds.

I was fixing up the affected running jobs by hand, assigning processes
to individual cores on the nodes with cpusets. :-/
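
For anyone wanting to script that kind of manual fix-up, here is a rough
sketch. It is not what I actually ran: it uses Python's
os.sched_setaffinity() (Linux only) to pin processes rather than editing
cpusets, and the helper names plan_pinning and apply_pinning are made up
for illustration. It assumes you already know the PIDs of the job and the
cores the job is allowed to use.

```python
import os


def plan_pinning(pids, allowed_cores):
    """Pair each PID with its own core so that no two PIDs share one.

    Raises ValueError if there are more processes than cores.
    """
    pids = list(pids)
    cores = sorted(allowed_cores)
    if len(pids) > len(cores):
        raise ValueError("more processes than cores available")
    return dict(zip(pids, cores))


def apply_pinning(plan):
    """Pin each PID to its single planned core (Linux only)."""
    for pid, core in plan.items():
        os.sched_setaffinity(pid, {core})


if __name__ == "__main__":
    # Hypothetical demo: pin this process to the lowest-numbered core
    # it is currently allowed to run on.
    me = os.getpid()
    plan = plan_pinning([me], os.sched_getaffinity(me))
    apply_pinning(plan)
    print(plan, os.sched_getaffinity(me))
```

Pinning other users' processes needs appropriate privileges, and the
cores chosen have to lie inside the cpuset the job was given, otherwise
the call fails.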

cheers,
Chris
--
 Christopher Samuel Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: samuel_at_[hidden] Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/ http://twitter.com/vlsci
