Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] hwloc errors on program startup
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2014-01-17 12:40:28


Hello,

Linux says socket 0 contains processors 0-7 and socket 1 contains 8-15,
while NUMA node 0 contains processors 0-3+8-11 and NUMA node 1 contains
processors 4-7+12-15. Each NUMA node therefore overlaps both sockets
without being contained in either one, which is what hwloc reports as
"intersection without inclusion". Given what I read about the Opteron
6320 online, the problem is that NUMA 0 should be replaced with two NUMA
nodes with processors 0-3 on one side and 8-11 on the other side, and
NUMA 1 should be replaced with two NUMA nodes with processors 4-7 and
12-15 respectively.
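
To make the overlap concrete, here is a minimal sketch in Python that
treats each cpuset as a bitmask and classifies how two sets relate. It
is only an illustration of the idea, not hwloc's actual code; the helper
names cpuset() and relation() are made up for this example, and the
processor ranges are the ones described above.

```python
def cpuset(*ranges):
    """Build a bitmask from inclusive (first, last) processor ranges."""
    mask = 0
    for first, last in ranges:
        for cpu in range(first, last + 1):
            mask |= 1 << cpu
    return mask


def relation(a, b):
    """Classify how two cpusets (bitmasks) relate to each other."""
    common = a & b
    if common == 0:
        return "disjoint"
    if common == a or common == b:
        return "inclusion"
    return "intersection without inclusion"


# What Linux reports on this machine (hypothetical variable names).
sockets = {0: cpuset((0, 7)), 1: cpuset((8, 15))}
numa_reported = {0: cpuset((0, 3), (8, 11)), 1: cpuset((4, 7), (12, 15))}

# What a 4-node layout would look like instead.
numa_expected = [cpuset((0, 3)), cpuset((8, 11)),
                 cpuset((4, 7)), cpuset((12, 15))]

for s_id, s_mask in sockets.items():
    for n_id, n_mask in numa_reported.items():
        print(f"socket {s_id} (0x{s_mask:08x}) vs NUMA {n_id} "
              f"(0x{n_mask:08x}): {relation(s_mask, n_mask)}")

# With four NUMA nodes, every node fits inside exactly one socket.
for n_mask in numa_expected:
    print([relation(s_mask, n_mask) for s_mask in sockets.values()])
```

Socket 0 comes out as 0x000000ff here, which matches the cpuset printed
in the error message quoted below. With the reported two-node layout
every socket/node pair partially overlaps; with the four-node layout
each pair is either nested or disjoint, so a tree can be built.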

Your grep for SRAT confirms that there are 4 NUMA nodes in the machine,
and the APIC numbers seem correctly associated.
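
If you want to check this on other nodes, a small script along these
lines could group the APIC IDs by node from the dmesg output. This is a
rough sketch: the regular expression is my assumption based only on the
line format in your message, the function name apics_by_node() is
hypothetical, and the sample is an excerpt of the lines you posted.

```python
import re
from collections import defaultdict

# Matches processor-affinity lines such as
# "SRAT: PXM 0 -> APIC 32 -> Node 0"; memory-range lines are skipped.
SRAT_CPU = re.compile(r"SRAT: PXM (\d+) -> APIC (\d+) -> Node (\d+)")


def apics_by_node(dmesg_text):
    """Return {node: [APIC ids]} for the SRAT processor lines found."""
    nodes = defaultdict(list)
    for match in SRAT_CPU.finditer(dmesg_text):
        _pxm, apic, node = (int(g) for g in match.groups())
        nodes[node].append(apic)
    return dict(nodes)


# Excerpt of the quoted dmesg output (first and last APIC of each node).
sample = """\
SRAT: PXM 0 -> APIC 32 -> Node 0
SRAT: PXM 0 -> APIC 35 -> Node 0
SRAT: PXM 1 -> APIC 36 -> Node 1
SRAT: PXM 1 -> APIC 39 -> Node 1
SRAT: PXM 2 -> APIC 64 -> Node 2
SRAT: PXM 2 -> APIC 67 -> Node 2
SRAT: PXM 3 -> APIC 68 -> Node 3
SRAT: PXM 3 -> APIC 71 -> Node 3
SRAT: Node 0 PXM 0 0-a0000
"""

result = apics_by_node(sample)
print(len(result), "NUMA nodes with processors:", result)
```

On a real node you would feed it the full output of "dmesg | grep SRAT"
instead of the excerpt.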

Unfortunately, it looks like you already have the latest BIOS revision
(1.01):
http://www.tyan.com/support_download_bios.aspx?model=B.YR190B8238
Is there any kernel update available from RHEL6 for your machine?

Brice

On 17/01/2014 17:11, Doug Roberts wrote:
>
> 1) We are getting hwloc topology errors when programs startup on
> some new compute nodes added into our cluster recently ...
>
> [roberpj_at_bro127:~/samples/mpi_test]
> /opt/sharcnet/openmpi/1.6.5/intel/bin/mpirun -np 2 --mca btl
> tcp,sm,self --host bro127,bro127 ./a.out
> ****************************************************************************
>
> * Hwloc has encountered what looks like an error from the operating
> system.
> *
> * object intersection without inclusion!
> * Error occurred in topology.c line 594
> *
> * Please report this error message to the hwloc user's mailing list,
> * along with the output from the hwloc-gather-topology.sh script.
> ****************************************************************************
>
> Number of processes = 2
> Test repeated 3 times for reliability
> I am process 0 on node bro127
> Run 1 of 3
> P0: Sending to P1
> I am process 1 on node bro127
> P1: Waiting to receive from to P0
> P0: Waiting to receive from P1
> P0: Received from to P1
> Run 2 of 3
> P0: Sending to P1
> P0: Waiting to receive from P1
> P0: Received from to P1
> Run 3 of 3
> P0: Sending to P1
> P0: Waiting to receive from P1
> P0: Received from to P1
> P0: Done
> P1: Sending to to P0
> P1: Waiting to receive from to P0
> P1: Sending to to P0
> P1: Waiting to receive from to P0
> P1: Sending to to P0
> P1: Done
>
> 2) I've run hwloc-gather-topology.sh and attached bro127.tar.bz2 ...
>
> [roberpj_at_bro127:~/samples/hwloc-gather-topology]
> /home/roberpj/builds/hwloc/1.7.2/1.7.2-debug/bin/hwloc-gather-topology
> $(uname -n)
> Hierarchy gathered in ./bro127.tar.bz2 and kept in
> /tmp/tmp.Fr37QhvDGD/bro127/
> ****************************************************************************
>
> * Hwloc has encountered what looks like an error from the operating
> system.
> *
> * object (Socket P#0 cpuset 0x000000ff) intersection without inclusion!
> * Error occurred in topology.c line 718
> *
> * Please report this error message to the hwloc user's mailing list,
> * along with the output from the hwloc-gather-topology.sh script.
> ****************************************************************************
>
> Expected topology output stored in ./bro127.output
>
> [roberpj_at_bro127:~/samples/hwloc-gather-topology] cat bro127.output
> Machine (P#0 total=67106040KB DMIProductName=empty
> DMIProductVersion=empty DMIBoardVendor="TYAN Computer Corporation"
> DMIBoardName=YR190-B8238 DMIBoardVersion=empty DMIBoardAssetTag=empty
> DMIChassisVendor=empty DMIChassisType=3 DMIChassisVersion=empty
> DMIChassisAssetTag=empty DMIBIOSVendor="American Megatrends Inc."
> DMIBIOSVersion='V1.01.B10' DMIBIOSDate=09/26/2011 DMISysVendor=empty
> Backend=Linux LinuxCgroup=/)
>   NUMANode L#0 (P#0 local=33551608KB total=33551608KB)
>     L3Cache L#0 (size=6144KB linesize=64 ways=64)
>       L2Cache L#0 (size=2048KB linesize=64 ways=16)
>         L1iCache L#0 (size=64KB linesize=64 ways=2)
>           L1dCache L#0 (size=16KB linesize=64 ways=4)
>             Core L#0 (P#0)
>               PU L#0 (P#0)
>           L1dCache L#1 (size=16KB linesize=64 ways=4)
>             Core L#1 (P#1)
>               PU L#1 (P#1)
>       L2Cache L#1 (size=2048KB linesize=64 ways=16)
>         L1iCache L#1 (size=64KB linesize=64 ways=2)
>           L1dCache L#2 (size=16KB linesize=64 ways=4)
>             Core L#2 (P#2)
>               PU L#2 (P#2)
>           L1dCache L#3 (size=16KB linesize=64 ways=4)
>             Core L#3 (P#3)
>               PU L#3 (P#3)
>     L3Cache L#1 (size=6144KB linesize=64 ways=64)
>       L2Cache L#2 (size=2048KB linesize=64 ways=16)
>         L1iCache L#2 (size=64KB linesize=64 ways=2)
>           L1dCache L#4 (size=16KB linesize=64 ways=4)
>             Core L#4 (P#0)
>               PU L#4 (P#8)
>           L1dCache L#5 (size=16KB linesize=64 ways=4)
>             Core L#5 (P#1)
>               PU L#5 (P#9)
>       L2Cache L#3 (size=2048KB linesize=64 ways=16)
>         L1iCache L#3 (size=64KB linesize=64 ways=2)
>           L1dCache L#6 (size=16KB linesize=64 ways=4)
>             Core L#6 (P#2)
>               PU L#6 (P#10)
>           L1dCache L#7 (size=16KB linesize=64 ways=4)
>             Core L#7 (P#3)
>               PU L#7 (P#11)
>   NUMANode L#1 (P#1 local=33554432KB total=33554432KB)
>     L3Cache L#2 (size=6144KB linesize=64 ways=64)
>       L2Cache L#4 (size=2048KB linesize=64 ways=16)
>         L1iCache L#4 (size=64KB linesize=64 ways=2)
>           L1dCache L#8 (size=16KB linesize=64 ways=4)
>             Core L#8 (P#0)
>               PU L#8 (P#4)
>           L1dCache L#9 (size=16KB linesize=64 ways=4)
>             Core L#9 (P#1)
>               PU L#9 (P#5)
>       L2Cache L#5 (size=2048KB linesize=64 ways=16)
>         L1iCache L#5 (size=64KB linesize=64 ways=2)
>           L1dCache L#10 (size=16KB linesize=64 ways=4)
>             Core L#10 (P#2)
>               PU L#10 (P#6)
>           L1dCache L#11 (size=16KB linesize=64 ways=4)
>             Core L#11 (P#3)
>               PU L#11 (P#7)
>     L3Cache L#3 (size=6144KB linesize=64 ways=64)
>       L2Cache L#6 (size=2048KB linesize=64 ways=16)
>         L1iCache L#6 (size=64KB linesize=64 ways=2)
>           L1dCache L#12 (size=16KB linesize=64 ways=4)
>             Core L#12 (P#0)
>               PU L#12 (P#12)
>           L1dCache L#13 (size=16KB linesize=64 ways=4)
>             Core L#13 (P#1)
>               PU L#13 (P#13)
>       L2Cache L#7 (size=2048KB linesize=64 ways=16)
>         L1iCache L#7 (size=64KB linesize=64 ways=2)
>           L1dCache L#14 (size=16KB linesize=64 ways=4)
>             Core L#14 (P#2)
>               PU L#14 (P#14)
>           L1dCache L#15 (size=16KB linesize=64 ways=4)
>             Core L#15 (P#3)
>               PU L#15 (P#15)
> depth 0: 1 Machine (type #1)
> depth 1: 2 NUMANode (type #2)
> depth 2: 4 L3Cache (type #4)
> depth 3: 8 L2Cache (type #4)
> depth 4: 8 L1iCache (type #4)
> depth 5: 16 L1dCache (type #4)
> depth 6: 16 Core (type #5)
> depth 7: 16 PU (type #6)
> latency matrix between NUMANodes (depth 1) by logical indexes:
> index 0 1
> 0 1.000 1.600
> 1 1.600 1.000
> Topology not from this system
>
> 3) SRAT dmesg output was mentioned in another similar ticket
> http://www.open-mpi.org/community/lists/hwloc-users/2012/05/0639.php
> so I am including ours here also ...
>
> [roberpj_at_bro127:~] dmesg | grep SRAT
> ACPI: SRAT 00000000dfdba570 001D0 (v02 AMD AGESA 00000001 AMD
> 00000001)
> SRAT: PXM 0 -> APIC 32 -> Node 0
> SRAT: PXM 0 -> APIC 33 -> Node 0
> SRAT: PXM 0 -> APIC 34 -> Node 0
> SRAT: PXM 0 -> APIC 35 -> Node 0
> SRAT: PXM 1 -> APIC 36 -> Node 1
> SRAT: PXM 1 -> APIC 37 -> Node 1
> SRAT: PXM 1 -> APIC 38 -> Node 1
> SRAT: PXM 1 -> APIC 39 -> Node 1
> SRAT: PXM 2 -> APIC 64 -> Node 2
> SRAT: PXM 2 -> APIC 65 -> Node 2
> SRAT: PXM 2 -> APIC 66 -> Node 2
> SRAT: PXM 2 -> APIC 67 -> Node 2
> SRAT: PXM 3 -> APIC 68 -> Node 3
> SRAT: PXM 3 -> APIC 69 -> Node 3
> SRAT: PXM 3 -> APIC 70 -> Node 3
> SRAT: PXM 3 -> APIC 71 -> Node 3
> SRAT: Node 0 PXM 0 0-a0000
> SRAT: Node 0 PXM 0 100000-e0000000
> SRAT: Node 0 PXM 0 100000000-820000000
> SRAT: Node 1 PXM 1 820000000-1020000000
>
> 4) Note the nodes have a 10GE interface on eth2 ...
>
> [root_at_bro127:~] nano /var/log/messages (snip)
> Jan 15 16:03:55 bro127 kernel: ADDRCONF(NETDEV_UP): eth2: link is not
> ready
> Jan 15 16:03:55 bro127 kernel: ixgbe 0000:04:00.0: eth2: changing MTU
> from 1500 to 8000
> Jan 15 16:03:55 bro127 kernel: ixgbe 0000:04:00.0: eth2: detected SFP+: 3
> Jan 15 16:03:55 bro127 kernel: SoftIWARP attached
> Jan 15 16:03:55 bro127 kernel: ixgbe 0000:04:00.0: eth2: detected SFP+: 3
> Jan 15 16:03:55 bro127 kernel: ixgbe 0000:04:00.0: eth2: NIC Link is
> Up 10 Gbps, Flow Control: RX/TX
> Jan 15 16:03:55 bro127 kernel: ADDRCONF(NETDEV_CHANGE): eth2: link
> becomes ready
>
> [roberpj_at_bro127:~] modinfo ixgbe
> filename:
> /lib/modules/2.6.32-279.5.2.el6.x86_64/kernel/drivers/net/ixgbe/ixgbe.ko
> version: 3.6.7-k
> license: GPL
> description: Intel(R) 10 Gigabit PCI Express Network Driver
> author: Intel Corporation, <linux.nics_at_[hidden]>
> srcversion: EC64C3345C7AC6AB4BD6F5C
> alias: pci: v00008086d0000154Asv*sd*bc*sc*i*
> alias: pci: v00008086d00001557sv*sd*bc*sc*i*
> alias: pci: v00008086d0000154Fsv*sd*bc*sc*i*
> alias: pci: v00008086d0000154Dsv*sd*bc*sc*i*
> alias: pci: v00008086d00001528sv*sd*bc*sc*i*
> alias: pci: v00008086d000010F8sv*sd*bc*sc*i*
> alias: pci: v00008086d0000151Csv*sd*bc*sc*i*
> alias: pci: v00008086d00001529sv*sd*bc*sc*i*
> alias: pci: v00008086d0000152Asv*sd*bc*sc*i*
> alias: pci: v00008086d000010F9sv*sd*bc*sc*i*
> alias: pci: v00008086d00001514sv*sd*bc*sc*i*
> alias: pci: v00008086d00001507sv*sd*bc*sc*i*
> alias: pci: v00008086d000010FBsv*sd*bc*sc*i*
> alias: pci: v00008086d00001517sv*sd*bc*sc*i*
> alias: pci: v00008086d000010FCsv*sd*bc*sc*i*
> alias: pci: v00008086d000010F7sv*sd*bc*sc*i*
> alias: pci: v00008086d00001508sv*sd*bc*sc*i*
> alias: pci: v00008086d000010DBsv*sd*bc*sc*i*
> alias: pci: v00008086d000010F4sv*sd*bc*sc*i*
> alias: pci: v00008086d000010E1sv*sd*bc*sc*i*
> alias: pci: v00008086d000010F1sv*sd*bc*sc*i*
> alias: pci: v00008086d000010ECsv*sd*bc*sc*i*
> alias: pci: v00008086d000010DDsv*sd*bc*sc*i*
> alias: pci: v00008086d0000150Bsv*sd*bc*sc*i*
> alias: pci: v00008086d000010C8sv*sd*bc*sc*i*
> alias: pci: v00008086d000010C7sv*sd*bc*sc*i*
> alias: pci: v00008086d000010C6sv*sd*bc*sc*i*
> alias: pci: v00008086d000010B6sv*sd*bc*sc*i*
> depends: mdio,dca
> vermagic: 2.6.32-279.5.2.el6.x86_64 SMP mod_unload modversions
> parm: IntMode:Change Interrupt Mode (0=Legacy, 1=MSI,
> 2=MSI-X), default 2 (array of int)
> parm: FdirMode:Flow Director filtering modes (0=Off,
> 1=Hashing) default 1 (array of int)
> parm: max_vfs:Maximum number of virtual functions to
> allocate per physical function (uint)
> parm: allow_unsupported_sfp:Allow unsupported and untested
> SFP+ modules on 82599-based adapters (uint)
>
>