Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] misleading cache size on AMD Opteron 6348?
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2014-01-31 02:07:11


Hello,

Your BIOS reports invalid L3 cache information. On these processors, each
L3 is shared by 6 cores: it covers exactly the 6 cores of one half-socket
NUMA node. But your BIOS says that some L3 caches are shared by 4 cores and
others by 6. Worse, it says that some L3 caches are shared by cores from
two different NUMA nodes, which causes the error message (and these L3
caches cannot be inserted in the topology).

I see "AMD Eng Sample, ZS268145TCG54_32/26/20_2/16" in the processor
type, which might explain why your BIOS is somewhat experimental. See if
you can upgrade it.

Also make sure your kernel isn't too old, since an old kernel may lack L3
info for these processors. 3.3 or later should be OK, iirc.

NUMA node and L3 sharing info:
$ cat sys/devices/system/node/node*/cpumap
00000000,0000003f
00000000,00000fc0
00000000,0003f000
00000000,00fc0000
$ cat sys/devices/system/cpu/cpu{?,??}/cache/index3/shared_cpu_map
00000000,0000000f << wrong, should be 003f
00000000,0000000f << wrong, should be 003f
00000000,0000000f << wrong, should be 003f
00000000,0000000f << wrong, should be 003f
00000000,000003f0 << impossible, should be 003f
00000000,000003f0 << impossible, should be 003f
00000000,000003f0 << impossible, should be 0fc0
00000000,000003f0 << impossible, should be 0fc0
00000000,000003f0 << impossible, should be 0fc0
00000000,000003f0 << impossible, should be 0fc0
00000000,00000c00 << wrong, should be 0fc0
00000000,00000c00 << wrong, should be 0fc0
00000000,00003000 << wrong, should be 003f000
00000000,00003000 << wrong, should be 003f000
00000000,000fc000 << impossible, should be 003f000
00000000,000fc000 << impossible, should be 003f000
00000000,000fc000 << impossible, should be 003f000
00000000,000fc000 << impossible, should be 003f000
00000000,000fc000 << impossible, should be 0fc0000
00000000,000fc000 << impossible, should be 0fc0000
00000000,00f00000 << wrong, should be 0fc0000
00000000,00f00000 << wrong, should be 0fc0000
00000000,00f00000 << wrong, should be 0fc0000
00000000,00f00000 << wrong, should be 0fc0000
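
The annotations above can be reproduced mechanically. Here is a rough
Python sketch of that check, assuming the rule stated earlier (each L3
should cover exactly the CPUs of its NUMA node). The mask values are
copied from the output above; the function names (parse_mask, classify,
classify_all) are made up for illustration and are not part of hwloc,
which does its own consistency checks differently.

```python
# Sketch: classify each CPU's L3 shared_cpu_map against the NUMA node cpumaps.
# Assumption: on this machine an L3 should match its NUMA node's cpumap exactly.

def parse_mask(text):
    """Parse a sysfs cpumask such as '00000000,0000003f' into an int.

    Each comma-separated group is 32 bits (8 hex digits), so dropping the
    commas yields one plain hexadecimal number.
    """
    return int(text.replace(",", ""), 16)

# NUMA node cpumaps, in node order (node0..node3).
NODES = [parse_mask(m) for m in (
    "00000000,0000003f",
    "00000000,00000fc0",
    "00000000,0003f000",
    "00000000,00fc0000",
)]

# (reported L3 mask, number of consecutive CPUs reporting it), CPUs 0..23.
L3_RUNS = [
    ("00000000,0000000f", 4),
    ("00000000,000003f0", 6),
    ("00000000,00000c00", 2),
    ("00000000,00003000", 2),
    ("00000000,000fc000", 6),
    ("00000000,00f00000", 4),
]

def classify(cpu, l3_mask, nodes):
    """Return (status, expected_mask) for one CPU's reported L3 mask."""
    # The expected L3 mask is the cpumap of the NUMA node containing this CPU.
    expected = next(n for n in nodes if (n >> cpu) & 1)
    if l3_mask == expected:
        status = "ok"
    elif (l3_mask & ~expected) == 0:
        status = "wrong"        # inside a single node, but too small
    else:
        status = "impossible"   # includes CPUs from another NUMA node
    return status, expected

def classify_all():
    """Return (cpu, l3_mask, status, expected_mask) for every CPU."""
    masks = [parse_mask(m) for m, count in L3_RUNS for _ in range(count)]
    return [(cpu, m) + classify(cpu, m, NODES) for cpu, m in enumerate(masks)]

for cpu, mask, status, expected in classify_all():
    print(f"cpu{cpu:<2} L3={mask:#010x} {status:<10} expected={expected:#010x}")
```

The "impossible" entries are the ones that trigger the error message: the
L3 set partially overlaps a NUMA node set without being included in it.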

Brice

On 31/01/2014 03:46, Yury Vorobyov wrote:
> I got an error about "intersecting caches".
>
> Info from hwloc is in the attachments.
>
> I never got this before. I use "live" builds of OpenMPI directly from
> the repo.