Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] misleading cache size on AMD Opteron 6348?
From: Yury Vorobyov (teupollam_at_[hidden])
Date: 2014-04-01 09:47:25


The current BIOS version could be improperly detecting the CPUs, which are
engineering samples of the 6348 (all characteristics are the same).

On Tue, Apr 1, 2014 at 6:59 PM, Yury Vorobyov <teupollam_at_[hidden]> wrote:

> The BIOS is at the latest version. If I should check some BIOS information,
> I have access to the hardware. Tell me which SMBIOS variables you want to see.
>
>
> On Fri, Jan 31, 2014 at 1:07 PM, Brice Goglin <Brice.Goglin_at_[hidden]> wrote:
>
>> Hello,
>>
>> Your BIOS reports invalid L3 cache information. On these processors, each
>> L3 is shared by 6 cores: it covers the 6 cores of an entire half-socket
>> NUMA node. But the BIOS says that some L3s are shared by 4 cores and others
>> by 6 cores. Worse, it says that some L3s are shared by cores from one NUMA
>> node and cores from another NUMA node, which causes the error message
>> (these L3s cannot be inserted in the topology).
>>
>> I see "AMD Eng Sample, ZS268145TCG54_32/26/20_2/16" in the processor
>> type, which might explain why your BIOS is somewhat experimental. See if
>> you can upgrade it.
>>
>> Also make sure your kernel isn't too old in case it misses L3 info for
>> these processors. At least 3.3 should be OK iirc.
>>
>> NUMA node sharing info:
>> $ cat sys/devices/system/node/node*/cpumap
>> 00000000,0000003f
>> 00000000,00000fc0
>> 00000000,0003f000
>> 00000000,00fc0000
>> $ cat sys/devices/system/cpu/cpu{?,??}/cache/index3/shared_cpu_map
>> 00000000,0000000f << wrong, should be 0000003f
>> 00000000,0000000f << wrong, should be 0000003f
>> 00000000,0000000f << wrong, should be 0000003f
>> 00000000,0000000f << wrong, should be 0000003f
>> 00000000,000003f0 << impossible, should be 0000003f
>> 00000000,000003f0 << impossible, should be 0000003f
>> 00000000,000003f0 << impossible, should be 00000fc0
>> 00000000,000003f0 << impossible, should be 00000fc0
>> 00000000,000003f0 << impossible, should be 00000fc0
>> 00000000,000003f0 << impossible, should be 00000fc0
>> 00000000,00000c00 << wrong, should be 00000fc0
>> 00000000,00000c00 << wrong, should be 00000fc0
>> 00000000,00003000 << wrong, should be 0003f000
>> 00000000,00003000 << wrong, should be 0003f000
>> 00000000,000fc000 << impossible, should be 0003f000
>> 00000000,000fc000 << impossible, should be 0003f000
>> 00000000,000fc000 << impossible, should be 0003f000
>> 00000000,000fc000 << impossible, should be 0003f000
>> 00000000,000fc000 << impossible, should be 00fc0000
>> 00000000,000fc000 << impossible, should be 00fc0000
>> 00000000,00f00000 << wrong, should be 00fc0000
>> 00000000,00f00000 << wrong, should be 00fc0000
>> 00000000,00f00000 << wrong, should be 00fc0000
>> 00000000,00f00000 << wrong, should be 00fc0000
>>
>> Brice
>>
>>
>>
>> On 31/01/2014 03:46, Yury Vorobyov wrote:
>>
>> I got an error about "intersecting caches".
>>
>> The info from hwloc is in the attachments.
>>
>> I have never seen this before. I use "live" builds of OpenMPI directly
>> from the repo.
>>
>>
>> _______________________________________________
>> hwloc-users mailing list
>> hwloc-users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
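
The manual comparison in Brice's message (each L3 `shared_cpu_map` checked against the NUMA node `cpumap` values) can be sketched as a small script. This is only an illustration, not part of hwloc: the function names `parse_cpumask` and `check_l3_maps` are made up here, the masks are copied from the listing in the thread, and the script assumes, as stated for these processors, that each L3 should cover exactly one NUMA node and that the n-th `shared_cpu_map` line belongs to CPU n.

```python
# Masks copied from the thread: one line per NUMA node, one line per CPU.
NODE_MAPS = [
    "00000000,0000003f",
    "00000000,00000fc0",
    "00000000,0003f000",
    "00000000,00fc0000",
]
L3_MAPS = (
    ["00000000,0000000f"] * 4
    + ["00000000,000003f0"] * 6
    + ["00000000,00000c00"] * 2
    + ["00000000,00003000"] * 2
    + ["00000000,000fc000"] * 6
    + ["00000000,00f00000"] * 4
)


def parse_cpumask(text):
    """Turn a sysfs-style mask such as '00000000,0000003f' into a set of CPU indices."""
    value = int(text.strip().replace(",", ""), 16)
    return {bit for bit in range(value.bit_length()) if (value >> bit) & 1}


def check_l3_maps(node_maps, l3_maps):
    """Classify the L3 map reported for each CPU.

    Returns a list of (cpu, verdict, node_cpus) tuples, where verdict is:
      'ok'         - the L3 covers exactly the CPU's NUMA node (assumed correct),
      'wrong'      - the L3 stays inside the node but does not cover all of it,
      'impossible' - the L3 includes CPUs from outside the CPU's NUMA node.
    """
    nodes = [parse_cpumask(m) for m in node_maps]
    results = []
    for cpu, text in enumerate(l3_maps):
        shared = parse_cpumask(text)
        home = next(n for n in nodes if cpu in n)  # NUMA node containing this CPU
        if shared == home:
            verdict = "ok"
        elif shared <= home:
            verdict = "wrong"
        else:
            verdict = "impossible"
        results.append((cpu, verdict, home))
    return results


if __name__ == "__main__":
    for cpu, verdict, home in check_l3_maps(NODE_MAPS, L3_MAPS):
        print(f"cpu{cpu}: {verdict} (NUMA node CPUs: {sorted(home)})")
```

On a live Linux system the same masks could be read from `/sys/devices/system/node/node*/cpumap` and `/sys/devices/system/cpu/cpu*/cache/index3/shared_cpu_map` instead of being hard-coded, taking care to order the per-CPU files by CPU number.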