Open MPI logo

Hardware Locality Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Hardware Locality Development mailing list

Subject: Re: [hwloc-devel] hwloc powerpc rhel5 and power7 patch
From: Alexey Kardashevskiy (aik_at_[hidden])
Date: 2010-09-21 04:53:58


On 21/09/10 02:01, Samuel Thibault wrote:
>
>> +static int
>> +look_powerpc_device_tree_discover_cache(device_tree_cpus_t *cpus,
>> + uint32_t ibm_phandle, unsigned int *level, hwloc_cpuset_t cpuset)
>> +{
>> + int ret = -1;
>> + if ((NULL == level) || (NULL == cpuset))
>> + return ret;
>> + for (unsigned int i = 0; i< cpus->n; ++i) {
>> + if (ibm_phandle != cpus->p[i].l2_cache)
>>
> Oh, it's odd that it's still called "l2_cache" for L3 caches above L2,
> too :)
>

"2" here has the meaning "one level higher" :-)

>> + continue;
>> + if (NULL != cpus->p[i].cpuset) {
>> + hwloc_cpuset_or(cpuset, cpuset, cpus->p[i].cpuset);
>> + ret = 0;
>> + } else {
>> + ++(*level);
>> + if (0 == look_powerpc_device_tree_discover_cache(cpus,
>> + cpus->p[i].ibm_phandle, level, cpuset))
>> + ret = 0;
>> + }
>> + }
>> + return ret;
>> +}
>> +
>> +static void
>> +look_powerpc_device_tree(struct hwloc_topology *topology)
>> +{
>> + device_tree_cpus_t cpus = { 0 };
>>
> We assume that the compiler can understand
>
> device_tree_cpus_t cpus = { .n = 0 };
>
> which is much clearer, please use that :)
>

If the compiler can understand { 0 }, I would keep it as having whole
structure filled with zeroes is safer :-)

>> + const char ofroot[] = "/proc/device-tree/cpus";
>> +
>> + int root_fd = topology->backend_params.sysfs.root_fd;
>> + DIR *dt = hwloc_opendir(ofroot, root_fd);
>> + if (NULL == dt)
>> + return;
>> +
>> + struct dirent *dirent;
>> + while (NULL != (dirent = readdir(dt))) {
>> +
>> + char cpu[256];
>> + sprintf(cpu, "%s/%s", ofroot, dirent->d_name);
>> +
>> + char device_type[64];
>> + if (0> hwloc_read_str(cpu, "device_type", device_type, sizeof(device_type), root_fd))
>> + continue;
>> +
>> + uint32_t reg = -1, l2_cache = -1, ibm_phandle = -1;
>> + hwloc_read_uint32(cpu, "reg",&reg, root_fd);
>> + hwloc_read_uint32(cpu, "l2-cache",&l2_cache, root_fd);
>> + hwloc_read_uint32(cpu, "ibm,phandle",&ibm_phandle, root_fd);
>> +
>> + if (0 == strcmp(device_type, "cache")) {
>> + add_device_tree_cpus_node(&cpus, NULL, l2_cache, ibm_phandle, dirent->d_name);
>> + }
>> + else if (0 == strcmp(device_type, "cpu")) {
>> + /* Found CPU */
>> + hwloc_cpuset_t cpuset = NULL;
>> + uint32_t threads[128], nthreads = 0;
>> + nthreads = hwloc_read_raw(cpu, "ibm,ppc-interrupt-server#s",
>> + threads, sizeof(threads), root_fd) / sizeof(threads[0]);
>> + if (0 != nthreads) {
>> + cpuset = hwloc_cpuset_alloc();
>> + for (unsigned int i = 0; i< nthreads; ++i) {
>> + hwloc_cpuset_set(cpuset, threads[i]);
>> + }
>> + } else if ((unsigned int)-1 != reg) {
>> + cpuset = hwloc_cpuset_alloc();
>> + hwloc_cpuset_set(cpuset, reg);
>> + }
>> +
>> + if (NULL == cpuset) {
>> + hwloc_debug("%s has no \"reg\" property, skipping\n", cpu);
>> + continue;
>> + }
>> + add_device_tree_cpus_node(&cpus, cpuset, l2_cache, ibm_phandle, dirent->d_name);
>> +
>> + /* Add core */
>> + struct hwloc_obj *core = hwloc_alloc_setup_object(HWLOC_OBJ_CORE, reg);
>> + core->cpuset = hwloc_cpuset_dup(cpuset);
>> + hwloc_insert_object_by_cpuset(topology, core);
>> +
>> + /* Add socket */
>> + /* -1 - to discuss */
>> + struct hwloc_obj *socket = hwloc_alloc_setup_object(HWLOC_OBJ_SOCKET, -1);
>> + socket->cpuset = hwloc_cpuset_dup(cpuset);
>> + hwloc_insert_object_by_cpuset(topology, socket);
>>
> Mmm, are we really sure that this describes sockets? How would a multicore
> socket look like here? We should not insert sockets if we are not sure
> which cpuset they really have (the principle of hwloc is "never lie, at
> worse don't say anything").
>

My idea was to do as hwloc on RHEL6 and the same hardware does:

   Group0 cpuset 0xffffffff,0xffffffff
     NUMANode#0(local=0KB total=58458112KB) cpuset 0xffffffff nodeset
0x00000001
       Socket cpuset 0x0000000f
         L3Cache(4096KB line=128) cpuset 0x0000000f
           L2Cache(256KB line=128) cpuset 0x0000000f
             L1Cache(32KB line=128) cpuset 0x0000000f
               Core#0 cpuset 0x0000000f
                 PU#0 cpuset 0x00000001
                 PU#1 cpuset 0x00000002
                 PU#2 cpuset 0x00000004
                 PU#3 cpuset 0x00000008

Is that ok if there is numa node, L*cache nodes and no "socket" in between?

>> + /* Add L1 cache */
>> + /* Ignore Instruction caches */
>> +
>> + /* d-cache-block-size - ignore */
>> + /* d-cache-line-size - to read, in bytes */
>> + /* d-cache-sets - ignore */
>> + /* d-cache-size - to read, in bytes */
>> + /* d-tlb-sets - ignore */
>> + /* d-tlb-size - ignore, always 0 on power6 */
>> + /* i-cache-* and i-tlb-* represent instruction cache, ignore */
>> + uint32_t d_cache_line_size = 0, d_cache_size = 0;
>> + if ( (0 != hwloc_read_uint32(cpu, "d-cache-line-size",&d_cache_line_size, root_fd))&&
>> + (0 != hwloc_read_uint32(cpu, "d-cache-size",&d_cache_size, root_fd)) ) {
>> + struct hwloc_obj *cache =
>> + hwloc_alloc_setup_object(HWLOC_OBJ_CACHE, -1);
>> + cache->attr->cache.size = d_cache_size;.
>> + cache->attr->cache.depth = 1;
>> + cache->attr->cache.linesize = d_cache_line_size;
>>
> I would rather create an L1 cache object as soon as any of the cache
> properties is there, and then fill the properties with what is actually
> available.
>

Please explain, I did not get the point. L1 cache properties beside in a
CPU folder while other levels cache properties are stored in their own
folders. And I add L1 cache as soon as I find a CPU which has such
properties.

and I am attaching a new patch :-)