Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] hwloc powerpc rhel5 and power7 patch
From: Samuel Thibault (samuel.thibault_at_[hidden])
Date: 2010-09-21 05:15:42


Alexey Kardashevskiy, on Tue 21 Sep 2010 10:54:22 +0200, wrote:
> >Oh, it's odd that it's still called "l2_cache" for L3 caches above L2,
> >too :)
> >
>
> "2" here has the meaning "one level higher" :-)

Sure, that's what I understood too, but it's still odd :)

> >We assume that the compiler can understand
> >
> > device_tree_cpus_t cpus = { .n = 0 };
> >
> >which is much clearer, please use that :)
>
> If the compiler can understand { 0 }, I would keep it as having whole
> structure filled with zeroes is safer :-)

{ .n = 0 } also fills the whole structure with zeroes. It is also better
documentation, since it says explicitly what is supposed to be zeroed
(actually the pointer also needs to be set to NULL, IIRC).
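
To illustrate the point about initializers: in C99, members left out of
an initializer list are initialized as if the object had static storage
duration, so both forms zero the integers and set the pointer to NULL.
A minimal sketch, assuming a hypothetical cpus_sketch_t layout (the real
fields of device_tree_cpus_t are not shown in this thread):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's device_tree_cpus_t;
 * the member names other than n are assumptions. */
typedef struct {
    unsigned n;          /* number of entries in use */
    unsigned allocated;  /* capacity of the array below */
    int *entries;        /* dynamically grown array */
} cpus_sketch_t;

/* True when every member holds its zero value. */
static bool all_zero(const cpus_sketch_t *c)
{
    return c->n == 0 && c->allocated == 0 && c->entries == NULL;
}

void demo_initializers(void)
{
    cpus_sketch_t positional = { 0 };      /* sets the first member */
    cpus_sketch_t designated = { .n = 0 }; /* names the member it sets */

    /* Omitted members are implicitly zero-initialized in both forms. */
    assert(all_zero(&positional));
    assert(all_zero(&designated));
}
```

Neither form says anything about padding bytes, so this is a statement
about members only, not about a memset-style byte-wise zero.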

> >>+ /* Add socket */
> >>+ /* -1 - to discuss */
> >>+ struct hwloc_obj *socket =
> >>hwloc_alloc_setup_object(HWLOC_OBJ_SOCKET, -1);
> >>+ socket->cpuset = hwloc_cpuset_dup(cpuset);
> >>+ hwloc_insert_object_by_cpuset(topology, socket);
> >>
> >Mmm, are we really sure that this describes sockets? What would a multicore
> >socket look like here? We should not insert sockets if we are not sure
> >which cpuset they really have (the principle of hwloc is "never lie, at
> >worst don't say anything").
> >
>
> My idea was to do as hwloc on RHEL6 and the same hardware does:
>
> Group0 cpuset 0xffffffff,0xffffffff
>   NUMANode#0(local=0KB total=58458112KB) cpuset 0xffffffff nodeset 0x00000001
>     Socket cpuset 0x0000000f
>       L3Cache(4096KB line=128) cpuset 0x0000000f
>         L2Cache(256KB line=128) cpuset 0x0000000f
>           L1Cache(32KB line=128) cpuset 0x0000000f
>             Core#0 cpuset 0x0000000f
>               PU#0 cpuset 0x00000001
>               PU#1 cpuset 0x00000002
>               PU#2 cpuset 0x00000004
>               PU#3 cpuset 0x00000008
>
> Is it OK if there is a NUMA node and L* cache nodes, with no "socket" in between?

Yes. If you don't know for sure where sockets are (and your code doesn't
show any knowledge of that), then just don't add socket objects.

> >>+ /* Add L1 cache */
> >>+ /* Ignore Instruction caches */
> >>+
> >>+ /* d-cache-block-size - ignore */
> >>+ /* d-cache-line-size - to read, in bytes */
> >>+ /* d-cache-sets - ignore */
> >>+ /* d-cache-size - to read, in bytes */
> >>+ /* d-tlb-sets - ignore */
> >>+ /* d-tlb-size - ignore, always 0 on power6 */
> >>+ /* i-cache-* and i-tlb-* represent instruction cache, ignore */
> >>+ uint32_t d_cache_line_size = 0, d_cache_size = 0;
> >>+ if ( (0 != hwloc_read_uint32(cpu,
> >>"d-cache-line-size",&d_cache_line_size, root_fd))&&
> >>+ (0 != hwloc_read_uint32(cpu, "d-cache-size",&d_cache_size,
> >>root_fd)) ) {
> >>+ struct hwloc_obj *cache =
> >>+ hwloc_alloc_setup_object(HWLOC_OBJ_CACHE, -1);
> >>+ cache->attr->cache.size = d_cache_size;
> >>+ cache->attr->cache.depth = 1;
> >>+ cache->attr->cache.linesize = d_cache_line_size;
> >>
> >I would rather create an L1 cache object as soon as any of the cache
> >properties is there, and then fill the properties with what is actually
> >available.
> >
>
> Please explain, I did not get the point. L1 cache properties reside in a
> CPU folder while the cache properties of other levels are stored in their
> own folders. And I add the L1 cache as soon as I find a CPU which has such
> properties.

What I mean is replace

      if ( (0 != hwloc_read_uint32(cpu, "d-cache-line-size",&d_cache_line_size, root_fd))&&
          (0 != hwloc_read_uint32(cpu, "d-cache-size",&d_cache_size,

with

      if ( (0 != hwloc_read_uint32(cpu, "d-cache-line-size",&d_cache_line_size, root_fd))||
          (0 != hwloc_read_uint32(cpu, "d-cache-size",&d_cache_size,

(i.e. replace && with ||) and fix the rest accordingly.
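
One possible reading of "fix the rest accordingly", as a self-contained
sketch: with a bare ||, short-circuit evaluation skips the second read
whenever the first succeeds, so both properties are read first and the
object is created if either read worked. Everything here is a
hypothetical stand-in: fake_cpu_node_t, cache_sketch_t and read_prop are
not hwloc types or functions, read_prop's return convention is an
assumption (the convention of hwloc_read_uint32 is not shown in this
thread), and "0 means unknown" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device-tree CPU node: which properties exist, and their values. */
typedef struct {
    bool has_line_size; uint32_t line_size;  /* d-cache-line-size */
    bool has_size;      uint32_t size;       /* d-cache-size */
} fake_cpu_node_t;

/* Hypothetical cache object; 0 in size/linesize stands for "unknown". */
typedef struct {
    bool present;
    unsigned depth;
    uint32_t size;
    uint32_t linesize;
} cache_sketch_t;

/* Assumed convention: returns true and stores the value if the property exists. */
static bool read_prop(bool has, uint32_t value, uint32_t *out)
{
    if (!has)
        return false;
    *out = value;
    return true;
}

cache_sketch_t make_l1_cache(const fake_cpu_node_t *cpu)
{
    cache_sketch_t cache = { .present = false };
    uint32_t line_size = 0, size = 0;

    /* Read both properties unconditionally, before combining the results. */
    bool got_line = read_prop(cpu->has_line_size, cpu->line_size, &line_size);
    bool got_size = read_prop(cpu->has_size, cpu->size, &size);

    /* Create the L1 object as soon as any property is available,
     * filling in only what was actually found. */
    if (got_line || got_size) {
        cache.present = true;
        cache.depth = 1;
        cache.size = size;
        cache.linesize = line_size;
    }
    return cache;
}
```

With this shape, a node that only exposes d-cache-size still yields an
L1 cache object, with linesize left at its "unknown" value.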

Samuel