Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] hwloc_get_latency() failures and confusion
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2012-08-06 18:08:23


On 06/08/2012 23:47, Wheeler, Kyle Bruce wrote:
> Hello,
>
> I'm failing to understand what hwloc (v1.5) is doing. I'm trying to use hwloc_get_latency() to determine the distance between two cores.
>
> The two cores are on different sockets. According to libnuma's numactl, the latency between the two sockets is 20, whereas between cores on the same socket is 10. According to hwloc-ls -v, the latency is 2.0, whereas between cores on the same socket is 1.0. Thus, I know that hwloc is getting topology information.
>
> However, programmatically, hwloc_get_latency() just returns -1. I tried using hwloc_get_whole_distance_matrix_by_depth(), and found that the distance matrix is only defined for depth 0

Hello Kyle,
The distance/latency API is indeed difficult to understand because it
tries to be (too) generic.
You should not be getting a distance matrix for depth 0. You should get
one for depth 1 (the depth of the NUMA nodes in your topology).

> which, according to hwloc_obj_type_string(hwloc_get_depth_type(topology, 0)) is "Machine". Now, the documentation for hwloc_get_whole_distance_matrix_by_depth() says it returns "a distances structure containing a matrix with all distances between all objects at the given depth". Given that I only have one object at depth 0 (just the one machine), what does this mean? If I try with depth 1 (aka "NUMANode" or HWLOC_OBJ_NODE), I get NULL back, suggesting that there is no matrix of distances between NUMANodes. Of course, that's not true; hwloc-ls reports that matrix! So what's going on here?

hwloc-ls uses hwloc_get_whole_distance_matrix_by_depth():

    for (depth = 0; depth < topodepth; depth++) {
      distances = hwloc_get_whole_distance_matrix_by_depth(topology, depth);
      if (!distances || !distances->latency)
        continue;
      printf("latency matrix between %ss (depth %u) by %s indexes:\n",
             hwloc_obj_type_string(hwloc_get_depth_type(topology, depth)),
             depth,
             logical ? "logical" : "physical");
      hwloc_utils_print_distance_matrix(topology, hwloc_get_root_obj(topology), distances->nbobjs, depth, distances->latency, logical);
    }

So I don't see how you could be seeing something different. Can you send
your code and your XML topology?

> I would add that the hwloc_distances_s returned by hwloc_get_whole_distance_matrix_by_depth(topology, 0) is: { 0, 0, 0x0, 0, 0 }

That's strange, I need to look at this.

> And why is hwloc_get_latency() failing?

If you pass Core objects to hwloc_get_latency(), it is expected to fail
because the topology only has latencies between NUMA nodes. You should
walk up the parent links of each object until you reach a NUMA node
object, then ask for the latency between those two NUMA nodes.
We've been thinking of handling this case inside hwloc, but we're not
sure it's always a good idea to do so.
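
The walk-up described above can be sketched as follows. This is only a
minimal illustration under stated assumptions: struct obj, struct
latency_matrix, ancestor_of_type() and latency_between() are hypothetical
stand-ins written for this example, not hwloc's real types or functions.
With the real library, the same walk would follow the parent pointer of
each hwloc_obj_t until a HWLOC_OBJ_NODE object is reached, and the two
resulting objects would then be passed to hwloc_get_latency().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for illustration only; these are NOT hwloc's structures. */
enum obj_type { OBJ_MACHINE, OBJ_NUMANODE, OBJ_SOCKET, OBJ_CORE };

struct obj {
    enum obj_type type;
    unsigned logical_index;   /* index among objects of the same type */
    struct obj *parent;       /* NULL for the root object */
};

/* Row-major latency matrix between NUMA nodes (assumed layout). */
struct latency_matrix {
    unsigned nbobjs;          /* number of NUMA nodes */
    const float *latency;     /* nbobjs * nbobjs entries */
};

/* Walk up the parent links until an object of the wanted type is found.
 * Returns NULL if the root is passed without finding one. */
static struct obj *ancestor_of_type(struct obj *o, enum obj_type wanted)
{
    while (o != NULL && o->type != wanted)
        o = o->parent;
    return o;
}

/* Returns 0 and fills *out on success, or -1 if either object has no
 * NUMA-node ancestor covered by the matrix. */
static int latency_between(const struct latency_matrix *m,
                           struct obj *a, struct obj *b, float *out)
{
    assert(m != NULL && out != NULL);

    struct obj *na = ancestor_of_type(a, OBJ_NUMANODE);
    struct obj *nb = ancestor_of_type(b, OBJ_NUMANODE);
    if (na == NULL || nb == NULL)
        return -1;
    if (na->logical_index >= m->nbobjs || nb->logical_index >= m->nbobjs)
        return -1;

    *out = m->latency[na->logical_index * m->nbobjs + nb->logical_index];
    return 0;
}
```

With a two-node matrix holding 1.0 on the diagonal and 2.0 elsewhere (the
values hwloc-ls reports in your case), two cores under different NUMA
nodes would yield 2.0 and two cores under the same node would yield 1.0.
Passing the Machine object returns -1, because it has no NUMA-node
ancestor.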

We have several tickets open against the distance code. We know it's not
perfect so we'll be happy to hear your feedback. There are so many
things involved in this case that it's hard to figure out what's
actually important to users.

Brice