Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] [hwloc-svn] svn:hwloc r3709
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2011-08-29 04:12:00


Some notes about this commit:

The interface already supports having multiple distance matrices for the
same type. For instance, if you have distances between NUMA nodes of a
single machine, once you assemble multiple machines, you get multiple
distance submatrices. However, loading from XML is the only way to get
multiple submatrices for the same type.

Since v1.3, hwloc_topology_set_distance_matrix() will replace the
previous matrix for the given type (or remove it if passing a NULL
matrix). So if you want to set multiple submatrices, you have to
assemble them into a global matrix. This shouldn't be a problem in most
cases, except when the physical indexes of the submatrices collide
(distances between NUMA nodes #0 and #1 of two machines become distances
between NUMA nodes #0, #1, #0 and #1, which does not work). To work
around this, people should add distances before aggregating multiple
machines into a single topology. I don't think that's a very
big deal.
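
A minimal sketch of the "assemble into a global matrix" step, for
illustration only. It does not call hwloc: assemble_global_matrix(),
has_index_collision() and the `remote` fill value for cross-machine
entries are hypothetical names and assumptions, not part of the hwloc
API. It assumes row-major float matrices, with the distance from object
i to object j stored at i*nbobjs+j, as in the hwloc.h comment quoted
below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if any physical index appears twice
 * in indexes[0..n-1], 0 otherwise. A collision means the submatrices
 * cannot be merged into one matrix keyed by physical index. */
int has_index_collision(const unsigned *indexes, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (indexes[i] == indexes[j])
                return 1;
    return 0;
}

/* Hypothetical helper: merge two square row-major submatrices
 * a (na x na) and b (nb x nb) into out ((na+nb) x (na+nb)).
 * Entries between objects of different machines are unknown here,
 * so they are filled with the caller-chosen value `remote`
 * (an assumption of this sketch). */
void assemble_global_matrix(const float *a, size_t na,
                            const float *b, size_t nb,
                            float remote, float *out)
{
    size_t n = na + nb;

    assert(a != NULL && b != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float v = remote;
            if (i < na && j < na)
                v = a[i * na + j];               /* block of machine A */
            else if (i >= na && j >= na)
                v = b[(i - na) * nb + (j - na)]; /* block of machine B */
            out[i * n + j] = v;
        }
    }
}
```

With two machines whose NUMA nodes are both numbered #0 and #1, the
concatenated index array is {0, 1, 0, 1} and has_index_collision()
reports the problem described above. The merged matrix is only
meaningful once the indexes are distinct.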

We could:
* Allow setting multiple submatrices with
hwloc_topology_set_distance_matrix(). We wouldn't "replace" anymore. And
giving a NULL matrix would remove all existing matrices for the given
type. Changing this behavior should be done *before v1.3 final* because
the replace/remove features were added in v1.3.
* Add a set_distances() variant taking logical indexes. Can be added
later easily. Would need careful documentation because adding a distance
matrix can cause grouping which would change logical indexes.

I am rethinking all this because I am looking at adding throughput
matrices. So we may have multiple matrices per type anyway, but they
will contain different types of information. If we're adding a new
set_distances() variant, we may add a parameter specifying whether the
values are latencies or throughputs for instance.
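
A rough sketch of why such a parameter would matter, for illustration
only. The enum distance_kind, struct typed_matrix and closest_peer()
below are hypothetical and do not exist in hwloc. The point is that the
same row-major matrix must be read in opposite directions depending on
its kind: a lower latency means "closer", while a higher throughput
means "closer".

```c
#include <stddef.h>

/* Hypothetical tag saying what the matrix values mean. */
enum distance_kind {
    DISTANCE_KIND_LATENCY,    /* lower value = closer */
    DISTANCE_KIND_THROUGHPUT  /* higher value = closer */
};

/* Hypothetical container: a square row-major matrix plus its kind. */
struct typed_matrix {
    enum distance_kind kind;
    size_t nbobjs;
    const float *values;      /* value from i to j at i*nbobjs+j */
};

/* Return the index of the object closest to `from`, excluding `from`
 * itself. Returns `from` if there is no other object. */
size_t closest_peer(const struct typed_matrix *m, size_t from)
{
    size_t best = from;

    for (size_t j = 0; j < m->nbobjs; j++) {
        if (j == from)
            continue;
        if (best == from) {   /* first candidate seen */
            best = j;
            continue;
        }
        float v = m->values[from * m->nbobjs + j];
        float cur = m->values[from * m->nbobjs + best];
        int better = (m->kind == DISTANCE_KIND_LATENCY) ? (v < cur)
                                                        : (v > cur);
        if (better)
            best = j;
    }
    return best;
}
```

Running closest_peer() on the same values with the two kinds can give
different answers, which is why a caller would need to know which kind
of matrix it is looking at.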

Brice

On 29/08/2011 09:50, bgoglin_at_[hidden] wrote:
> Author: bgoglin
> Date: 2011-08-29 03:50:07 EDT (Mon, 29 Aug 2011)
> New Revision: 3709
> URL: https://svn.open-mpi.org/trac/hwloc/changeset/3709
>
> Log:
> Clarify distances doc (and make it more future proof)
> Text files modified:
> trunk/include/hwloc.h | 9 +++++----
> 1 files changed, 5 insertions(+), 4 deletions(-)
>
> Modified: trunk/include/hwloc.h
> ==============================================================================
> --- trunk/include/hwloc.h (original)
> +++ trunk/include/hwloc.h 2011-08-29 03:50:07 EDT (Mon, 29 Aug 2011)
> @@ -482,12 +482,12 @@
> * containing object is the root object of the topology, then the
> * distances are available for all objects in the machine.
> *
> - * The distance may be a memory latency, as defined by the ACPI SLIT
> - * specification. If so, the \p latency pointer will not be \c NULL
> - * and the pointed array will contain non-zero values.
> + * If the \p latency pointer is not \c NULL, the pointed array contains
> + * memory latencies (non-zero values), as defined by the ACPI SLIT
> + * specification.
> *
> * In the future, some other types of distances may be considered.
> - * In these cases, \p latency will be \c NULL.
> + * In these cases, \p latency may be \c NULL.
> */
> struct hwloc_distances_s {
> unsigned relative_depth; /**< \brief Relative depth of the considered objects
> @@ -780,6 +780,7 @@
> * array. The \p distances matrix follows the same order.
> * The distance from object i to object j in the i*nbobjs+j.
> *
> + * A single latency matrix may be defined for each type.
> * If another distance matrix already exists for the given type,
> * either because the user specified it or because the OS offers it,
> * it will be replaced by the given one.
> _______________________________________________
> hwloc-svn mailing list
> hwloc-svn_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-svn