Hardware Locality Development Mailing List Archives

Subject: [hwloc-devel] restrict branch
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2011-03-07 11:56:40

On 01/03/2011 11:00, Brice Goglin wrote:
> Also, in 1.2, we'll have a hwloc_topology_restrict() function which will
> let you load the whole machine topology and then restrict it to whatever
> part of it (a part is defined by a hwloc_cpuset_t).

The restrict branch may be ready for merging in the near future. The
interface looks like this:

diff --git a/include/hwloc.h b/include/hwloc.h
index b4ac277..245a780 100644
--- a/include/hwloc.h
+++ b/include/hwloc.h
@@ -783,6 +783,23 @@ HWLOC_DECLSPEC hwloc_obj_t hwloc_topology_insert_misc_object_by_cpuset(hwloc_top
 HWLOC_DECLSPEC hwloc_obj_t hwloc_topology_insert_misc_object_by_parent(hwloc_topology_t topology, hwloc_obj_t parent, const char *name);
+/** \brief Flags to be given to hwloc_topology_restrict(). */
+enum hwloc_restrict_flags_e {
+ /**< \brief Adapt distance matrices according to objects being removed during restriction.
+ * If this flag is not set, distance matrices are removed.
+ */
+/** \brief Restrict the topology to the current thread binding.
+ *
+ * Topology \p topology is modified so as to remove all objects that
+ * are not included (or partially included) in the CPU set \p cpuset.
+ *
+ * \p flags is a OR'ed set of hwloc_restrict_flags_e.
+ */
+HWLOC_DECLSPEC int hwloc_topology_restrict(hwloc_topology_t __hwloc_restrict topology, hwloc_const_cpuset_t cpuset, unsigned long flags);
 /** @} */

Other restrict flags that could be added later include:
* restricting memory and/or CPUs only (we do both right now)
* dropping PCI devices or not
* dropping misc devices or not
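
To make the intended semantics a bit more concrete, here is a minimal toy
sketch. It does not use hwloc's real data structures: struct toy_obj,
toy_restrict() and toy_count() are hypothetical names, and CPU sets are plain
64-bit masks. It also assumes one reading of the doc comment above, namely
that objects with no CPU left in the given set are dropped, while partially
included objects are kept with a narrowed CPU set.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_CHILDREN 4

/* Hypothetical toy object, NOT hwloc's hwloc_obj_t. */
struct toy_obj {
    uint64_t cpus;                              /* bit i set => CPU i is in this object */
    size_t nchildren;
    struct toy_obj *children[TOY_MAX_CHILDREN];
};

/* Restrict the subtree rooted at obj to the CPUs in `allowed`.
 * Returns 1 if obj survives, 0 if its parent must drop it. */
static int toy_restrict(struct toy_obj *obj, uint64_t allowed)
{
    size_t i, kept = 0;

    /* Narrow this object's CPU set; an empty result means the object goes away. */
    obj->cpus &= allowed;
    if (obj->cpus == 0)
        return 0;

    /* Keep only the children that still contain at least one allowed CPU. */
    for (i = 0; i < obj->nchildren; i++) {
        if (toy_restrict(obj->children[i], allowed))
            obj->children[kept++] = obj->children[i];
    }
    obj->nchildren = kept;
    return 1;
}

/* Count the objects remaining in the subtree rooted at obj. */
static size_t toy_count(const struct toy_obj *obj)
{
    size_t i, n = 1;

    for (i = 0; i < obj->nchildren; i++)
        n += toy_count(obj->children[i]);
    return n;
}
```

With a toy machine made of two sockets of two cores each (CPUs 0-3),
restricting to CPUs 1 and 2 keeps both sockets with one core each, while
restricting to CPU 1 alone leaves a single socket with a single core.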

> We'll need to make
> sure that you'll have everything you need to get your cpuset's
> hwloc_cpuset_t.

For the record, Bernd may use hwloc_topology_restrict() to load the
entire machine topology from XML and restrict it to the Linux cpuset
that was allocated to the current process.

Ideally, any process that runs in the Linux cpuset could:
* load the entire topology from XML (with HWLOC_THISSYSTEM=1 so that
binding still works)
* get the current process binding (should be the entire Linux cpuset
unless the process was bound to something else)
* pass it to hwloc_topology_restrict()
* export the resulting topology to XML if you want to pass the topology
to other processes
(I need to think about adding an option to lstopo to do something like this)

If the process (for instance an MPI job) is bound to a single processor
within the cpuset, the above would restrict the topology to a single
CPU. So my feeling is that the batch scheduler should do the above
manually, right after creating the Linux cpuset, and before the actual
programs are launched (and possibly bound to some processors).