Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] Merging the PCI branch?
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-03-31 12:06:33


On Mar 28, 2011, at 5:26 PM, Brice Goglin wrote:

> First, to avoid breaking existing applications, I/O devices are not
> added to the topology unless a new topology flag is set. Only lstopo
> enables PCI devices by default.

Good. I think we should plan to make this the default in some future version, and say so in the docs.
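
To make the opt-in behavior concrete, here is a minimal sketch assuming a bitmask-style topology flag. The names `FLAGDEMO_IO_DEVICES` and `struct flagdemo_topology` are hypothetical stand-ins; the actual flag name is not given in this thread.

```c
/* Hypothetical stand-ins: neither the flag name nor the topology
 * layout comes from the patch under discussion. */
enum flagdemo_flags {
    FLAGDEMO_IO_DEVICES = 1 << 0   /* opt in to I/O device discovery */
};

struct flagdemo_topology {
    unsigned long flags;   /* 0 by default, so existing applications see no change */
};

/* Returns nonzero if I/O objects should be added to the topology. */
static int flagdemo_io_enabled(const struct flagdemo_topology *topo)
{
    return (topo->flags & FLAGDEMO_IO_DEVICES) != 0;
}

/* Sets the flag, as a tool like lstopo would do before discovery. */
static void flagdemo_enable_io(struct flagdemo_topology *topo)
{
    topo->flags |= FLAGDEMO_IO_DEVICES;
}
```

Making I/O discovery the default later would presumably amount to inverting this test, which is why it seems worth announcing in the docs ahead of time.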

> We have 3 new object types:
> * PCI devices, with usual PCI bus IDs and link speed attributes
> * Bridges, with attributes for both sides, either host->pci or pci->pci
> bridges for now.
> * OS devices, which tell you which "ethX" interface, "sdX" block device,
> "mlx4_0" IB NIC or "dma0chan1" DMA engine channel corresponds to a PCI
> device.
>
> As shown in the attached picture, the usual I/O subtree is, from top to
> bottom:
> * some hostbridge objects are attached to some "normal" object (machine
> or node)
> * a tree of bridges may be behind the hostbridge
> * PCI devices are attached behind bridges
> * some PCI devices contain OS devices.

How and where do these new devices show up in the tree that hwloc returns? For example, are PCI buses children of NUMA nodes, or siblings?
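
Going only by the description quoted above (hostbridges attached to a machine or node object, then bridges, PCI devices, and OS devices below), an application could find where an I/O object hangs by walking up parent pointers until it reaches a non-I/O object. The sketch below assumes that layout; `struct tree_obj` and the `TREE_*` types are hypothetical stand-ins, not the hwloc types.

```c
#include <stddef.h>

/* Hypothetical stand-in object types for illustration only. */
enum tree_type {
    TREE_MACHINE, TREE_NODE,                     /* "normal" objects */
    TREE_BRIDGE, TREE_PCI_DEVICE, TREE_OS_DEVICE /* I/O objects */
};

struct tree_obj {
    enum tree_type type;
    struct tree_obj *parent;   /* NULL at the root */
};

/* Returns nonzero for the three I/O object types. */
static int tree_is_io(const struct tree_obj *obj)
{
    return obj->type == TREE_BRIDGE
        || obj->type == TREE_PCI_DEVICE
        || obj->type == TREE_OS_DEVICE;
}

/* Walks up from obj and returns the first ancestor-or-self that is not
 * an I/O object, or NULL if the chain ends without finding one. */
static const struct tree_obj *tree_non_io_ancestor(const struct tree_obj *obj)
{
    while (obj != NULL && tree_is_io(obj))
        obj = obj->parent;
    return obj;
}
```

Under that assumption the I/O subtree would be a child of the node (or machine) it is attached to, rather than a sibling.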

> These new objects are special:
> * They have no cpusets
> * They may appear at random places in the topology, with very different
> numbers of bridges above them. So we don't associate a "level" or a
> "depth" to these new types. If you ever need to enumerate them, use the
> new get_next_osdev() or get_next_pcidev() functions. This may need a bit
> more documentation.
>
> libpci is needed to make this work. And only Linux gives you OS devices
> for now (we use sysfs to translate between pci devs and os devs).

Is libpci available on all platforms? Or is it only needed on Linux?

(Do you need any assistance with the configury?)
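
Since these objects have no level or depth, enumeration goes through the get_next-style calls. Here is a sketch of that calling pattern, where passing NULL yields the first device and each later call passes the previous result. `iterdemo_get_next_pcidev()` is a hypothetical stand-in over a simple linked list; it mirrors the calling convention described above, not hwloc's implementation.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the topology and object types. */
struct iterdemo_obj {
    struct iterdemo_obj *next_pcidev;   /* next PCI device, or NULL */
};

struct iterdemo_topology {
    struct iterdemo_obj *first_pcidev;  /* NULL when no PCI device exists */
};

/* Returns the first PCI device if prev is NULL, else the one after prev. */
static struct iterdemo_obj *
iterdemo_get_next_pcidev(struct iterdemo_topology *topo, struct iterdemo_obj *prev)
{
    return prev != NULL ? prev->next_pcidev : topo->first_pcidev;
}

/* Counts PCI devices by iterating until the getter returns NULL. */
static unsigned iterdemo_count_pcidevs(struct iterdemo_topology *topo)
{
    unsigned count = 0;
    struct iterdemo_obj *dev = NULL;

    while ((dev = iterdemo_get_next_pcidev(topo, dev)) != NULL)
        count++;
    return count;
}
```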

> I also added some GPU-related OS devices by looking at DRM objects
> (card0 and controlD64 in the attached picture). This only works with
> free graphics drivers. Ideally we would have some CUDA or OpenCL device
> ID there, but we'll likely need some specific plugins to do so. I don't
> know if the current DRM objects are useful; we'll be able to remove them
> later if needed.

We should ping Intel, NVIDIA, and others for assistance with this.

> --- a/include/hwloc.h
> +++ b/include/hwloc.h
> @@ -191,6 +191,17 @@ typedef enum {
> * Objects without particular meaning, that can e.g. be
> * added by the application for its own use.
> */
> +
> + HWLOC_OBJ_BRIDGE, /**< \brief Bridge.
> + * Any bridge that connects the host or an I/O bus,
> + * to another I/O bus.
> + */
> + HWLOC_OBJ_PCI_DEVICE, /**< \brief PCI device.
> + */
> +
> + HWLOC_OBJ_OS_DEVICE, /**< \brief Operating system device.
> + */
> +
> HWLOC_OBJ_MAX /**< \private Sentinel value */
>
> /* ***************************************************************
> @@ -226,6 +237,20 @@ enum hwloc_compare_types_e {
> HWLOC_TYPE_UNORDERED = INT_MAX /**< \brief Value returned by hwloc_compare_types when types can not be compared. \hideinitializer */
> };
>
> +
> +typedef enum hwloc_obj_bridge_type_e {
> + HWLOC_OBJ_BRIDGE_HOST, /**< \brief Host-side of a bridge, only possible upstream. */
> + HWLOC_OBJ_BRIDGE_PCI /**< \brief PCI-side of a bridge. */
> +} hwloc_obj_bridge_type_t;
> +
> +typedef enum hwloc_obj_osdev_type_e {
> + HWLOC_OBJ_OSDEV_BLOCK, /**< \brief Operating system block device. */
> + HWLOC_OBJ_OSDEV_GPU, /**< \brief Operating system GPU device. */
> + HWLOC_OBJ_OSDEV_NETWORK, /**< \brief Operating system network device. */
> + HWLOC_OBJ_OSDEV_INFINIBAND, /**< \brief Operating system infiniband device. */
> + HWLOC_OBJ_OSDEV_DMA /**< \brief Operating system dma device. */
> +} hwloc_obj_osdev_type_t;
> +
> /** @} */

Do iWARP and RoCE devices show up, too? I.e., should it be "INFINIBAND" or "OPENFABRICS"?
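
For reference, here is a sketch of how an application might consume the OS device type. The enum is a local copy of the values in the quoted hunk (renamed with a `NAMEDEMO_` prefix so it compiles without hwloc), and the string labels are made up for illustration.

```c
/* Local copy of the OS device types from the quoted patch; the
 * NAMEDEMO_ prefix and the labels below are illustrative only. */
typedef enum {
    NAMEDEMO_OSDEV_BLOCK,
    NAMEDEMO_OSDEV_GPU,
    NAMEDEMO_OSDEV_NETWORK,
    NAMEDEMO_OSDEV_INFINIBAND,
    NAMEDEMO_OSDEV_DMA
} namedemo_osdev_type_t;

/* Maps an OS device type to a short label. */
static const char *namedemo_osdev_type_name(namedemo_osdev_type_t type)
{
    switch (type) {
    case NAMEDEMO_OSDEV_BLOCK:      return "block";
    case NAMEDEMO_OSDEV_GPU:        return "gpu";
    case NAMEDEMO_OSDEV_NETWORK:    return "network";
    case NAMEDEMO_OSDEV_INFINIBAND: return "infiniband";
    case NAMEDEMO_OSDEV_DMA:        return "dma";
    }
    return "unknown";
}
```

If iWARP and RoCE devices end up under the same value, application code like this would carry the "infiniband" label for them too, which is the naming concern.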

> @@ -403,6 +428,34 @@ union hwloc_obj_attr_u {
> struct hwloc_group_attr_s {
> unsigned depth; /**< \brief Depth of group object */
> } group;
> + /** \brief PCI Device specific Object Attributes */
> + struct hwloc_pcidev_attr_u {
> + unsigned short domain;
> + unsigned char bus, dev, func;
> + unsigned short class_id;
> + unsigned short vendor_id, device_id, subvendor_id, subdevice_id;
> + unsigned char revision;
> + float linkspeed; /* in GB/s */
> + } pcidev;
> + /** \brief Bridge specific Object Attributes */
> + struct hwloc_bridge_attr_u {
> + union hwloc_bridge_upstream_attr_u {
> + struct hwloc_pcidev_attr_u pci;
> + } upstream;
> + hwloc_obj_bridge_type_t upstream_type;
> + union hwloc_bridge_downstream_attr_u {
> + struct hwloc_bridge_downstream_pci_attr_u {
> + unsigned short domain;
> + unsigned char secondary_bus, subordinate_bus;
> + } pci;
> + } downstream;
> + hwloc_obj_bridge_type_t downstream_type;
> + unsigned depth;
> + } bridge;
> + /** \brief OS Device specific Object Attributes */
> + struct hwloc_osdev_attr_u {
> + hwloc_obj_osdev_type_t type;
> + } osdev;
> };
>
> /** \brief Distances between objects
>
> /** \brief Restrict the topology to the given CPU set.
> @@ -1675,6 +1770,27 @@ HWLOC_DECLSPEC int hwloc_free(hwloc_topology_t topology, void *addr, size_t len)
> /** @} */
>
>
> +
> +/** \defgroup hwlocality_iodev Basic I/O Device Management
> + * @{
> + */
> +
> +/** \brief Get the next PCI device in the system.
> + *
> + * \return the first PCI device if \p prev is \c NULL.
> + */
> +HWLOC_DECLSPEC struct hwloc_obj * hwloc_get_next_pcidev(struct hwloc_topology *topology, struct hwloc_obj *prev);
> +
> +/** \brief Get the next OS device in the system.
> + *
> + * \return the first OS device if \p prev is \c NULL.
> + */
> +HWLOC_DECLSPEC struct hwloc_obj * hwloc_get_next_osdev(struct hwloc_topology *topology, struct hwloc_obj *prev);
> +
> +/** @} */
> +
> +
> +
> #ifdef __cplusplus
> } /* extern "C" */
> #endif
>
>
> <pci.png>
> _______________________________________________
> hwloc-devel mailing list
> hwloc-devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel
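
As a sketch of how the new attribute fields might be used: the structs below are trimmed local copies of the PCI device and downstream-bridge fields from the quoted patch, renamed with an `attrdemo_` prefix so the example compiles without hwloc. Interpreting `secondary_bus` and `subordinate_bus` as an inclusive range of bus numbers behind the bridge follows the usual PCI convention; that reading is an assumption here, not something the patch states.

```c
#include <stdio.h>

/* Trimmed local copies of fields from the quoted patch. */
struct attrdemo_pcidev {
    unsigned short domain;
    unsigned char bus, dev, func;
};

struct attrdemo_bridge_downstream {
    unsigned short domain;
    unsigned char secondary_bus, subordinate_bus;
};

/* Writes the usual "domain:bus:dev.func" bus ID into buf.
 * Returns what snprintf returns. */
static int attrdemo_busid(const struct attrdemo_pcidev *d, char *buf, size_t len)
{
    return snprintf(buf, len, "%04x:%02x:%02x.%01x",
                    (unsigned)d->domain, (unsigned)d->bus,
                    (unsigned)d->dev, (unsigned)d->func);
}

/* Returns nonzero if the device's bus lies in the bridge's downstream
 * range [secondary_bus, subordinate_bus] within the same domain. */
static int attrdemo_bridge_covers(const struct attrdemo_bridge_downstream *b,
                                  const struct attrdemo_pcidev *d)
{
    return d->domain == b->domain
        && d->bus >= b->secondary_bus
        && d->bus <= b->subordinate_bus;
}
```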

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/