Subject: [OMPI devel] RFC: usnic BTL MPI_T pvar scheme
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-11-05 17:37:19

WHAT: suggestion for how to expose multiple MPI_T pvar values for a given variable.

WHY: so that we have a common convention across OMPI (and possibly set a precedent for other MPI implementations...?).

WHERE: ompi/mca/btl/usnic, but if everyone likes it, potentially elsewhere in OMPI

TIMEOUT: before 1.7.4, so let's set a first timeout of next Tuesday's teleconf (Nov 12)

More detail:

Per my discussion on the call today, I'm attaching a PPT that describes how we're exposing MPI_T performance variables in the usnic BTL in the multi-BTL case.

Feedback is welcome, especially because we're the first MPI implementation to expose MPI_T pvars in this way (already committed on the trunk and targeted for 1.7.4). So this methodology may well become a useful precedent.

** Issue #1: we want to expose each usnic BTL pvar (e.g., btl_usnic_num_sends) on a per-usnic-BTL-*module* basis. How to do this?

1. Add a prefix or suffix to each pvar name (e.g., btl_usnic_num_sends_0, btl_usnic_num_sends_1, etc.).
2. Return an array of values under the single name (btl_usnic_num_sends) -- one value for each BTL module.

We opted for the 2nd option. The MPI_T pvar interface already reports the array length for a pvar (the count returned by MPI_T_pvar_handle_alloc), so a tool can size its read buffer without any extra convention.

Specifically: btl_usnic_num_sends returns an array of N values, where N is the number of usnic BTL modules being used by the MPI process. Each slot in the array corresponds to the value from one usnic BTL module.
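To make the convention concrete, here is a rough sketch in plain C. It is not the usnic BTL code and it does not call MPI_T; the type module_stats_t and the functions num_sends_count and num_sends_read are hypothetical names invented for illustration. It only models the idea: one pvar name, one slot per module, with the count available before the read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the counters kept by one usnic BTL module. */
typedef struct {
    uint64_t num_sends;
} module_stats_t;

/* Hypothetical "how many values?" query: one slot per module. */
size_t num_sends_count(size_t num_modules)
{
    return num_modules;
}

/* Hypothetical "read" step: copy module i's counter into slot i.
 * Returns the number of slots written, or 0 if the caller's buffer
 * is too small to hold one value per module. */
size_t num_sends_read(const module_stats_t *modules, size_t num_modules,
                      uint64_t *out, size_t out_len)
{
    assert(modules != NULL && out != NULL);
    if (out_len < num_modules) {
        return 0;
    }
    for (size_t i = 0; i < num_modules; ++i) {
        out[i] = modules[i].num_sends;
    }
    return num_modules;
}
```

A tool would first ask for the count, allocate a buffer of that many values, and then read; slot i in the buffer belongs to module i.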

** Issue #2: but how do you map a given value to an underlying Linux usnic interface?

Our solution was twofold:

1. Guarantee that the ordering of values is the same in all pvar arrays (i.e., usnic BTL module 0 will always be in slot 0, usnic BTL module 1 will always be in slot 1, etc.).

2. Add another pvar that is an MPI_T state variable with an associated MPI_T "enumeration", which contains string names of the underlying Linux devices. This allows you to map a given value from a pvar to an underlying Linux device (e.g., from usnic BTL module 2 to /dev/usnic_3, or whatever).
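The two-part mapping can be sketched in plain C as follows. Again, this is a model of the idea and not the actual usnic BTL or MPI_T code: enum_item_t, enum_name, and device_for_slot are hypothetical names, and the state array stands in for the values a tool would read from the state pvar.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one item in an MPI_T-style enumeration:
 * an integer value paired with a string name. */
typedef struct {
    int value;
    const char *name;
} enum_item_t;

/* Look up the name for an enumeration value; NULL if not found. */
const char *enum_name(const enum_item_t *items, size_t num_items, int value)
{
    for (size_t i = 0; i < num_items; ++i) {
        if (items[i].value == value) {
            return items[i].name;
        }
    }
    return NULL;
}

/* state[slot] models the state pvar's value for that module slot.
 * Because every pvar array uses the same slot ordering, the same slot
 * index can be used against any other per-module pvar array. */
const char *device_for_slot(const int *state, size_t num_slots,
                            const enum_item_t *items, size_t num_items,
                            size_t slot)
{
    assert(state != NULL && items != NULL);
    if (slot >= num_slots) {
        return NULL;
    }
    return enum_name(items, num_items, state[slot]);
}
```

With this, a tool that sees a value in slot 2 of btl_usnic_num_sends would look up slot 2 of the state pvar and translate it through the enumeration to get the device name.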

See the attached PPT.

If people have no objection to this, we should use this convention across OMPI (e.g., for other BTLs that expose MPI_T pvars).

Jeff Squyres
