WHAT: suggestion for how to expose multiple MPI_T pvar values for a given variable.
WHY: so that we have a common convention across OMPI (and possibly set a precedent for other MPI implementations...?).
WHERE: ompi/mca/btl/usnic, but if everyone likes it, potentially elsewhere in OMPI
TIMEOUT: before 1.7.4, so let's set a first timeout of next Tuesday teleconf (Nov 12)
Per my discussion on the call today, I'm sending the attached PPT describing how we're exposing MPI_T performance variables in the usnic BTL when a process uses multiple usnic BTL modules.
Feedback is welcome, especially because we're the first MPI implementation to expose MPI_T pvars in this way (already committed on the trunk and targeted for 1.7.4). So this methodology may well become a useful precedent.
** Issue #1: we want to expose each usnic BTL pvar (e.g., btl_usnic_num_sends) on a per-usnic-BTL-*module* basis. How to do this?
1. Add a prefix/suffix on each pvar name (e.g., btl_usnic_num_sends_0, btl_usnic_num_sends_1, ...etc.).
2. Return an array of values under the single name (btl_usnic_num_sends) -- one value for each BTL module.
We opted for the 2nd option. The MPI_T pvar interface reports the array length for a pvar (MPI_T_pvar_handle_alloc returns the element count when a handle is allocated), so a tool can discover the number of values at run time.
Specifically: btl_usnic_num_sends returns an array of N values, where N is the number of usnic BTL modules being used by the MPI process. Each slot in the array corresponds to the value from one usnic BTL module.
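To make the array convention concrete, here is a minimal C sketch of how a tool might consume one array-valued pvar. It is only a model and deliberately does not call MPI, so it compiles without an MPI installation: the pvar_array type and both functions are hypothetical names invented for illustration, and the unsigned long long element type is an assumption (the real datatype is whatever MPI_T_pvar_get_info reports). In a real tool, the count would come from MPI_T_pvar_handle_alloc and the values from MPI_T_pvar_read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one array-valued pvar: a single name with
 * one value per usnic BTL module.  In a real tool, count would be the
 * "count" output of MPI_T_pvar_handle_alloc() and values would be
 * filled in by MPI_T_pvar_read(). */
typedef struct {
    const char *name;                  /* e.g., "btl_usnic_num_sends" */
    size_t count;                      /* N = number of usnic BTL modules */
    const unsigned long long *values;  /* values[i] belongs to module i */
} pvar_array;

/* Sum the per-module values to get a process-wide total. */
unsigned long long pvar_array_total(const pvar_array *pv)
{
    unsigned long long total = 0;
    assert(pv != NULL);
    for (size_t i = 0; i < pv->count; ++i) {
        total += pv->values[i];
    }
    return total;
}

/* Print one line per module slot; returns the number of lines printed. */
size_t pvar_array_print(const pvar_array *pv, FILE *out)
{
    assert(pv != NULL && out != NULL);
    for (size_t i = 0; i < pv->count; ++i) {
        fprintf(out, "%s[%zu] = %llu\n", pv->name, i, pv->values[i]);
    }
    return pv->count;
}
```

The point of the sketch is that the tool needs only one pvar name and a loop over N; it does not have to guess how many suffixed names (option 1) might exist.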
** Issue #2: how do you map a given value to an underlying Linux usnic interface?
Our solution was twofold:
1. Guarantee that the ordering of values in all pvar arrays is the same (i.e., usnic BTL module 0 will always be in slot 0, usnic BTL module 1 will always be in slot 1, ...etc.).
2. Add another pvar that is an MPI_T state variable with an associated MPI_T "enumeration", which contains string names of the underlying Linux devices. This allows you to map a given value from a pvar to an underlying Linux device (e.g., from usnic BTL module 2 to /dev/usnic_3, or whatever).
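The two-part mapping can be sketched in C roughly as follows. Again, this is a model rather than real MPI_T code: enum_item, enum_item_name, and device_for_slot are hypothetical names, and the sketch assumes the state pvar yields one int per module whose value is looked up in the enumeration (in a real tool, the items would be fetched with MPI_T_enum_get_info and MPI_T_enum_get_item). The device names used with it are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical copy of one MPI_T enumeration item: an integer value
 * and the string name associated with it. */
typedef struct {
    int value;
    const char *name;   /* e.g., the name of a Linux usnic device */
} enum_item;

/* Return the name for a given enum value, or NULL if it is not found. */
const char *enum_item_name(const enum_item *items, size_t n_items, int value)
{
    assert(items != NULL);
    for (size_t i = 0; i < n_items; ++i) {
        if (items[i].value == value) {
            return items[i].name;
        }
    }
    return NULL;
}

/* Map a pvar array slot to a device name.  state[slot] is the state
 * pvar's value for usnic BTL module `slot`; because every pvar array
 * uses the same module ordering, the same slot index is valid for
 * btl_usnic_num_sends and any other per-module pvar.
 * Returns NULL if the slot is out of range or the value is unknown. */
const char *device_for_slot(const int *state, size_t n_modules,
                            const enum_item *items, size_t n_items,
                            size_t slot)
{
    assert(state != NULL);
    if (slot >= n_modules) {
        return NULL;
    }
    return enum_item_name(items, n_items, state[slot]);
}
```

With that in place, a tool can label the value in slot i of any per-module pvar with device_for_slot(..., i) instead of a bare module index.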
See the attached PPT.
If people have no objection to this, we should use this convention across OMPI (e.g., for other BTLs that expose MPI_T pvars).