I'm not sure how many apps would benefit, but we are always interested in
taking back patches that extend the ability for researchers to explore new
capabilities provided they don't impact performance (or can be configured
out if they do) and are self-maintained (i.e., either the researcher agrees
to maintain them, or - as in this case - they involve a change that
essentially requires no ongoing maintenance).
So if you want to take a crack at this, I'd suggest taking the one or two of
most interest and sending us the required patch for review. If it looks
like things fit well, then (a) we can absorb the patches, and (b) it would
probably be worth your time to submit a contributor agreement and join the
team as full committers.
On Wed, Dec 4, 2013 at 3:25 AM, "Isaías A. Comprés Ureña" <
> Dear Jeff Squyres,
> On 12/03/2013 11:27 PM, Jeff Squyres (jsquyres) wrote:
>> I'm sorry; I really wasn't paying attention to my email the week of SC,
>> and then I was on vacation for the Thanksgiving holiday. :-\
>> More below.
>> On Nov 20, 2013, at 4:13 PM, Compres <compresu_at_[hidden]> wrote:
>>> I was at the birds of a feather and wanted to talk to the Open MPI
>>> developers, but unfortunately had to leave early. In particular, I would
>>> like to discuss your implementation of the MPI tools interface and
>>> possibly contribute to it later on.
>> Sorry we missed you.
> No problem; I had to be at a booth during times that overlapped with your
>> What did you want to discuss? We actually have a full implementation of
>> the MPI_T interface -- meaning that we have all the infrastructure in place
>> for MPI_T control and performance variables.
>> 1. The MPI_T control variables map directly to OMPI's MCA params, so we
>> automatically expose oodles of cvars through MPI_T. They're all read-only
>> after MPI_INIT, however -- many things are set up during MPI_INIT and it
>> would be quite a Big Deal if they were to change. However, we pretty much
>> *assumed* all cvars shouldn't change after INIT -- we didn't really audit
>> to see if there were actually some cvars that could change after INIT. So
>> there's work that could be done there (i.e., find cvars that could change
>> after INIT, and/or evaluate the amount of work/change it would be to change
>> some read-only cvars to be read-write, etc.).
>> 2. The MPI_T performance variables are new. There are only a few created
>> right now (e.g., in the Cisco usnic BTL). But the field is pretty wide
>> open here -- the infrastructure is there, but we're really not exposing
>> much information yet. There's lots that can be done here.
>> What did you have in mind?
> I think you made a good guess on what we would like to do here. We are
> working on automatic tuning based on both modeling and empirical data. One
> of our aims is to accelerate the data collection part (in this case related
> to MPI settings), by doing it online without the need of full application
> runs or restarts.
> Right now we can modify MPI runtime parameters with IBM-MPI or Open MPI.
> These require full restarts, since they are set as environment variables
> and are not modifiable after MPI_INIT. With your MPI_T implementation, we
> can do the same programmatically but cannot avoid the restarts or full runs.
> We already did what you describe at the end of 1., but with a (one-year-old)
> snapshot of MPICH. The idea was to identify which variables could be made
> modifiable at runtime, and whether there was any attainable performance gain
> as a result of tuning them. We only explored point-to-point and collective
> communication parameters, and the results are encouraging. There was no
> technical reason for picking MPICH for the first prototype.
> With MPICH, we had to examine the code for things that were configurable.
> It seems to me that in the case of Open MPI, most of the work is done and,
> as you point out, it may just be necessary to identify which ones can be
> made modifiable at runtime and at what development cost.
> My main intention here is to see if other people are interested and would
> benefit from this. Additionally, if the changes (patches) are taken by the
> project, we avoid falling out of sync (which is what ended up happening
> with our MPICH modifications).
> - Isaías A. Comprés