A physical meeting about this in the near future is unlikely; I think
we're all facing travel restrictions and constraints with the holidays
coming up.
How about a teleconf to discuss the following about the notifier:
- what exactly is there today
- why what is there today is the way it is
- proposals for different ways to do it
More specifically, I think we all agree that the idea of an MPI
application notifying a higher-level entity when it detects errors is
a good one (e.g., on the host, or in the network, or ...). I think
that it is worth discussing in higher bandwidth so that we can avoid
email hell (I agree with Ralph; this could devolve pretty easily).
I propose any of the following times to discuss (I'll setup a phone
- Mon, Dec 8, 2pm, 3pm, or 4pm Eastern
- Tue, Dec 9, 10am, noon, 1pm, 2pm, 3pm, or 4pm Eastern
- Wed, Dec 10, any time
- Thu, Dec 11, 11am, 1pm, 2pm, 3pm, or 4pm Eastern
- Fri, Dec 12, 9am, 10am, 11am, 2pm, 3pm, or 4pm Eastern
On Dec 4, 2008, at 3:16 PM, Ralph Castain wrote:
> I'm beginning to believe that we need a design meeting specifically
> over this question. Too many unknowns exist, with significant
> potential problems lurking behind them. Frankly, this issue could
> have a major impact on how we operate, performance, and a variety of
> other factors going forward - many of which may be difficult to
> I suspect there may not be "optimal" solutions to many of these
> questions, but there certainly will be strong opinions in multiple
> As part of that discussion, I propose that we consider alternative
> methods for meeting the same overall objective - namely, reuse of
> the BTL's by another software project. For example, a simple copy-
> and-branch is the dominant method today, with patches used by both
> parties to cherry-pick the changes they want from the other code
> users. Multiple tools have been developed to support this mode of
> operation, yet we haven't discussed any of them in this context. The
> proposed approach contains a number of impacts that may be avoided
> with an alternative approach.
> Without such a meeting, I fear we are going to rapidly dissolve into
> email hell again.
> On Dec 4, 2008, at 1:07 PM, Eugene Loh wrote:
>> Richard Graham wrote:
>>> I expect this will involve some sort of well-defined interface
>>> between the btls and orte, and I don't know if this will also
>>> require something like this between the btls and the pml. I
>>> think that interface is rigidly enforced, but am not sure.
>> I'm probably missing the scope of what you're saying here, but it
>> raises another question in my mind. Is there today a well-defined
>> interface between the BTLs and... anything else? PML or whatever?
>> Maybe this comes back to a documentation question: do we (or will
>> we) have anything written down that says what a BTL must do, what
>> it may rely on, etc.?
>> devel mailing list