Brian and I chatted about this on the phone today. Conclusions that
we came to:
1. We need to add a few lines of code to ensure that the MCA base
refuses to open components that have a different MCA version number
(i.e., dlopen a DSO, dlsym to get the component struct, check the
version number, if it's not the same MCA major.minor as our MCA
major.minor, dlclose it). This is easy to do; I'll add it to the hg.
2. Let's set the precedent now that changing the MCA version does
*not* force a change of all the framework version numbers. The
framework version numbers refer only to the frameworks' own
interfaces; it's the triple of (MCA, framework, component) version
numbers that uniquely identifies a component.
3. The load-time issues of mixing multiple MCA versions are solved by
points #1 and #2.
4. Leave the bump of all framework versions to 2.0 in place because a
good number of them had to be bumped anyway. We're probably bumping a
few that didn't actually need to be bumped (i.e., those that didn't
actually change since the v1.2 series), but what the heck -- most of
them have changed, and it's a bunch of work to roll all that out. So
let's just bump them, but not because we bumped the MCA version
number; rather, we bump them because we knew that most of them needed
to be bumped, but were too lazy to check and see exactly which ones
needed it (hey, let's be honest here...).
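
For concreteness, the check in point #1 could look roughly like the
sketch below. This is only an illustration of the dlopen / dlsym /
compare / dlclose sequence: the struct layout, the SKETCH_* macros, and
both function names are made up here and are not the actual MCA base
code.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical: the MCA version this (imaginary) base was built with. */
#define SKETCH_MCA_MAJOR 2
#define SKETCH_MCA_MINOR 0

/* Hypothetical component header; only the MCA version fields matter here. */
typedef struct {
    int mca_major;
    int mca_minor;
    int mca_release;
} sketch_component_t;

/* Nonzero if the component was built against our MCA major.minor.
 * The release number is deliberately ignored. */
static int sketch_mca_version_ok(const sketch_component_t *c)
{
    return c != NULL &&
           c->mca_major == SKETCH_MCA_MAJOR &&
           c->mca_minor == SKETCH_MCA_MINOR;
}

/* dlopen the DSO, dlsym the component struct, check the MCA version,
 * and dlclose on a mismatch. Returns the dlopen handle, or NULL if the
 * DSO could not be opened or was rejected. */
static void *sketch_open_component(const char *path, const char *symbol,
                                   const sketch_component_t **out)
{
    assert(path != NULL && symbol != NULL && out != NULL);

    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        return NULL;
    }

    const sketch_component_t *c = dlsym(handle, symbol);
    if (!sketch_mca_version_ok(c)) {
        dlclose(handle);   /* wrong MCA major.minor (or no such symbol) */
        return NULL;
    }

    *out = c;
    return handle;
}
```

Keeping the comparison in its own small function means the policy
(major.minor only) lives in one place if we ever want to change it.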
If no one has any objections to this, I'll bring this stuff into the
trunk at the original timeout -- Friday COB (i.e., tomorrow).
On Jul 21, 2008, at 8:55 PM, Jeff Squyres wrote:
> On Jul 21, 2008, at 6:57 PM, Brian W. Barrett wrote:
>> I guess I don't understand. I thought there were three versions in the
>> component -- the MCA version, the framework version, and the component
>> version. The first two should determine if the component can safely be
>> loaded and the third is to identify the component. I agree that for this
>> change (an MCA-level change), the MCA version *should* change. But the
>> framework interface didn't change (well, not as a result of this
>> change), meaning that the framework version *should not* change. The MCA
>> load infrastructure should see that the MCA versions don't match, and not
>> load the component.
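
The three-version scheme described above can be sketched as follows:
the MCA and framework versions decide whether a component is loadable,
and the full (MCA, framework, component) triple decides whether two
components are "the same". Every type and function name below is
hypothetical, and comparing the framework version on major.minor is an
assumption made only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef struct {
    int vmajor, vminor, vrelease;
} sketch_version_t;

/* Hypothetical identity record for one component. */
typedef struct {
    const char *framework_name;
    const char *component_name;
    sketch_version_t mca;        /* MCA base interface it was built against */
    sketch_version_t framework;  /* framework interface it implements */
    sketch_version_t component;  /* version of the component itself */
} sketch_component_id_t;

static bool sketch_same_major_minor(sketch_version_t a, sketch_version_t b)
{
    return a.vmajor == b.vmajor && a.vminor == b.vminor;
}

static bool sketch_same_version(sketch_version_t a, sketch_version_t b)
{
    return sketch_same_major_minor(a, b) && a.vrelease == b.vrelease;
}

/* Loadability looks only at the first two versions. */
static bool sketch_can_load(const sketch_component_id_t *c,
                            sketch_version_t base_mca,
                            sketch_version_t framework)
{
    assert(c != NULL);
    return sketch_same_major_minor(c->mca, base_mca) &&
           sketch_same_major_minor(c->framework, framework);
}

/* Identity is the names plus the whole version triple. */
static bool sketch_same_component(const sketch_component_id_t *a,
                                  const sketch_component_id_t *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a->framework_name, b->framework_name) == 0 &&
           strcmp(a->component_name, b->component_name) == 0 &&
           sketch_same_version(a->mca, b->mca) &&
           sketch_same_version(a->framework, b->framework) &&
           sketch_same_version(a->component, b->component);
}
```

With this, a component built against MCA v1.0.0 and the identical
source built against MCA v2.0.0 compare as different components, and
only the one matching the running MCA base is loadable.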
> Josh and I wrestled with this question for a bit and probably fell
> down on the side of conservatism; that's where this came from.
> There were two reasons why we went this way:
> 1. You could (for example) have a coll framework v1.2.3 component
> built with MCA v1.0.0 and the same coll framework v1.2.3 component
> built against MCA v2.0.0, and they would be different. Worse, they
> won't be "equal". Specifically, MCA 2.0.0 supports some minor
> features that v1.0.0 doesn't -- so even though you have 2 of the
> "same" component, they're not really the same. (*more on this below)
> 2. Another issue seemed pretty icky to solve, which led us to fall
> down a little heavier on the side of bumping all the framework
> version numbers. Let's say you have some Foo framework DSOs, some
> of which are MCA v1.0.0 and some of which are v2.0.0. The Foo
> framework interface is the same between the two. The MCA base can
> find/open all of them easily enough; but how do we return all the
> components to the caller? I could think of 3 ways:
> A. return multiple lists to the caller: a list of each of v1.0.0
> and v2.0.0 components. This means that every framework will need to
> handle (or be able to reject or specify to the MCA base to reject
> before even accepting as available) both MCA v1.0.0 and v2.0.0
> components.
> B. return a single list to the caller with both MCA component
> versions in the list. Pretty much the same as A, but it scales
> better if we get in the business of changing the MCA version a lot
> (please God no); I mention it mainly for completeness.
> C. return a single list to the caller with all components
> "upgraded" to MCA v2.0. This seems like a nice solution -- a la the
> experiment we tried with coll a long time ago to prove to ourselves
> that run-time versioning could work (for those of you who don't
> remember: we had some coll v1.0.0 and some v1.1.0 components; the
> coll base transparently handled everything at run-time). However,
> there's a problem with this idea: since all frameworks use the
> component struct as a "super" for their component structs, the MCA
> base does not know the total size of the component public struct.
> So it cannot "upgrade" the MCA v1.0.0 structure in memory to a
> v2.0.0, because the v2.0.0 struct is bigger than the v1.0.0 struct.
> So we can't just magically treat everything as v2.0.0 components at
> the MCA base level; we'd have to have the frameworks transmogrify
> their own components (although we might be able to have some MCA
> base helper function that does the heavy lifting, as long as the
> framework supplied the total struct length).
> Note that all three of these solutions involve touching every
> framework in some way (although not every component).
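
To make option C concrete, here is a rough sketch of what such a base
helper might look like. The structs, field names, and the helper itself
are all invented for illustration; none of this is the real MCA
component layout. Beyond the total struct length mentioned above, this
sketch also assumes the framework passes the offset at which its own
fields start in each layout, because padding can differ between the two
versions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical v1 and v2 base component structs; v2 grew by one field. */
typedef struct {
    int mca_major, mca_minor;
    char name[32];
} sketch_base_v1_t;

typedef struct {
    int mca_major, mca_minor;
    char name[32];
    int new_v2_field;
} sketch_base_v2_t;

/* A hypothetical Foo framework uses the base struct as its "super". */
typedef struct {
    sketch_base_v1_t super;
    int priority;
} sketch_foo_v1_component_t;

typedef struct {
    sketch_base_v2_t super;
    int priority;
} sketch_foo_v2_component_t;

/* Layout facts only the framework knows, supplied to the base helper. */
typedef struct {
    size_t v1_total_len;  /* sizeof the framework's v1 component struct */
    size_t v1_tail_off;   /* offset of the first framework field in v1 */
    size_t v2_total_len;  /* sizeof the framework's v2 component struct */
    size_t v2_tail_off;   /* offset of the first framework field in v2 */
} sketch_layout_t;

/* Build a heap-allocated v2 copy of a v1 component: fill in a v2 base
 * header, then copy the framework-specific tail across unchanged.
 * Returns NULL on inconsistent sizes or allocation failure; the caller
 * frees the result. */
static void *sketch_upgrade_component(const void *v1,
                                      const sketch_layout_t *layout)
{
    assert(v1 != NULL && layout != NULL);

    if (layout->v1_tail_off < sizeof(sketch_base_v1_t) ||
        layout->v1_total_len < layout->v1_tail_off ||
        layout->v2_tail_off < sizeof(sketch_base_v2_t)) {
        return NULL;
    }
    size_t tail_len = layout->v1_total_len - layout->v1_tail_off;
    if (layout->v2_total_len < layout->v2_tail_off + tail_len) {
        return NULL;
    }

    unsigned char *v2 = calloc(1, layout->v2_total_len);
    if (v2 == NULL) {
        return NULL;
    }

    const sketch_base_v1_t *old_base = v1;
    sketch_base_v2_t *new_base = (sketch_base_v2_t *) v2;
    new_base->mca_major = 2;
    new_base->mca_minor = 0;
    memcpy(new_base->name, old_base->name, sizeof(new_base->name));
    new_base->new_v2_field = 0;   /* default for the field v1 lacks */

    memcpy(v2 + layout->v2_tail_off,
           (const unsigned char *) v1 + layout->v1_tail_off, tail_len);
    return v2;
}
```

A framework would fill in sketch_layout_t with sizeof and offsetof on
its own component structs, which is exactly the "touch every framework"
cost noted above.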
> All that being said, I suppose there are two arguments against these
> kinds of issues:
> - this situation probably won't happen in practice (component A
> compiled against MCA v1.0.0 and against MCA v2.0.0) because we only
> distribute components as part of full OMPI releases, and therefore
> they're fairly tightly bound to their MCA version. However, for
> components that didn't change between OMPI v1.2 and v1.3, you *will*
> have this scenario, but in different OMPI installation directories
> (and therefore it pretty much doesn't matter).
> - I think the crux of Brian's argument is the framework's version
> number is identifying *the framework's* interface -- not the whole
> interface (i.e., not including the MCA base interface). From this
> perspective, it *is* independent of the MCA version number.
> Specifically: the version of the framework interface is independent
> of the binary compatibility and features issues surrounding the MCA
> So Josh and I thought we picked a solution that was clear, simple,
> and one-of-several sucky options. :-\ We could probably be
> convinced to go another way if someone has strong feelings.
> Jeff Squyres
> Cisco Systems