Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] MCA base component changes
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-07-21 20:55:28

On Jul 21, 2008, at 6:57 PM, Brian W. Barrett wrote:

> I guess I don't understand. I thought there were three versions in
> every component -- the MCA version, the framework version, and the
> component version. The first two should determine if the component
> can safely be loaded and the third is to identify the component. I
> agree that for this change (an MCA-level change), the MCA version
> *should* change. However, the framework interface didn't change
> (well, not as a result of this change), meaning that the framework
> version *should not* change. The MCA load infrastructure should see
> that the MCA versions don't match, and not load the component.

Josh and I wrestled with this question for a bit and probably fell
down on the side of conservatism; that's where this came from. There
were two reasons why we went this way:

1. You could (for example) have a coll framework v1.2.3 component
built with MCA v1.0.0 and the same coll framework v1.2.3 component
built against MCA v2.0.0, and they would be different. Worse, they
won't be "equal". Specifically, MCA 2.0.0 supports some minor
features that v1.0.0 doesn't -- so even though you have 2 of the
"same" component, they're not really the same. (*more on this below)

2. Another issue seemed pretty icky to solve, which led us to fall
down a little heavier on the side of bumping all the framework version
numbers. Let's say you have some Foo framework DSOs, some of which
are MCA v1.0.0 and some of which are v2.0.0. The Foo framework
interface is the same between the two. The MCA base can find/open all
of them easily enough; but how do we return all the components to the
caller? I could think of 3 ways:

   A. return multiple lists to the caller: one list of MCA v1.0.0
components and one of v2.0.0 components. This means that every
framework will need to handle both MCA v1.0.0 and v2.0.0 components
(or be able to reject one kind, or tell the MCA base to reject it
before even accepting it as available).

   B. return a single list to the caller with both MCA component
versions in the list. Pretty much the same as A, but it scales
better if we get in the business of changing the MCA version a lot
(please God no); I mention it mainly for completeness.

   C. return a single list to the caller with all components
"upgraded" to MCA v2.0. This seems like a nice solution -- a la the
experiment we tried with coll a long time ago to prove to ourselves
that run-time versioning could work (for those of you who don't
remember: we had some coll v1.0.0 and some v1.1.0 components; the coll
base transparently handled everything at run-time). However, there's
a problem with this idea: since all frameworks use the component
struct as a "super" for their component structs, the MCA base does not
know the total size of the component public struct. So it cannot
"upgrade" the MCA v1.0.0 structure in memory to a v2.0.0, because the
v2.0.0 struct is bigger than the v1.0.0 struct. So we can't just
magically treat everything as v2.0.0 components at the MCA base level;
we'd have to have the frameworks transmogrify their own components
(although we might be able to have some MCA base helper function that
does the heavy lifting, as long as the framework supplied the total
struct length).
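
To make the size problem in C concrete, here is a rough sketch of what such a helper might look like. Everything in it is hypothetical: the struct layouts, the field names, and base_upgrade_v1_to_v2() are made up for illustration and are not the real MCA structs or API. It also assumes that the framework's own fields start immediately after the embedded header, which the static asserts check for this toy layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical MCA v1.0.0 component header (not the real OMPI struct). */
typedef struct {
    int  mca_major;
    char name[16];
} base_v1_t;

/* Hypothetical MCA v2.0.0 header: one extra field makes it bigger. */
typedef struct {
    int  mca_major;
    char name[16];
    int  feature_flags;
} base_v2_t;

/* Each framework embeds the header as its first member (the "super"). */
typedef struct { base_v1_t super; int priority; } foo_v1_t;
typedef struct { base_v2_t super; int priority; } foo_v2_t;

/* Assumption of this sketch: framework fields start right after the header. */
_Static_assert(offsetof(foo_v1_t, priority) == sizeof(base_v1_t), "v1 tail offset");
_Static_assert(offsetof(foo_v2_t, priority) == sizeof(base_v2_t), "v2 tail offset");

/*
 * Hypothetical base helper. The base only knows the header layouts, so the
 * framework must pass the total length of its v1 component struct. Returns
 * a newly allocated v2-layout copy (caller frees), or NULL on failure.
 */
void *base_upgrade_v1_to_v2(const void *v1, size_t v1_total_len)
{
    assert(v1_total_len >= sizeof(base_v1_t));

    const base_v1_t *old = v1;
    size_t tail_len = v1_total_len - sizeof(base_v1_t);
    unsigned char *v2 = calloc(1, sizeof(base_v2_t) + tail_len);
    if (NULL == v2) {
        return NULL;
    }

    /* Rewrite the header in the larger v2 layout. */
    base_v2_t *hdr = (base_v2_t *) v2;
    hdr->mca_major = 2;
    memcpy(hdr->name, old->name, sizeof(hdr->name));
    hdr->feature_flags = 0;   /* v1 components have none of the new features */

    /* Copy the framework-specific tail, which the base treats as opaque. */
    memcpy(v2 + sizeof(base_v2_t),
           (const unsigned char *) v1 + sizeof(base_v1_t), tail_len);
    return v2;
}
```

A framework would call it roughly as `foo_v2_t *up = base_upgrade_v1_to_v2(&old, sizeof(foo_v1_t));`. Note that the copy lands at a new address, which is exactly why the base cannot do this in place behind the framework's back.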

Note that all three of these solutions involve touching every
framework in some way (although not every component).

All that being said, I suppose there are two arguments against these
kinds of issues:
- this situation probably won't happen in practice (component A  
compiled against MCA v1.0.0 and against MCA v2.0.0) because we only  
distribute components as part of full OMPI releases, and therefore  
they're fairly tightly bound to their MCA version.  However, for  
components that didn't change between OMPI v1.2 and v1.3, you *will*  
have this scenario, but in different OMPI installation directories  
(and therefore it pretty much doesn't matter).
- I think the crux of Brian's argument is that the framework's version
number identifies *the framework's* interface -- not the whole
interface (i.e., not including the MCA base interface).  From this
perspective, it *is* independent of the MCA version number.
Specifically: the version of the framework interface is independent of
the binary compatibility and features issues surrounding the MCA base.
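
A minimal sketch of that view, in C: each component carries three independent versions, and the load decision looks at the MCA and framework versions separately. The type names, field names, and the "major numbers must match" rule are all assumptions made for illustration; they are not the actual MCA base structs or its real compatibility policy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical version triple. */
typedef struct { int major, minor, release; } version_t;

/* Hypothetical per-component version record. */
typedef struct {
    version_t   mca;        /* MCA base interface it was built against */
    version_t   framework;  /* framework interface it implements */
    version_t   component;  /* identifies the component; not used for loading */
    const char *name;
} component_versions_t;

/* Exact comparison of two version triples. */
bool same_version(version_t a, version_t b)
{
    return a.major == b.major && a.minor == b.minor && a.release == b.release;
}

/* Assumed compatibility rule for this sketch: major numbers must match. */
bool compatible(version_t built_against, version_t host)
{
    return built_against.major == host.major;
}

/*
 * The two checks are independent: an MCA-level change can make a component
 * unloadable even though its framework interface version is unchanged.
 */
bool can_load(const component_versions_t *c,
              version_t host_mca, version_t host_framework)
{
    return compatible(c->mca, host_mca) &&
           compatible(c->framework, host_framework);
}
```

With a host at MCA v2.0.0 and coll v1.2.3, a coll v1.2.3 component built against MCA v1.0.0 fails the first check and is skipped, while the same component built against MCA v2.0.0 loads; the framework version is identical in both cases.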

So Josh and I thought we picked a solution that was clear, simple, and
one-of-several sucky options.  :-\  We could probably be convinced to
go another way if someone has strong feelings.

Jeff Squyres
Cisco Systems