What: Proposal for a new release methodology for the Open MPI Project.
Why: We have [at least] 2 competing forces in Open MPI:
- desire to release new features quickly. Fast is good.
- desire to release based on production quality. Slow is good.
The competition between these two forces has created some tension in
the Open MPI community and a Very Long release cycle for OMPI v1.3
(yes, it was our specific and deliberate choice to be feature driven
-- but it was still verrrrry loooong).
How: Take some ideas from other well-established release paradigms,
such as:
- Linux kernel "odd/even" version number release methodology
- Red Hat/Fedora stable vs. feature releases
- Agile development models
When: For all releases after the v1.3 series (i.e., this proposal does
not include any releases in the v1.3 series).
--> Ralph and I will talk through all the details and answer any
questions on tomorrow's teleconference (Tue, 17 Feb 2009).
========================================================================
Details:
In v1.3, we let a lot of really good features sit in development for a
long, long time. Yes, we chose to do this and there were good reasons
for doing so, but the fact remains that we had some really good stuff
done and stable for long periods of time that wasn't generally
available to the users who wanted it. Even for users who are
willing to be out on the bleeding edge, trunk tarballs are just too
scary.
Given the two competing forces mentioned above (feature/fast releases
+ stable/slow releases), it seems like we really want two different --
but overlapping -- release mechanisms.
Taking inspiration from other well-established paradigms, Ralph and I
propose the following for all releases starting with v1.4.0:
- Have two concurrent release series:
  1. "Super stable": for production users who care about stability
     above all else. They're willing to wait long periods of time
     before updating to a new version of Open MPI.
  2. "Feature driven": for users who are willing to take a few chances
     to get new OMPI features -- but cannot endure the chaos of
     nightly trunk tarballs.
- The general idea is that a feature driven release is developed for a
  while in an RM-regulated yet somewhat agile development style. When
  specific criteria are met (e.g., feature complete, schedule driven,
  etc.), the feature release series is morphed into a super stable
  state and released. At this point, all development stops on that
  release series; only bug fixes are allowed.
- RMs therefore become responsible for *two* release series: a
  feature driven series and the corresponding super stable series that
  emerges from it.
***KEY POINT*** This "two release" methodology allows for the release
(and real-world testing) of new features in a much more timely fashion
than our current release methodology.
Here's a crude ASCII art representation of how branches will work
using this proposal in SVN:
            v1.3 series/super stable
           v1.3.0      v1.3.2                  v1.6.0
         /----|---|-------|-----------|---->    /-|---|---|->
trunk   /      v1.3.1              v1.3.3      /
------------------------------------------------------------------------>
      \ v1.4.0  v1.4.2  v1.4.4  ...              v1.5.0   v1.5.1
       \--|---|---|---|---|---|---|---|---|---------|--------|------>
            v1.4.1  v1.4.3  ...                now becomes
            v1.4/feature driven                v1.5/super stable
Here's how a typical release cycle works:
- Assume that a "super stable" version exists: a release series that
  has an odd minor number (v1.3, v1.5, v1.7, etc.).
  - For this example, let's assume that the super stable is v1.3.
  - Only bug fixes go into the "super stable" series.
- Plans for the next "super stable" are drawn up (v1.5 in this
  example), including a list of goals, new features, a timeline, etc.
- A new feature release series is created shortly after the first
  super stable release with a minor version number that is even (e.g.,
  v1.4, v1.6, v1.8, etc.).
  - In this example, the feature release series will be v1.4.
- The v1.4 branch is taken to a point with at least some degree of
  stability and released as v1.4.0.
- Development on the SVN trunk continues.
- According to a public schedule (probably tied to our teleconference
  schedule), the RMs will approve the moving of features and bug
  fixes to the feature release.
  - Rather than submitting CMRs for the Gatekeeper to move, the
    "owner" of a particular feature/bug fix will be assigned a
    specific time to move the item to the feature branch.
  - For example, George will have from Tues-Fri to move his cool new
    feature X to the v1.4 branch.
  - Friday night, new 1.4 tarballs are cut and everyone's MTT tries
    them out.
  - Iterate for the next week or so to get the v1.4 branch stable.
  - Rinse, repeat.
- Once the feature series meets certain criteria (e.g., feature
  complete, timeline is met, etc.), it undergoes a period of intense
  testing and debugging to achieve "super stable" status. Once "super
  stable" has been reached, the branch is renamed to be "v1.5" and we
  start the whole cycle again (with v1.6/v1.7).
  - CMRs and Gatekeepers are used on the super stable series.
  - The older super stable series (v1.3) then becomes either
    unsupported or "less supported."
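To make the odd/even convention concrete, here is a minimal Python sketch
that classifies a version string by the parity of its minor number. The
helper names (parse_series, series_kind) are hypothetical and exist only
for illustration; they are not part of any Open MPI tooling.

```python
def parse_series(version):
    """Return (major, minor) from a string such as 'v1.4.2' or '1.5'."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def series_kind(version):
    """Classify a version by its minor number: odd minor numbers are
    "super stable", even minor numbers are "feature driven"."""
    _, minor = parse_series(version)
    return "super stable" if minor % 2 == 1 else "feature driven"
```

For example, series_kind("v1.3.0") gives "super stable" and
series_kind("v1.4.2") gives "feature driven".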
***KEY POINT*** The schedule for moving features and bug fixes to the
release branch is somewhat fluid. If George doesn't have time to move
feature X in his appointed week, the RMs shuffle him back further in
the schedule and take the next item off the list. This shuffling
allows for rapid response to dynamic resource availability at each
organization.
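As a rough illustration of that shuffling, the schedule can be thought of
as a queue of (owner, item) pairs. The sketch below is only one possible
reading: it assumes that a skipped item goes to the back of the queue,
which the proposal does not specify, and the names next_move and
is_available are hypothetical.

```python
from collections import deque


def next_move(schedule, is_available):
    """Return the first scheduled item whose owner is available.

    Items that are skipped are pushed to the back of the schedule
    (an assumption; the RMs may reorder differently in practice).
    Returns None if nobody is available this round.
    """
    for _ in range(len(schedule)):
        item = schedule.popleft()
        if is_available(item):
            return item
        schedule.append(item)  # shuffle this item back in the schedule
    return None


# Example: George is busy this week, so the next item is taken instead.
schedule = deque([("George", "feature X"), ("another owner", "bug fix Y")])
busy = {"George"}
picked = next_move(schedule, lambda item: item[0] not in busy)
```

Here picked is the "bug fix Y" entry, and George's feature X stays on
the schedule for a later week.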
***KEY POINT*** One of the goals of this proposal is to remove the
stigma of not getting into a given release, because the feature
branch will have somewhat frequent releases (probably one every 1-3
months; see below). Hence, we want to try to avoid the
tendency of OMPI developers to pack in a million features right before
a release, fearing that if feature X is not included in this release,
it'll sit on the SVN trunk for a year before release.
To ground all of the above discussion in a concrete proposal:
1. Ralph and I will be responsible for the v1.4 and v1.5 series.
2. Immediately start treating the v1.3 series as "super stable,"
(although the v1.3 series is grandfathered -- George and Brad
are still the RMs of v1.3 and are not bound by this proposal).
3. (more or less) Immediately create the v1.4 branch from the SVN
trunk. Start working toward v1.4.0.
4. Ralph and I will draw up a public list of desired features for
the next "super stable" series -- v1.5. This will include what
has already happened on the trunk (and will therefore be in
v1.4.0).
5. Ralph and I will also make up a public schedule of when each
feature will move from the trunk to the v1.4 branch. As
mentioned above, this schedule is meant to be a living document;
we fully expect that scheduled items will move around as
time/resources/features shift.
6. We'll periodically release 1.4.x versions with clear delineations
of what new features are available in each. A SWAG for release
frequency is one release every 1-3 months. It might be easier
to say that our initial intent is to release at least once a
quarter; the specific frequency will likely be determined on a
case-by-case basis.
7. Once all the v1.5 features are in the v1.4 branch (or if we run
out of time, or ...), rename it to v1.5, conduct a concerted
community effort to stabilize v1.5 to "super stable" status, and
release it.
8. Start the whole cycle again with v1.6/v1.7.
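The numbering that falls out of steps 7 and 8 can be sketched in Python
as follows. This is an illustration only: the function names are
hypothetical, and the sketch assumes the major number stays fixed from
one cycle to the next, as it does in the v1.4/v1.5 and v1.6/v1.7 examples.

```python
def _split(series):
    """Return (major, minor) from a series name such as 'v1.4'."""
    major, minor = series.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def stable_successor(feature_series):
    """Name the feature series takes when it is renamed to super
    stable, e.g. 'v1.4' becomes 'v1.5'."""
    major, minor = _split(feature_series)
    if minor % 2 != 0:
        raise ValueError("a feature series must have an even minor number")
    return f"v{major}.{minor + 1}"


def next_cycle(feature_series):
    """Feature/super stable pair for the following cycle,
    e.g. 'v1.4' is followed by ('v1.6', 'v1.7')."""
    major, minor = _split(feature_series)
    return f"v{major}.{minor + 2}", f"v{major}.{minor + 3}"
```

With these definitions, stable_successor("v1.4") is "v1.5" and
next_cycle("v1.4") is ("v1.6", "v1.7").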
Ralph and I feel that this proposal is well-suited to the development
style of the Open MPI community. We'll describe this in detail and
answer any questions on tomorrow's teleconference.
--
Jeff Squyres
Cisco Systems