Open MPI Development Mailing List Archives

Subject: [OMPI devel] RFC: New Open MPI release procedure
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-02-16 10:47:42


What: Proposal for a new release methodology for the Open MPI Project.

Why: We have [at least] 2 competing forces in Open MPI:
   - desire to release new features quickly. Fast is good.
   - desire to release based on production quality. Slow is good.

   The competition between these two forces has created both some
   tension in the Open MPI community and a Very Long release cycle
   for OMPI v1.3 (yes, it was our specific and deliberate choice to
   be feature driven -- but it was still verrrrry loooong).

How: Take some ideas from other well-established release paradigms,
such as:
   - Linux kernel "odd/even" version number release methodology
   - Red Hat/Fedora stable vs. feature releases
   - Agile development models

When: For all releases after the v1.3 series (i.e., this proposal does
not include any releases in the v1.3 series).

--> Ralph and I will talk through all the details and answer any
     questions on tomorrow's teleconference (Tue, 17 Feb 2009).

========================================================================

Details:

In v1.3, we let a lot of really good features sit in development for a
long, long time. Yes, we chose to do this and there were good reasons
for doing so, but the fact remains that some really good work was done
and stable for long periods of time without being generally available
to the users who wanted it. Even for users who are
willing to be out on the bleeding edge, trunk tarballs are just too
scary.

Given the two competing forces mentioned above (feature/fast releases
+ stable/slow releases), it seems like we really want two different --
but overlapping -- release mechanisms.

Taking inspiration from other well-established paradigms, Ralph and I
propose the following for all releases starting with v1.4.0:

- Have two concurrent release series:
   1. "Super stable": for production users who care about stability
      above all else. They're willing to wait long periods of time
      before updating to a new version of Open MPI.
   2. "Feature driven": for users who are willing to take a few chances
      to get new OMPI features -- but cannot endure the chaos of
      nightly trunk tarballs.

- The general idea is that a feature driven release is developed for a
   while in an RM-regulated yet somewhat agile development style. When
   specific criteria are met (e.g., feature complete, schedule driven,
   etc.), the feature release series is morphed into a super stable
   state and released. At this point, all development stops on that
   release series; only bug fixes are allowed.
- RMs therefore become responsible for *two* release series: a
   feature driven series and the corresponding super stable series that
   emerges from it.

***KEY POINT*** This "two release" methodology allows for the release
(and real-world testing) of new features in a much more timely fashion
than our current release methodology.

Here's a crude ASCII art representation of how branches will work
using this proposal in SVN:

           v1.3 series/super stable
             v1.3.0      v1.3.2                       v1.6.0
          /----|---|-------|-----------|---->         /-|---|---|->
trunk    /       v1.3.1              v1.3.1          /
------------------------------------------------------------------------>
           \ v1.4.0  v1.4.2  v1.4.4 ...                v1.5.0   v1.5.1
            \--|---|---|---|---|---|---|---|---|---------|--------|------>
                 v1.4.1  v1.4.3 ...                 now becomes
              v1.4/feature driven                v1.5/super stable

Here's how a typical release cycle works:

- Assume that a "super stable" version exists; a release series that
   has an odd minor number: v1.3, v1.5, v1.7, ...etc.
- For this example, let's assume that the super stable is v1.3.
- Only bug fixes go into the "super stable" series.

- Plans for the next "super stable" are drawn up (v1.5 in this
   example), including a list of goals, new features, a timeline, etc.

- Shortly after the first super stable release, a new feature release
   series is created with an even minor version number (e.g., v1.4,
   v1.6, v1.8, ...etc.).
- In this example, the feature release series will be v1.4.
- The v1.4 branch is taken to a point with at least some degree of
   stability and released as v1.4.0.

- Development on the SVN trunk continues.

- According to a public schedule (probably tied to our teleconference
   schedule), the RMs will approve the moving of features and bug
   fixes to the feature release.
   - Rather than submitting CMRs for the Gatekeeper to move, the
     "owner" of a particular feature/bug fix will be assigned a
     specific time to move the item to the feature branch.
   - For example, George will have from Tues-Fri to move his cool new
     feature X to the v1.4 branch.
   - Friday night, new 1.4 tarballs are cut and everyone's MTT tries
     them out.
   - Iterate for the next week or so to get the v1.4 branch stable.
   - Rinse, repeat.

- Once the feature series meets certain criteria (e.g., feature
   complete, timeline is met, etc.), it undergoes a period of intense
   testing and debugging to achieve "super stable" status. Once "super
   stable" has been reached, the branch is renamed to be "v1.5" and we
   start the whole cycle again (with v1.6/v1.7).
   - CMRs and Gatekeepers are used on the super stable series.
   - The older super stable series (v1.3) then becomes either
     unsupported or "less supported."
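
The odd/even numbering convention used in the cycle above can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the helper names (parse_series, series_kind,
stable_successor) are hypothetical and are not part of any Open MPI
tooling.

```python
def parse_series(series: str) -> tuple[int, int]:
    """Split a name such as 'v1.4' or 'v1.4.2' into (major, minor)."""
    major, minor = series.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def series_kind(series: str) -> str:
    """Odd minor number -> super stable; even minor -> feature driven."""
    _, minor = parse_series(series)
    return "super stable" if minor % 2 else "feature driven"


def stable_successor(series: str) -> str:
    """Name the super stable series that a feature series is renamed to."""
    major, minor = parse_series(series)
    if minor % 2:
        raise ValueError(f"{series} is already a super stable series")
    return f"v{major}.{minor + 1}"
```

For example, series_kind("v1.3") classifies v1.3 as super stable, and
stable_successor("v1.4") names v1.5 as the series that v1.4 becomes.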

***KEY POINT*** The schedule for moving features and bug fixes to
the release branch is somewhat fluid. If George doesn't have time to
move feature X in his appointed week, the RMs shuffle him back further
in the schedule and take the next item off the list. This shuffling
allows for rapid response to dynamic resource availability at each
organization.
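
One way to picture the shuffling is as a simple queue. The sketch
below is a hypothetical illustration, not a description of how the RMs
will actually track the schedule: the function name next_item_to_move
and the is_ready callback are assumptions, and sending an unready item
to the very back of the queue is just one possible reading of
"further back in the schedule."

```python
from collections import deque
from typing import Callable, Optional


def next_item_to_move(schedule: deque,
                      is_ready: Callable[[str], bool]) -> Optional[str]:
    """Return the first scheduled item whose owner is ready this week.

    Items whose owners are not ready are moved to the back of the
    schedule (an assumption made for this sketch). Returns None if no
    owner is ready, leaving the schedule in its original order.
    """
    for _ in range(len(schedule)):
        item = schedule.popleft()
        if is_ready(item):
            return item
        schedule.append(item)  # owner is busy: shuffle the item back
    return None


# Hypothetical schedule: the owner of feature X has no time this week.
schedule = deque(["feature X", "feature Y", "feature Z"])
ready = {"feature Y", "feature Z"}
moved = next_item_to_move(schedule, lambda item: item in ready)
```

Here feature Y is selected for this week's slot, and feature X waits
behind feature Z for a later one.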

***KEY POINT*** One of the goals of this proposal is to remove the
stigma of not getting into a given release -- because the feature
branch will have somewhat frequent releases (probably somewhere
between 1-3 months; see below). Hence, we want to try to avoid the
tendency of OMPI developers to pack in a million features right before
a release, fearing that if feature X is not included in this release,
it'll sit on the SVN trunk for a year before release.

To ground all of the above discussion in a concrete proposal:

   1. Ralph and I will be responsible for the v1.4 and v1.5 series.
   2. Immediately start treating the v1.3 series as "super stable"
      (although the v1.3 series is grandfathered -- George and Brad
      are still the RMs of v1.3 and are not bound by this proposal).
   3. (more or less) Immediately create the v1.4 branch from the SVN
      trunk. Start working toward v1.4.0.
   4. Ralph and I will draw up a public list of desired features for
      the next "super stable" series -- v1.5. This will include what
      has already happened on the trunk (and will therefore be in
      v1.4.0).
   5. Ralph and I will also make up a public schedule of when each
      feature will move from the trunk to the v1.4 branch. As
      mentioned above, this schedule is meant to be a living document;
      we fully expect that scheduled items will move around as
      time/resources/features shift.
   6. We'll periodically release 1.4.x versions with clear delineations
      of what new features are available in each. A SWAG for release
      frequency will be a release every 1-3 months. It might be easier
      to say that our initial intent is to release no less than once a
      quarter; specific frequency will likely be determined on a
      case-by-case basis.
   7. Once all the v1.5 features are in the v1.4 branch (or if we run
      out of time, or ...), rename it to v1.5, conduct a concerted
      community effort to stabilize v1.5 to "super stable" status, and
      release it.
   8. Start the whole cycle again with v1.6/v1.7.

Ralph and I feel that this proposal is well-suited to the development
style of the Open MPI community. We'll describe this in detail and
answer any questions on tomorrow's teleconference.

-- 
Jeff Squyres
Cisco Systems