Open MPI Development Mailing List Archives

Subject: [OMPI devel] RFC: Pineapple Runtime Interposition Project
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2012-06-15 14:55:46


What: A Runtime Interposition Project - Codename Pineapple

Why: Define a clear API and semantics for the runtime requirements of the OMPI layer.

When:
 - Fri June 22, 2012 - Work completed
 - Tue June 26, 2012 - Discuss on teleconf
 - Thu June 28, 2012 - Commit to trunk

Where: Trunk (development BitBucket branch below)
  https://bitbucket.org/jjhursey/ompi-pineapple

Attached:
  PDF of slides presented on the June 12, 2012 teleconf. Note that the
timeline above was adjusted slightly (the work completed date moved
earlier).

Description: Short Version
--------------------------
Define, in an 'rte.h', the interfaces and semantics that the OMPI
layer requires of a runtime environment. Currently this interface
matches the subset of ORTE functionality that is used by the OMPI
layer. Runtime symbols (e.g., orte_ess.proc_get_locality) are isolated
to a framework inside this project to provide linker-level protection
against accidental breakage of the pineapple interposition layer.
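
To make the isolation idea concrete, here is a minimal single-file sketch. It is not the code in the BitBucket branch: every name in it (example_rte_*, the stub_* backend, the locality values) is hypothetical, and the backend is a stand-in rather than ORTE. It uses C internal linkage ('static') to approximate, within one translation unit, the linker-level protection that the project gets from keeping runtime symbols inside a framework.

```c
#include <assert.h>
#include <stddef.h>

/* ---- What an 'rte.h'-style header might declare (all names hypothetical) ---- */

/* Hypothetical locality values; these are NOT the real ORTE constants. */
typedef enum {
    EXAMPLE_RTE_PROC_ON_NODE  = 1,
    EXAMPLE_RTE_PROC_OFF_NODE = 2
} example_rte_locality_t;

int example_rte_init(void);
example_rte_locality_t example_rte_proc_get_locality(int peer_rank);
int example_rte_finalize(void);

/* ---- Backend: private to this translation unit ---- */

typedef struct {
    int (*init)(void);
    example_rte_locality_t (*proc_get_locality)(int peer_rank);
    int (*finalize)(void);
} example_rte_backend_t;

static int backend_initialized = 0;

static int stub_init(void)
{
    backend_initialized = 1;
    return 0;
}

/* Toy rule for the stand-in backend: even ranks share our node. */
static example_rte_locality_t stub_proc_get_locality(int peer_rank)
{
    return (peer_rank % 2 == 0) ? EXAMPLE_RTE_PROC_ON_NODE
                                : EXAMPLE_RTE_PROC_OFF_NODE;
}

static int stub_finalize(void)
{
    backend_initialized = 0;
    return 0;
}

/* 'static' gives the backend internal linkage, so code outside this
 * file cannot link against it directly and must use the wrappers. */
static const example_rte_backend_t backend = {
    stub_init, stub_proc_get_locality, stub_finalize
};

/* ---- Public wrappers: the only symbols the upper layer can see ---- */

int example_rte_init(void)
{
    return backend.init();
}

example_rte_locality_t example_rte_proc_get_locality(int peer_rank)
{
    assert(backend_initialized);  /* init must have been called first */
    assert(peer_rank >= 0);
    return backend.proc_get_locality(peer_rank);
}

int example_rte_finalize(void)
{
    return backend.finalize();
}
```

Code in the upper layer would call only example_rte_init(), example_rte_proc_get_locality(), and example_rte_finalize(). An attempt to reference stub_proc_get_locality() from another file would fail at link time, which is the kind of accidental breakage the interposition layer is meant to catch early.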

The interposition project provides researchers working on side
projects above and below the 'rte.h' interface a single location in
the code base to watch for interface and semantic changes that they
need to be concerned about. Researchers working above the pineapple
layer might explore something other than (or in addition to) OMPI
(e.g., Extended OMPI, UPC+OMPI). Researchers working below the
pineapple layer might explore something other than (or in addition to)
ORTE under OMPI (e.g., specialized runtimes for specific
environments).

Description: Other notes
------------------------
The pineapple interface provides OMPI developers with a runtime API to
program against without requiring detailed knowledge of the layout of
ORTE and its frameworks. In some places in OMPI a single source file
needs to include more than five (up to 12 in one place) different
header files to get all of the necessary symbols. Developers must not
only know where these headers are, but must also understand the
differences between the various ORTE frameworks in order to use ORTE.
Developers must also be aware that certain APIs and data structure
fields are not available to the MPI process and so should not be used.
The pineapple
project provides an API representing the small subset of ORTE that is
used by OMPI. With this API a developer only needs to look at a single
location in the code base to understand what is provided by the
runtime for use in the OMPI layer.

A similar statement could be made for runtime developers trying to
figure out what the OMPI layer requires from a runtime
environment. Currently they need a deep understanding of the behavior
of ORTE to understand the semantics of various calls to ORTE from the
OMPI layer. Then they must develop a custom patch for the OMPI layer
that extracts the ORTE symbols, and replaces them with their own
symbols. This process is messy, error-prone, and tedious, to say the
least. Having a single set of interfaces and semantics will allow such
developers to focus their efforts on supporting the API defined by the
Open MPI community, and not necessarily the evolution of the ORTE or
OMPI project internals. This is advantageous when porting Open MPI to
an environment with a full-featured runtime already running on the
machine, and for researchers exploring radical runtime designs for
future systems. The pineapple API makes it easier for such projects to
develop alongside the mainline Open MPI trunk.
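
As a rough illustration of the runtime developer's side, the sketch below shows two interchangeable backends sitting under one small interface, with the upper layer choosing between them by name. It is only a sketch under stated assumptions: the demo_rte_* names, the backend names "orte-like-stub" and "native-stub", and the three-function interface are all invented for this example and are not taken from the pineapple branch or from ORTE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface that every backend must fill in. */
typedef struct {
    const char *name;
    int (*init)(void);
    int (*my_rank)(void);
} demo_rte_backend_t;

/* Stand-in for an ORTE-backed implementation (does no real work). */
static int orte_like_init(void)    { return 0; }
static int orte_like_my_rank(void) { return 0; }

/* Stand-in for a machine's native runtime (does no real work). */
static int native_init(void)    { return 0; }
static int native_my_rank(void) { return 7; }

static const demo_rte_backend_t backends[] = {
    { "orte-like-stub", orte_like_init, orte_like_my_rank },
    { "native-stub",    native_init,    native_my_rank    }
};

static const demo_rte_backend_t *selected = NULL;

/* Pick a backend by name. Returns 0 on success, -1 if the name is
 * unknown; a failed lookup leaves any previous selection in place. */
int demo_rte_select(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(backends) / sizeof(backends[0]); ++i) {
        if (strcmp(backends[i].name, name) == 0) {
            selected = &backends[i];
            return 0;
        }
    }
    return -1;
}

/* The upper layer calls only these; it never names a backend symbol. */
int demo_rte_init(void)
{
    return (selected != NULL) ? selected->init() : -1;
}

int demo_rte_my_rank(void)
{
    return (selected != NULL) ? selected->my_rank() : -1;
}

const char *demo_rte_selected_name(void)
{
    return (selected != NULL) ? selected->name : NULL;
}
```

With this shape, supporting a new runtime means adding one entry to the backend table rather than patching every call site in the upper layer, which is the maintenance burden described above.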

FAQ:
----
(1) Why is this a separate project and not a framework of OMPI, or a
framework of ORTE?

After much deliberation among the developers, from a software
engineering perspective, making the pineapple rte.h interface a
separate project was the most flexible solution. So neither the OMPI
layer nor the ORTE layer 'owns' the interface, but it is 'owned' by the
Open MPI project primarily to support the interaction between these
two layers.

Consider that if we decided to place the interface in the OMPI layer
as a framework then we would be able to place something other than (or
in addition to) ORTE underneath OMPI, but we would be limited in our
ability to place something other than (or in addition to) OMPI over
ORTE. Alternatively, if we decided to place the rte.h interface in the
ORTE layer then we would be able to place something other than (or in
addition to) OMPI over ORTE, but we would be limited in our ability to
place something other than (or in addition to) ORTE under OMPI.

Defining the interposition layer as a separate project between these
two layers allows maximal flexibility for the project and researchers
working on side branches.

(2) What if another project outside of Open MPI needs interface
changes to the pineapple 'rte.h'?

The rule of thumb is that 'The OMPI/ORTE/OPAL stack is king!'. This
means that the pineapple project should always err on the side of
supporting the OMPI/ORTE/OPAL stack, as that is the flagship product
of the Open MPI project. Interface suggestions are always welcome and
the rte.h may be extended or modified in the future as a result of
those suggestions. However, if a suggested change negatively impacts
the OMPI/ORTE/OPAL stack then it is unlikely to be accepted upstream
by the Open MPI community.

-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey