There has been some discussion on this list about connecting multiple
applications so that their processes can inter-communicate and/or
inter-operate. I thought it might be of help if I briefly described the
mechanisms within Open MPI for doing this - in doing so, I will not
limit myself to the MPI interface, but will discuss the more general
capabilities of the Open MPI/OpenRTE system.
Note: I apologize if I confuse people by using "OpenRTE" explicitly
here. However, the OpenRTE layer underneath Open MPI is where this
functionality is actually implemented, and - in one option below - it
provides some functionality that lies outside the MPI standard but
might be of interest to the community.
Let me start by describing the design for Open MPI/OpenRTE, and then
I'll briefly explain what part of that is in the current release.
There are three ways for interconnecting applications with Open
1. Dynamic process spawning using the MPI "comm_spawn" methodology.
Open MPI includes support for dynamic process spawning as defined in
the MPI standard. For the purposes of this discussion, note that this
requires the application developer to incorporate the path name of the
application to be spawned into the source code. Thus, connecting to
another application - or another (differently named) version of the
application - requires a change to the source code.
2. Run-time connection using the "connect" command-line option. The
benefit of this approach is that you don't have to directly code the
applications that are being connected - i.e., you don't have to put
anything in your source code that stipulates the precise application to
which you want to be connected. Instead, this option simply connects
all the processes from one instance of "mpirun" to another. Of course,
your code still has to know what to do with the communication channel.
As a (probably bizarre) example, consider the case where I have built a
model that uses a utility code to generate some information (e.g., a
mesh). The utility code does this whenever my model transmits a set of
parameters to all the processes in the utility code, and the utility
code communicates its output back via an MPI transmission. Using this
option for connecting applications, I can experiment with different
versions of my utility code by simply: (a) executing my model via an
mpirun command, and then (b) executing a version of the utility with
another mpirun, and connecting it to the model via the "connect'
option. No change to source code (e.g., to embed the name of the
alternative version) is required, nor do I have to recompile the model
- everything is handled at run-time.
3. Direct synchronization using the OpenRTE General Purpose Registry
(GPR). The prior two options only provide a means for exchanging
communication connection information between processes of different
applications. There are times, though, when multi-application
integration requires more - for example, two applications might need to
synchronize their computations so that one knows when the other has
completed its work. One such case could be in climate models, where an
atmospheric propagation model might want to "pause" until a sea-ice
model has completed the latest epoch calculation, and then use the
output of that model as input to its own computation of the next epoch.
This could be accomplished via option 1 (dynamic spawning), coupled
with the transmission of messages to coordinate action. However, we
have built another option into the system that makes this a little more
transparent and (perhaps) easier to accomplish. In addition, it
automatically supports asynchronous operations so that applications can
use event-driven logic to guide their operations.
This mode requires that the developer(s) of the applications do a
little planning as they must agree on the definition of registry
locations for synchronizing information (for more on the registry's
data representation scheme, see the OpenRTE design document at
http://www.open-rte.org/documentation/design.php). With that done, each
application would "subscribe" to the location where other applications
will be writing synchronizing flags. In the case of the climate model,
for example, the sea-ice model could write the time stamp for the
completed epoch in a location. The climate model would subscribe to
that location and request notification whenever the value in that
location is changed - the model might also stipulate that another
registry location containing the name of the sea-ice model's output
file for that epoch be returned to it whenever the subscription fires.
Clearly, given the exchange of communication ports in options 1 or 2,
an application developer could implement this degree of synchronization
themselves. The advantage of this approach, however, is that it
relieves the application programmer from having to implement such
protocols themselves in every application - the complexities of
identifying trigger events, notifying other applications, etc. is all
handled for them "under the covers".
My guess is that we'll find more ways to take advantage of this
capability as people begin to experiment with it.
The current release supports options 1 and 3 at this time, although the
interface for option 3 has not yet been exposed through the MPI layer.
Users desiring to use and/or experiment with direct synchronization
will need to configure Open MPI with the "--with-devel-headers" option
so that the OpenRTE include files will be copied to the install
directory tree. Once this has been done, you will have full access to
the GPR's API, which is described in the gpr.h header. Further
documentation on option 3 will be released in upcoming weeks.
I hope that helps answer some of your questions - and also that it
stimulates some thought on how you might explore these capabilities as
they are released!