Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Scheduled merge of ORTE devel branch to trunk
From: Doug Tody (dtody_at_[hidden])
Date: 2008-02-13 12:32:30


Hi Ralph -

Eliminating the dependence of OMPI on the GPR is in some ways
actually a plus, as it should make it much easier to enhance the GPR
as an optional advanced capability. In general, it would be great
if OMPI/ORTE could make it easier to support this sort of extension
mechanism, for example by evolving the framework mechanism into a general
plugin mechanism that supports dynamic components as well as statically
compiled-in ones. This is probably what you meant by dynamic binary
modules below.

> That said, it would be relatively simple to add an extension that provided a
> level of data storage that user-level programs could access. It would not
> provide any subscription or trigger capabilities, however - we need to leave
> those out of the system to avoid reintroducing the event-driven problems
> again. But if you just wanted to store and retrieve data for sharing it
> across processes, that could be provided with minimal effort or impact.

Yes, this is what I had in mind. I do not understand the problem with
event-driven capabilities, however: so long as these are used only by
some applications and not by OMPI itself, they should not compromise
OMPI. Even given a storage-only GPR, it should be possible for an
application to use the RML to accomplish much the same thing. Also,
whether there are problems (such as deadlock) with asynchronous,
event-driven interactions is largely a matter of the interaction
patterns employed, and can be managed by careful design of the
higher-level applications and their interactions.
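
As a rough illustration of that point, the sketch below (Python, for
brevity) pairs a store that only supports put/get with explicit messages
sent by the writer, so readers learn about updates without the store
offering subscriptions or triggers. DataStore, Mailbox, publish and
poll_updates are all hypothetical names; Mailbox is only a stand-in for a
messaging layer and does not reflect the actual RML interface.

```python
from collections import deque


class DataStore:
    """Storage-only store: put and get, no subscriptions or triggers."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


class Mailbox:
    """Hypothetical stand-in for a point-to-point messaging layer."""

    def __init__(self):
        self._queue = deque()

    def send(self, msg):
        self._queue.append(msg)

    def recv(self):
        # Non-blocking receive: returns None when nothing is pending.
        return self._queue.popleft() if self._queue else None


def publish(store, mailboxes, key, value):
    """Writer side: store the value, then tell interested peers by message."""
    store.put(key, value)
    for mailbox in mailboxes:
        mailbox.send(key)


def poll_updates(store, mailbox):
    """Reader side: drain pending notices at a point the reader chooses."""
    updates = {}
    while (key := mailbox.recv()) is not None:
        updates[key] = store.get(key)
    return updates
```

Because the reader decides when to call poll_updates, no callback ever
fires inside the store, and the notification pattern stays entirely in
the application.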

> Another alternative: there is a separate "ORTE" project in Europe that is
> building extensions to our ORTE - they are tracking these code changes,

Sounds interesting - how would one find out more about this?

        - Doug

On Tue, 12 Feb 2008, Ralph Castain wrote:

> Hi Doug
>
> The changes are rather far-reaching. We essentially revamped the entire RTE
> to switch from an event-driven architecture to one based on sequential
> logic. This had large benefits, but the GPR was the casualty. Remember, the
> aim for the past year has been to create a dedicated "lean, mean OMPI
> machine"!
>
> That said, it would be relatively simple to add an extension that provided a
> level of data storage that user-level programs could access. It would not
> provide any subscription or trigger capabilities, however - we need to leave
> those out of the system to avoid reintroducing the event-driven problems
> again. But if you just wanted to store and retrieve data for sharing it
> across processes, that could be provided with minimal effort or impact.
> Probably best done as a compile-time optional module, though, to avoid
> adding to the memory footprint for everyone.
>
> Another alternative: there is a separate "ORTE" project in Europe that is
> building extensions to our ORTE - they are tracking these code changes, but
> adding "bolt-ons" such as a GPR-like central data store, hooks for workflow
> management and the grid, multi-cluster operations, etc. I'm working with
> them on those efforts - if there is interest in such capabilities, I can
> probably look into architecting things so that some of the "bolt-ons" could
> be dynamically picked up by OMPI as binary modules or something.
>
> For now, though, there will be no GPR-like storage in the new system.
> Ralph
>
>
>
> On 2/12/08 1:43 PM, "Doug Tody" <dtody_at_[hidden]> wrote:
>
> > Hi Ralph -
> >
> > How extensive are the changes involved in removing the GPR? How hard would
> > it be for someone to maintain an enhanced version of this as an addon or
> > compile-time optional module? Thanks.
> >
> > - Doug
> >
> >
> > On Mon, 11 Feb 2008, Ralph Castain wrote:
> >
> >> Hello all
> >>
> >> Per last week's telecon, we planned the merge of the latest ORTE devel
> >> branch to the OMPI trunk for after Sun had committed its C++ changes. That
> >> happened over the weekend.
> >>
> >> Therefore, based on the requests at the telecon, I will be merging the
> >> current ORTE devel branch to the trunk on Wed 2/13. I'll make the commit
> >> around 4:30pm Eastern time - will send out warning shortly before the commit
> >> to let you know it is coming. I'll advise of any delays.
> >>
> >> This will be a snapshot of that devel branch - it will include the upgraded
> >> launch system, remove the GPR, add the new tool communication library, allow
> >> arbitrary mpiruns to interconnect, support the revamped hostfile and
> >> dash-host behaviors per the wiki, etc.
> >>
> >> However, it is incomplete and contains some known flaws. For example,
> >> TotalView support has not been enabled yet. Comm_spawn, which is currently
> >> broken on the OMPI trunk, is fixed - but singleton comm_spawn remains
> >> broken. I am in the process of establishing support for direct and
> >> standalone launch capabilities, but those won't be in the merge. I have
> >> updated all of the launchers, but can only certify the SLURM, TM, and RSH
> >> ones to work - the Xgrid launcher is known to not compile, so if you have
> >> Xgrid on your Mac, you need to tell the build system to not build that
> >> component.
> >>
> >> This will give you a chance to look over the new arch, though, and I
> >> understand that people would like to begin having a chance to test and
> >> review the revised code. Hopefully, you will find most of the bugs to be
> >> minor.
> >>
> >> Please advise of any concerns about this merge. The schedule is totally
> >> driven by the requests of the MPI team members (delaying the merge has no
> >> impact on ORTE development), so requests to shift the schedule should be
> >> discussed amongst the community.
> >>
> >> Thanks
> >> Ralph
> >>
> >>
> >> _______________________________________________
> >> devel mailing list
> >> devel_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>