Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Scheduled merge of ORTE devel branch to trunk
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-02-12 22:36:43


Hi Doug

The changes are rather far-reaching. We essentially revamped the entire RTE
to switch from an event-driven architecture to one based on sequential
logic. This had large benefits, but the GPR was the casualty. Remember, the
aim for the past year has been to create a dedicated "lean, mean OMPI
machine"!

That said, it would be relatively simple to add an extension that provided a
level of data storage that user-level programs could access. It would not
provide any subscription or trigger capabilities, however - we need to leave
those out of the system to avoid reintroducing the event-driven problems.
But if you just wanted to store and retrieve data to share across
processes, that could be provided with minimal effort or impact.
Probably best done as a compile-time optional module, though, to avoid
adding to the memory footprint for everyone.
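
As a rough illustration of the kind of extension described above, here is a
minimal sketch of a store/retrieve-only data store that can be compiled out.
Everything in it is hypothetical: the demo_store_put and demo_store_get names,
the DEMO_ENABLE_DATA_STORE macro, and the fixed-size table are assumptions made
for this example and are not part of ORTE or Open MPI. It keeps data in a
single process only; how the data would be shared across processes is left out.
It deliberately has no subscription or trigger interface.

```c
/* Hypothetical sketch only: none of these names exist in ORTE or Open MPI. */
#include <stdlib.h>
#include <string.h>

/* Assumed build switch: compile with -DDEMO_ENABLE_DATA_STORE=0 to drop
 * the store (and its memory footprint) entirely. */
#ifndef DEMO_ENABLE_DATA_STORE
#define DEMO_ENABLE_DATA_STORE 1
#endif

#if DEMO_ENABLE_DATA_STORE

#define DEMO_STORE_MAX 64   /* arbitrary capacity chosen for the sketch */

typedef struct {
    char  *key;    /* owned copy of the key */
    void  *data;   /* owned copy of the value bytes */
    size_t len;    /* number of value bytes */
} demo_entry_t;

static demo_entry_t demo_store[DEMO_STORE_MAX];
static size_t demo_store_count = 0;

/* Linear search; returns NULL when the key is absent. */
static demo_entry_t *demo_find(const char *key)
{
    size_t i;
    for (i = 0; i < demo_store_count; i++) {
        if (strcmp(demo_store[i].key, key) == 0) {
            return &demo_store[i];
        }
    }
    return NULL;
}

/* Store a copy of data under key, replacing any previous value.
 * Returns 0 on success, -1 on error. */
int demo_store_put(const char *key, const void *data, size_t len)
{
    demo_entry_t *e;
    void *copy;
    char *kcopy;

    if (key == NULL || (data == NULL && len > 0)) return -1;

    copy = malloc(len > 0 ? len : 1);
    if (copy == NULL) return -1;
    if (len > 0) memcpy(copy, data, len);

    e = demo_find(key);
    if (e != NULL) {              /* overwrite existing entry */
        free(e->data);
        e->data = copy;
        e->len = len;
        return 0;
    }

    if (demo_store_count == DEMO_STORE_MAX) {
        free(copy);
        return -1;                /* table full */
    }
    kcopy = malloc(strlen(key) + 1);
    if (kcopy == NULL) {
        free(copy);
        return -1;
    }
    strcpy(kcopy, key);

    e = &demo_store[demo_store_count++];
    e->key = kcopy;
    e->data = copy;
    e->len = len;
    return 0;
}

/* Copy the value stored under key into buf (capacity buflen).
 * Returns 0 on success, -1 if the key is missing or buf is too small. */
int demo_store_get(const char *key, void *buf, size_t buflen, size_t *len_out)
{
    demo_entry_t *e = (key != NULL) ? demo_find(key) : NULL;

    if (e == NULL || e->len > buflen) return -1;
    if (e->len > 0) {
        if (buf == NULL) return -1;
        memcpy(buf, e->data, e->len);
    }
    if (len_out != NULL) *len_out = e->len;
    return 0;
}

#else  /* store compiled out: calls fail cleanly and use no storage */

int demo_store_put(const char *key, const void *data, size_t len)
{
    (void)key; (void)data; (void)len;
    return -1;
}

int demo_store_get(const char *key, void *buf, size_t buflen, size_t *len_out)
{
    (void)key; (void)buf; (void)buflen; (void)len_out;
    return -1;
}

#endif
```

A caller would put a value under a string key and later get it back by the
same key; there is no way to register a callback that fires when a key changes.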

Another alternative: there is a separate "ORTE" project in Europe that is
building extensions to our ORTE - they are tracking these code changes, but
adding "bolt-ons" such as a GPR-like central data store, hooks for workflow
management and the grid, multi-cluster operations, etc. I'm working with
them on those efforts - if there is interest in such capabilities, I can
probably look into architecting things so that some of the "bolt-ons" could
be dynamically picked up by OMPI as binary modules or something.

For now, though, there will be no GPR-like storage in the new system.
Ralph

On 2/12/08 1:43 PM, "Doug Tody" <dtody_at_[hidden]> wrote:

> Hi Ralph -
>
> How extensive are the changes involved in removing the GPR? How hard would
> it be for someone to maintain an enhanced version of this as an add-on or
> compile-time optional module? Thanks.
>
> - Doug
>
>
> On Mon, 11 Feb 2008, Ralph Castain wrote:
>
>> Hello all
>>
>> Per last week's telecon, we planned the merge of the latest ORTE devel
>> branch to the OMPI trunk for after Sun had committed its C++ changes. That
>> happened over the weekend.
>>
>> Therefore, based on the requests at the telecon, I will be merging the
>> current ORTE devel branch to the trunk on Wed 2/13. I'll make the commit
>> around 4:30pm Eastern time - will send out warning shortly before the commit
>> to let you know it is coming. I'll advise of any delays.
>>
>> This will be a snapshot of that devel branch - it will include the upgraded
>> launch system, remove the GPR, add the new tool communication library, allow
>> arbitrary mpiruns to interconnect, support the revamped hostfile and
>> dash-host behaviors per the wiki, etc.
>>
>> However, it is incomplete and contains some known flaws. For example,
>> TotalView support has not been enabled yet. Comm_spawn, which is currently
>> broken on the OMPI trunk, is fixed - but singleton comm_spawn remains
>> broken. I am in the process of establishing support for direct and
>> standalone launch capabilities, but those won't be in the merge. I have
>> updated all of the launchers, but can only certify the SLURM, TM, and RSH
>> ones to work - the Xgrid launcher is known to not compile, so if you have
>> Xgrid on your Mac, you need to tell the build system to not build that
>> component.
>>
>> This will give you a chance to look over the new arch, though, and I
>> understand that people would like to begin having a chance to test and
>> review the revised code. Hopefully, you will find most of the bugs to be
>> minor.
>>
>> Please advise of any concerns about this merge. The schedule is totally
>> driven by the requests of the MPI team members (delaying the merge has no
>> impact on ORTE development), so requests to shift the schedule should be
>> discussed amongst the community.
>>
>> Thanks
>> Ralph
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>