Open MPI Development Mailing List Archives


From: George Bosilca (bosilca_at_[hidden])
Date: 2007-07-12 16:12:20

We have the ODLS framework, which is supposed to launch local
processes. Can we use it to spawn the local daemons? This
will solve the Windows problem, and will give us a more consistent


On Jul 12, 2007, at 4:02 PM, Ralph H Castain wrote:

> The commit has been made - it is r15390.
>
> This commit restored the ability to execute singletons and singleton
> comm_spawn, both in single node and multi-node environments. It also
> includes a first step in our plan to reduce the ORTE system to the
> minimum functionality required to support Open MPI (more on that
> separately).
>
> Short description of major changes:
>
> 1. singletons now fork/exec a local daemon to manage their operations.
>    This was required not only to resolve the current problem, but also
>    to deal with threading issues in the progress engine down the road.
>
> 2. the orte daemon code now resides in libopen-rte. This was needed so
>    that mpirun could fully provide all daemon services since we no
>    longer allow multiple daemons to share a node (so an orted could not
>    co-reside with mpirun).
>
> 3. daemons no longer use the orte triggering system during startup.
>    Instead, they directly call back to their parent pls component to
>    report ready to operate.
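
To make items 1 and 3 concrete, here is a minimal sketch of the general fork/exec-then-report-ready pattern on a POSIX system. It is not the ORTE code from r15390: the helper names spawn_local_daemon and wait_for_ready are hypothetical, and the "ready" report is modeled as a write on a pipe connected to the child's stdout, which is only an assumption standing in for the callback to the parent pls component.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not an ORTE function): fork/exec argv[0] as a
 * "local daemon" and return its pid. The child's stdout is connected
 * to a pipe, and *ready_fd receives the read end so the parent can
 * wait for a readiness message. Returns -1 on failure. */
pid_t spawn_local_daemon(char *const argv[], int *ready_fd)
{
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: route stdout into the pipe, then become the daemon. */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO) {
            if (dup2(fds[1], STDOUT_FILENO) < 0) {
                _exit(127);
            }
            close(fds[1]);
        }
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: keep only the read end of the pipe. */
    close(fds[1]);
    *ready_fd = fds[0];
    return pid;
}

/* Block until the daemon writes something or the pipe is closed.
 * Returns 1 if a message arrived, 0 on EOF (daemon exited or exec
 * failed before reporting), -1 on read error. */
int wait_for_ready(int ready_fd, char *buf, size_t len)
{
    ssize_t n = read(ready_fd, buf, len - 1);
    if (n < 0) {
        return -1;
    }
    buf[n] = '\0';
    return n > 0 ? 1 : 0;
}
```

A caller would pass the daemon's argv to spawn_local_daemon, block in wait_for_ready until the child reports in, and later reap it with waitpid. Because the sketch relies on fork, it has no direct Windows equivalent.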
>
> I have modified all the pls components except xcpu and poe (don't
> understand either well enough to do it). Full functionality has been
> verified for rsh, SLURM, and TM systems. Compile has been verified for
> xgrid and gridengine, and hopefully those environments will work -
> though I could not verify that was true.
>
> Note that singletons will *not* operate in Windows environments at
> this time. The ability to fork/exec the local daemon would need to be
> added first, assuming Windows can support singletons (I honestly don't
> know).
>
> Please let me know of any problems.
> Ralph
>
> On 7/12/07 1:45 PM, "Ralph H Castain" <rhc_at_[hidden]> wrote:
>> Yo folks
>>
>> Several of us are stuck waiting for this commit to hit. Rather than
>> wasting the next several hours, I'm going to make the commit now.
>>
>> So please be advised: if you do an update after this commit hits, you
>> will need to autogen. You may want to wait until a convenient time
>> before doing the update.
>>
>> Thanks
>> Ralph
>>
>> On 7/12/07 7:53 AM, "Ralph H Castain" <rhc_at_[hidden]> wrote:
>>> Yo all
>>>
>>> I have a fairly significant change coming to the orte part of the
>>> code base that will require an autogen (sorry). I'll check it in
>>> late this afternoon (can't do it at night as it is on my office
>>> desktop).
>>>
>>> The commit will fix the singleton operations, including singleton
>>> comm_spawn. It also takes the first step towards removing
>>> event-driven operations, replacing them with more serial code (to be
>>> explained separately). As part of all this, I had to modify the
>>> various pls components. For those I could not compile, I made a
>>> first cut at them that should (hopefully) allow them to continue to
>>> operate.
>>>
>>> Any of you using TM: we discovered that the trunk is not working
>>> currently on that environment. We are investigating - it has nothing
>>> to do with this commit, but predates it.
>>>
>>> Just wanted to give you a heads-up. Please refrain from making
>>> changes to the orte codebase today, if you could - it would simplify
>>> the commit and ensure we don't lose your changes.
>>>
>>> Thanks
>>> Ralph
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
> _______________________________________________
> devel-core mailing list
> devel-core_at_[hidden]