Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Trunk returned to normal operations
From: Ralph H Castain (rhc_at_[hidden])
Date: 2008-02-28 08:51:59


On 2/27/08 11:23 PM, "George Bosilca" <bosilca_at_[hidden]> wrote:

> Ralph,
>
> I have 2 questions related to the merge.
> 1. How about orte_run_debugger? It looks like this function is now
> never called. Should we remove it?

Not yet. Jeff is working on a new debugger integration on a tmp branch. We
will merge that into the trunk when complete - we have to do some
modifications to reflect the revamped architecture.

NOTE: This means that TotalView is -not- currently supported on the trunk.

> 2. There was a nice feature in the old ORTE that I particularly liked.
> The ability to specify a default hostfile in the mca-params.conf file
> (rds_hostfile_path). As there is no rds anymore, do we still have this
> or some similar feature, or do we always have to specify the
> machinefile via the command line option?

Hmmm...well, I'm afraid we don't now, though there is no reason we couldn't
do it. The revised specification for hostfile operations changed the scope
of "hostfile" to apply at the app_context level. So if we have a hostfile
mca param, the issue becomes "which app_context does that apply to?".

I honestly have no opinion on this - if you would like to raise the
discussion with the community, I can implement whatever people decide. I can
see one option might be to say that any mca-specified hostfile would provide
a "global" pool of nodes, with any cmd-line specs then indicating which of
those hosts are to be used for that app_context (and no cmd-line spec
meaning draw from the "global" pool).
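
To make the "global pool" option concrete, here is a minimal sketch in
Python. It is not ORTE code. The function name resolve_hosts and its
arguments are hypothetical, and it assumes that when no mca-specified
hostfile exists, a cmd-line spec is simply used as given (the option
described above does not settle that case).

```python
from typing import List, Optional


def resolve_hosts(global_pool: Optional[List[str]],
                  app_hosts: Optional[List[str]]) -> List[str]:
    """Pick the hosts for one app_context (illustrative only).

    global_pool: hosts from a hypothetical mca-specified hostfile,
                 or None if no such hostfile is set.
    app_hosts:   hosts from that app_context's cmd-line spec,
                 or None if the app_context gave no spec.
    """
    if app_hosts is None:
        # No cmd-line spec: draw from the "global" pool (may be empty).
        return list(global_pool or [])

    if global_pool is None:
        # Assumption: with no global pool, use the cmd-line spec as is.
        return list(app_hosts)

    # A cmd-line spec may only select hosts that are in the global pool.
    pool = set(global_pool)
    unknown = [h for h in app_hosts if h not in pool]
    if unknown:
        raise ValueError(f"hosts not in global pool: {unknown}")
    return list(app_hosts)
```

Under this sketch, an app_context with no spec would receive the whole
pool, while one that names a host outside the pool would be rejected.
Whether rejection or some other behavior is wanted is exactly the kind
of question for the community to decide.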

I'll leave that for you folks to decide...just let me know what you want the
name to be, and how you want it to behave.

Thanks
Ralph

>
> Thanks,
> george.
>
> On Feb 27, 2008, at 9:47 PM, Ralph Castain wrote:
>
>> Hi folks
>>
>> Okay, the ORTE merge appears to have gone well and is now complete -
>> you are free to use the trunk.
>>
>> A few caveats:
>>
>> 1. obviously, you will need to autogen/configure once you update. I
>> -strongly- recommend you rm -rf your install directory first as you
>> will definitely be hit with stale libraries from this commit.
>>
>> 2. this is a "drop" from the ORTE devel effort. As such, it is -not-
>> complete. There are several known issues, particularly with
>> comm_spawn and singleton comm_spawn in certain environments and
>> scenarios. I have a "fix" already done and ready to be applied for
>> the comm_spawn problems, but I want to test it some more in the
>> morning before committing it to the trunk - and I didn't want to
>> delay this merge any longer.
>>
>> 3. we know that checkpoint/restart is currently broken. Josh and I
>> have discussed a couple of options for repairing it, and he will
>> look at it as soon as he has a chance. It isn't a big problem - just
>> need to decide which option he would prefer to pursue.
>>
>> The remaining ORTE scalability work should be moving into the trunk
>> over the next few weeks (I will be on vacation 3/7-14, so it will
>> likely take through March). We do not anticipate any API changes or
>> framework adds/deletes the rest of the way - there will be a few new
>> components added to existing frameworks, some revamp of the logic in
>> a few places, etc.
>>
>> I will try to cover all the changes in one or two notes over the
>> next few days to avoid carpal tunnel. Please feel free to ask
>> questions and I'll do my best to provide answers.
>>
>> Thanks again for the cooperation tonight...
>> Ralph
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel