Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r19600
From: Richard Graham (rlgraham_at_[hidden])
Date: 2008-09-22 13:55:37


What Ken put in is what is needed for the limited multi-cluster capabilities
we need, just one additional string. I don't think there is a need for any
discussion of such a small change.

Rich

On 9/22/08 1:32 PM, "Ralph Castain" <rhc_at_[hidden]> wrote:

> We really should discuss that as a group first - there is quite a bit
> of code required to actually support multi-clusters that has been
> removed.
>
> Our operational model that was agreed to quite a while ago is that
> mpirun can -only- extend over a single "cell". You can connect/accept
> multiple mpiruns that are sitting on different cells, but you cannot
> execute a single mpirun across multiple cells.
>
> Please keep this on your own development branch for now. Bringing it
> into the trunk will require discussion as this changes the operating
> model, and has significant code consequences when we look at abnormal
> terminations, comm_spawn, etc.
>
> Thanks
> Ralph
>
> On Sep 22, 2008, at 11:26 AM, Richard Graham wrote:
>
>> This check in was in error - I had not realized that the checkout was from
>> the 1.3 branch, so we will fix this, and put these into the trunk (1.4). We
>> are going to bring in some limited multi-cluster support - limited is the
>> operative word.
>>
>> Rich
>>
>>
>> On 9/22/08 12:50 PM, "Jeff Squyres" <jsquyres_at_[hidden]> wrote:
>>
>>> I notice that Ken Matney (the committer) is not on the devel list; I
>>> added him explicitly to the CC line.
>>>
>>> Ken: please see below.
>>>
>>>
>>> On Sep 22, 2008, at 12:46 PM, Ralph Castain wrote:
>>>
>>>> Whoa! We made a decision NOT to support multi-cluster apps in OMPI
>>>> over a year ago!
>>>>
>>>> Please remove this from 1.3 - we should discuss if/when this would
>>>> even be allowed in the trunk.
>>>>
>>>> Thanks
>>>> Ralph
>>>>
>>>> On Sep 22, 2008, at 10:35 AM, matney_at_[hidden] wrote:
>>>>
>>>>> Author: matney
>>>>> Date: 2008-09-22 12:35:54 EDT (Mon, 22 Sep 2008)
>>>>> New Revision: 19600
>>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/19600
>>>>>
>>>>> Log:
>>>>> Added member to orte_node_t to enable multi-cluster jobs in ALPS
>>>>> scheduled systems (like Cray XT).
>>>>>
>>>>> Text files modified:
>>>>> branches/v1.3/orte/runtime/orte_globals.h | 4 ++++
>>>>> 1 files changed, 4 insertions(+), 0 deletions(-)
>>>>>
>>>>> Modified: branches/v1.3/orte/runtime/orte_globals.h
>>>>> ===================================================================
>>>>> --- branches/v1.3/orte/runtime/orte_globals.h (original)
>>>>> +++ branches/v1.3/orte/runtime/orte_globals.h 2008-09-22 12:35:54 EDT (Mon, 22 Sep 2008)
>>>>> @@ -222,6 +222,10 @@
>>>>>      /** Username on this node, if specified */
>>>>>      char *username;
>>>>>      char *slot_list;
>>>>> +    /** Clustername (machine name of cluster) on which this node
>>>>> +        resides. ALPS scheduled systems need this to enable
>>>>> +        multi-cluster support. */
>>>>> +    char *clustername;
>>>>>  } orte_node_t;
>>>>>  ORTE_DECLSPEC OBJ_CLASS_DECLARATION(orte_node_t);
>>>>>
>>>>> _______________________________________________
>>>>> svn mailing list
>>>>> svn_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/svn
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel