Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r18115
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2008-04-10 10:35:32


Thanks for the fix. I can confirm that the trunk is working again with
the unity component.

I agree that I should support the 'tree' component, but I probably
won't be able to get to it for another couple of weeks.

Thanks again,
Josh

On Apr 9, 2008, at 10:51 PM, Ralph Castain wrote:

> Okay, the irony here is truly humorous. This took several hours to
> chase down.
>
> As you may recall, we had an earlier problem with the unity routed
> module where I gave you a couple of options for repairing it. Well, it
> turned out that the latest changes obviated the need for that
> hack...and so the hack caused the system to fail.
>
> So, having now removed the prior hack required to keep the module
> alive, you should find it happy again!
>
> BTW: it isn't that the unity module is such a pain in itself. The
> problem lies in our efforts to shift data movement to the daemon level
> for scalability, versus the inherent "everything happens directly
> between the apps" approach of the unity module. As we move more and
> more things to the daemon level, we are achieving the scalability we
> want - it just makes it harder to find a way to blend the conflicting
> approach in unity so it can keep running.
>
> I believe we have now reached a point, though, where it may be easier
> to keep that module alive. Everything we need to shift to the daemons
> has now been shifted, so I don't believe unity is going to present as
> much of a problem going forward.
>
> I still think it would be good for you to get C/R to work with
> non-unity routed modules for scalability reasons - unity is still
> inherently non-scalable. But hopefully it won't be as much of a
> roller-coaster for you as we go forward.
>
> Thanks for the patience
> Ralph
>
>
> On 4/9/08 5:15 PM, "Ralph Castain" <rhc_at_[hidden]> wrote:
>
>> Groan...yes, will look at it this evening and get it fixed as quickly
>> as I can.
>>
>> Sorry...like I said, unity is getting harder and harder to keep
>> alive. :-/
>>
>> Ralph
>>
>>
>> On 4/9/08 5:01 PM, "Josh Hursey" <jjhursey_at_[hidden]> wrote:
>>
>>> Ralph,
>>>
>>> It seems that the 'unity' component of the routed framework is
>>> broken as a result of this commit. :(
>>>
>>> Any chance you can take a look at this?
>>>
>>> Thanks,
>>> Josh
>>>
>>> On Apr 9, 2008, at 6:10 PM, rhc_at_[hidden] wrote:
>>>> Author: rhc
>>>> Date: 2008-04-09 18:10:53 EDT (Wed, 09 Apr 2008)
>>>> New Revision: 18115
>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/18115
>>>>
>>>> Log:
>>>> Fully implement the inbound binomial allgather for daemon-based
>>>> collectives. Supports both modex and barrier operations.
>>>>
>>>> Comm_spawn still uses the rank=0 method - shifting that algo to the
>>>> daemons is under study.
>>>>
>>>>
>>>> Removed:
>>>> trunk/orte/mca/grpcomm/base/grpcomm_base_barrier.c
>>>> trunk/orte/mca/grpcomm/exp/
>>>> Text files modified:
>>>> trunk/ompi/mca/pml/ob1/pml_ob1.c | 1
>>>> trunk/orte/mca/ess/hnp/ess_hnp_module.c | 2
>>>> trunk/orte/mca/grpcomm/base/Makefile.am | 1
>>>> trunk/orte/mca/grpcomm/base/base.h | 3
>>>> trunk/orte/mca/grpcomm/base/grpcomm_base_allgather.c | 253 -----------
>>>> trunk/orte/mca/grpcomm/basic/grpcomm_basic_component.c | 4
>>>> trunk/orte/mca/grpcomm/basic/grpcomm_basic_module.c | 832 ++++++++++++++++++++++++++++++++++-----
>>>> trunk/orte/mca/grpcomm/cnos/grpcomm_cnos_module.c | 8
>>>> trunk/orte/mca/grpcomm/grpcomm.h | 27 +
>>>> trunk/orte/mca/grpcomm/grpcomm_types.h | 8
>>>> trunk/orte/mca/odls/base/odls_base_close.c | 1
>>>> trunk/orte/mca/odls/base/odls_base_default_fns.c | 131 ++++-
>>>> trunk/orte/mca/odls/base/odls_base_open.c | 24 +
>>>> trunk/orte/mca/odls/base/odls_private.h | 16
>>>> trunk/orte/mca/plm/base/plm_base_launch_support.c | 7
>>>> trunk/orte/mca/rmaps/base/rmaps_base_map_job.c | 1
>>>> trunk/orte/mca/rmaps/base/rmaps_base_open.c | 4
>>>> trunk/orte/mca/rmaps/base/rmaps_base_support_fns.c | 186 +-------
>>>> trunk/orte/mca/rmaps/base/rmaps_private.h | 2
>>>> trunk/orte/mca/rmaps/rank_file/rmaps_rank_file.c | 2
>>>> trunk/orte/mca/rmaps/rmaps_types.h | 28 +
>>>> trunk/orte/mca/rmaps/round_robin/rmaps_rr.c | 8
>>>> trunk/orte/mca/rmaps/seq/rmaps_seq.c | 2
>>>> trunk/orte/mca/rml/rml_types.h | 36
>>>> trunk/orte/orted/orted_comm.c | 43 +-
>>>> trunk/orte/runtime/data_type_support/orte_dt_copy_fns.c | 2
>>>> trunk/orte/runtime/data_type_support/orte_dt_packing_fns.c | 4
>>>> trunk/orte/runtime/data_type_support/orte_dt_print_fns.c | 4
>>>> trunk/orte/runtime/data_type_support/orte_dt_unpacking_fns.c | 4
>>>> trunk/orte/runtime/orte_globals.c | 3
>>>> trunk/orte/runtime/orte_globals.h | 1
>>>> trunk/orte/runtime/orte_globals_class_instances.h | 2
>>>> 32 files changed, 1019 insertions(+), 631 deletions(-)
>>>>
>>>>
>>>> Diff not shown due to size (106446 bytes).
>>>> To see the diff, run the following command:
>>>>
>>>> svn diff -r 18114:18115 --no-diff-deleted
>>>>
>>>> _______________________________________________
>>>> svn mailing list
>>>> svn_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/svn