Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Howto pause BTL's sending at runtime
From: Christoph Konersmann (c_k_at_[hidden])
Date: 2010-01-08 04:14:15


Sorry, but the mail scanner somehow doesn't like the source code...
Changed now.

> Hi again,
>
> Maybe I should give more specific information with some code snippets...
>
> So far I have added
> #define ORTE_DAEMON_BTL_CTL_CMD (orte_daemon_cmd_flag_t) 26
> to odls_types.h to identify the command that triggers the BTL pause.
>
> In process_commands() of orted/orted_comm.c this flag is handled in two
> steps. First, it is broadcast to all orteds with xcast of the grpcomm
> framework. Second, it is forwarded to the local procs with
> orte_odls.deliver_message. So every process should get the trigger. Or
> is there another, possibly easier, way of raising the trigger?
>
> I extended mca_btl_base_module_t in btl/btl.h with a flag (btl_paused)
> that indicates whether the pause is set.
>
> I then added a line to the initial values in every BTL component so
> that btl_paused is false by default, e.g. in self/btl_self.c.
> Or did I forget something?
>
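As a rough illustration of the flag described above, here is a minimal, self-contained sketch. It does not use the real Open MPI headers: sketch_btl_module_t stands in for mca_btl_base_module_t, and the send function, the return codes and the counter are all hypothetical names made up for this example. The assumption is that the send path checks btl_paused and reports back to the caller instead of sending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for mca_btl_base_module_t; only the new flag
 * and a send counter are modelled here. */
typedef struct sketch_btl_module {
    bool   btl_paused;   /* false by default, as in the component initializers */
    size_t sends_done;   /* counts sends that were actually performed */
} sketch_btl_module_t;

/* Return codes invented for this sketch. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_PAUSED = -1 };

/* Static initializer in the style of a component's module definition. */
static sketch_btl_module_t sketch_btl_self = {
    .btl_paused = false,
    .sends_done = 0,
};

/* Send path: refuse to send while the module is paused so that the
 * caller can keep the fragment and retry later. */
static int sketch_btl_send(sketch_btl_module_t *btl)
{
    assert(btl != NULL);
    if (btl->btl_paused) {
        return SKETCH_ERR_PAUSED;
    }
    btl->sends_done++;
    return SKETCH_SUCCESS;
}
```

In this sketch a paused module leaves the decision to the caller (queue, retry or fail), which keeps the flag check itself trivial. Whether the real BTLs should reject, queue or block is a separate design question.
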
> So my problem now is: once every process gets the trigger in the ORTE
> project, how can I set btl->btl_paused to true in the OMPI project?
> ORTE does not (and I know it should not) have access to the OMPI
> components. Is there a way of implementing a libevent callback function
> in the BTL modules? Or is there another way? I already read the
> documentation on your wiki, but for me it's not really trivial as I'm
> relatively new to this.
>
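One generic way around this kind of layering restriction is a callback hook: the lower layer exposes a registration function, and the upper layer installs a callback that touches its own data. The sketch below shows only that pattern and makes no claim about how ORTE or OMPI actually do it. Every name in it (sketch_register_pause_cb, sketch_trigger_pause, sketch_set_paused and the types) is hypothetical, and no libevent call is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ---- lower layer (stand-in for ORTE): knows nothing about BTLs ---- */

typedef void (*sketch_pause_cb_t)(bool pause, void *cbdata);

static sketch_pause_cb_t sketch_pause_cb = NULL;
static void *sketch_pause_cbdata = NULL;

/* Called by the upper layer during its own initialization. */
static void sketch_register_pause_cb(sketch_pause_cb_t cb, void *cbdata)
{
    sketch_pause_cb = cb;
    sketch_pause_cbdata = cbdata;
}

/* Called by the lower layer when the pause command arrives.
 * Returns false if nobody registered a callback. */
static bool sketch_trigger_pause(bool pause)
{
    if (NULL == sketch_pause_cb) {
        return false;
    }
    sketch_pause_cb(pause, sketch_pause_cbdata);
    return true;
}

/* ---- upper layer (stand-in for OMPI): owns the BTL modules ---- */

typedef struct sketch_btl_module {
    bool btl_paused;
} sketch_btl_module_t;

typedef struct sketch_btl_list {
    sketch_btl_module_t *modules;
    size_t count;
} sketch_btl_list_t;

/* The callback sets or clears the flag on every module in the list. */
static void sketch_set_paused(bool pause, void *cbdata)
{
    sketch_btl_list_t *list = (sketch_btl_list_t *)cbdata;
    assert(list != NULL);
    for (size_t i = 0; i < list->count; i++) {
        list->modules[i].btl_paused = pause;
    }
}
```

The upper layer would call sketch_register_pause_cb(sketch_set_paused, &list) once at startup. After that the lower layer only ever calls sketch_trigger_pause() and never sees the module type, so the dependency still points in one direction only.
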
> One idea for reaching the OMPI project would be to use the ft_event
> framework. To that end I added another opal_crs_state_type_t value,
> OPAL_CRS_PAUSE, in crs/crs.h and tried to trigger the event in
> orted_comm.c with:

> /* Pass the new pause state to the ESS ft_event hook, if one is set. */
> if( NULL != orte_ess.ft_event ) {
>     if( ORTE_SUCCESS != (ret = orte_ess.ft_event(OPAL_CRS_PAUSE))) {
>         goto CLEANUP;
>     }
> }

> But orte_ess.ft_event is NULL, so the call is never executed...
>
> Any ideas? Any advice?
>
> The performance impact of a solution does not matter to me.
>
> Thanks, and please excuse me if I bother you with this.
>
> Christoph
>