Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] [PATCH 2/4] Trying to get the C/R code to compile again. (send_*_nb)
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2013-12-06 09:30:15


Since the blocking semantics are important for correctness of the prior
code, I would not just replace send_buffer with send_buffer_nb. This makes
the semantics incorrect, and will make things confusing later when you try
to sort out prior calls to send_buffer_nb with those that you replaced.

As an alternative I would suggest that you "#ifdef 0" out those sections of
code and leave the send_buffer_nb alternative in a comment. Then leave a
big TODO comment there for you to go back and fix the semantics - which
will likely involve just rewriting large sections of that framework. But at
least you will be able to see what was there before when you try to move it
to a more nonblocking model.

The bkmrk component is subtle, maybe more that it should be. So keeping the
old blocking interfaces there will probably help quite a bit when you get
to it later. In that component the blocking calls are critical to
correctness, so we will need to sort out how to make that more asynchronous
in our redesign.

Other than that modification (#ifdef comments instead of nonblocking
replacements), I think this patch is fine. As was mentioned previously, we
will need to go back (after things compile) and figure out a new model for
this behavior.

Thanks!
Josh

On Wed, Dec 4, 2013 at 9:58 AM, Jeff Squyres (jsquyres)
<jsquyres_at_[hidden]>wrote:

> Err... upon further thought, I might be totally wrong about emulating
> blocking. There might be (probably are?) rules/assumptions in the ORTE
> layer (of which I am *not* an expert) that disallow you from [emulating]
> blocking.
>
> If that's the case, then there's architectural issues with converting from
> blocking to nonblocking on both the sending and the receiving sides that
> might be a bit thorny to sort out.
>
>
>
> On Dec 4, 2013, at 10:54 AM, "Jeff Squyres (jsquyres)" <jsquyres_at_[hidden]>
> wrote:
>
> > On Nov 25, 2013, at 9:59 AM, Adrian Reber <adrian_at_[hidden]> wrote:
> >
> >> * Send Non-blocking
> >> */
> >> int orte_rml_ftrm_send_nb(orte_process_name_t* peer,
> >> struct iovec* msg,
> >> int count,
> >> orte_rml_tag_t tag,
> >> - int flags,
> >> orte_rml_callback_fn_t cbfunc,
> >> void* cbdata)
> >> {
> >> int ret;
> >>
> >> opal_output_verbose(20, rml_ftrm_output_handle,
> >> - "orte_rml_ftrm: send_nb(%s, %d, %d, %d )",
> >> - ORTE_NAME_PRINT(peer), count, tag, flags);
> >> + "orte_rml_ftrm: send_nb(%s, %d, %d )",
> >> + ORTE_NAME_PRINT(peer), count, tag);
> >>
> >> if( NULL != orte_rml_ftrm_wrapped_module.send_nb ) {
> >> - if( ORTE_SUCCESS != (ret =
> orte_rml_ftrm_wrapped_module.send_nb(peer, msg, count, tag, flags, cbfunc,
> cbdata) ) ) {
> >> - return ret;
> >> - }
> >> - }
> >> -
> >> - return ORTE_SUCCESS;
> >> -}
> >> -
> >> -/*
> >> - * Send Buffer
> >> - */
> >> -int orte_rml_ftrm_send_buffer(orte_process_name_t* peer,
> >> - opal_buffer_t* buffer,
> >> - orte_rml_tag_t tag,
> >> - int flags)
> >> -{
> >> - int ret;
> >> -
> >> - opal_output_verbose(20, rml_ftrm_output_handle,
> >> - "orte_rml_ftrm: send_buffer(%s, %d, %d )",
> >> - ORTE_NAME_PRINT(peer), tag, flags);
> >> -
> >> - if( NULL != orte_rml_ftrm_wrapped_module.send_buffer ) {
> >> - if( ORTE_SUCCESS != (ret =
> orte_rml_ftrm_wrapped_module.send_buffer(peer, buffer, tag, flags) ) ) {
> >> + if( ORTE_SUCCESS != (ret =
> orte_rml_ftrm_wrapped_module.send_nb(peer, msg, count, tag, cbfunc, cbdata)
> ) ) {
> >> return ret;
> >> }
> >> }
> >
> > Similar to my reply about patch 3, I don't think this hunk is correct.
> >
> > This routine accepts an iovec and sends it in a non-blocking fashion.
> I'll bet that the caller frees the iovec upon return from the function
> (because it used to be a blocking send, and freeing it immediately was
> acceptable).
> >
> > But now the iovec may well still be in use when this function returns,
> so the caller should *not* free/reuse the iovec until it knows that the
> send has complete.
> >
> > It may be more desirable to keep the blocking send function
> orte_rml_ftrm_send_buffer() and emulate blocking by invoking send_nb under
> the covers, but then not returning until the send callback has actually
> been invoked.
> >
> > Then the blocking semantics expected by the caller may well be
> acceptable/safe.
> >
> > This loses some potential optimizations of asynchronicity, but it may be
> worth it: a) performance in this part of the code isn't too critical, and
> b) blocking semantics are usually simpler and easier to maintain, from the
> caller's perspective.
> >
> > This idea may also apply to what I said in reply to patch 3...? (i.e.,
> preserve a blocking send by using the _nb variant under the covers, but not
> returning until the nonblocking variant has actually completed the receive).
> >
> > Since this is a fairly large change, I didn't look too closely
> throughout the rest of this patch. I assume that there are a few other
> architectural cases similar to this one.
> >
> > --
> > Jeff Squyres
> > jsquyres_at_[hidden]
> > For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>

-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey