Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25431
From: Tim Mattox (timattox_at_[hidden])
Date: 2011-11-04 14:15:07


I doubt you want to be using assert() here or anywhere in the
OMPI library. If the rem_info.rem_qps pointer is NULL, I would
think you want to do something other than just die.
And, in a production build, if it was to ever be null, things would
eventually crash right? The ompi library should strive to never crash...
maybe print an error message and abort, but flat out crashing
seems wrong.

I've been away from active OMPI development for awhile now so I
might be remembering this incorrectly, but isn't there some
policy/SOP on not using assert in the OMPI library code?

Please excuse me while I return to being a silent/lurking ghost
of an OMPI developer.

On Thu, Nov 3, 2011 at 8:15 PM, <cyeoh_at_[hidden]> wrote:
> Author: cyeoh
> Date: 2011-11-03 20:15:08 EDT (Thu, 03 Nov 2011)
> New Revision: 25431
> URL: https://svn.open-mpi.org/trac/ompi/changeset/25431
>
> Log:
> Removes pointless memmove which because of a previous memcpy will always
> have identical source and destination pointers. See #2871
> Plugs a couple of minor memory leaks related to remote qp info
>
>
> Text files modified:
>   trunk/ompi/mca/btl/openib/btl_openib_endpoint.c            |     3 +++
>   trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c |    22 ++++++++++++----------
>   2 files changed, 15 insertions(+), 10 deletions(-)
>
> Modified: trunk/ompi/mca/btl/openib/btl_openib_endpoint.c
> ==============================================================================
> --- trunk/ompi/mca/btl/openib/btl_openib_endpoint.c     (original)
> +++ trunk/ompi/mca/btl/openib/btl_openib_endpoint.c     2011-11-03 20:15:08 EDT (Thu, 03 Nov 2011)
> @@ -452,6 +452,9 @@
>     free(endpoint->qps);
>     endpoint->qps = NULL;
>
> +    free(endpoint->rem_info.rem_qps);
> +    free(endpoint->rem_info.rem_srqs);
> +
>     /* unregister xrc recv qp */
>  #if HAVE_XRC
>     if (0 != endpoint->xrc_recv_qp_num) {
>
> Modified: trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c
> ==============================================================================
> --- trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c  (original)
> +++ trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c  2011-11-03 20:15:08 EDT (Thu, 03 Nov 2011)
> @@ -272,19 +272,14 @@
>  static int set_remote_info(mca_btl_base_endpoint_t* endpoint,
>                            mca_btl_openib_rem_info_t* rem_info)
>  {
> +    /* Free up the memory pointed to by rem_qps before overwriting the pointer
> +       in the following memcpy */
> +    free(endpoint->rem_info.rem_qps);
> +
>     /* copy the rem_info stuff */
>     memcpy(&((mca_btl_openib_endpoint_t*) endpoint)->rem_info,
>            rem_info, sizeof(mca_btl_openib_rem_info_t));
>
> -    /* copy over the rem qp info */
> -    /* per #2871, changed this from memcpy() to memmove() to handle
> -     * the case of overlapping (or same) src/dest addresses.
> -     * However, we still *should* figure out why the src and dest
> -     * addresses are sometimes the same. */
> -    memmove(endpoint->rem_info.rem_qps,
> -           rem_info->rem_qps, sizeof(mca_btl_openib_rem_qp_info_t) *
> -           mca_btl_openib_component.num_qps);
> -
>     BTL_VERBOSE(("Setting QP info,  LID = %d", endpoint->rem_info.rem_lid));
>     return OMPI_SUCCESS;
>  }
> @@ -671,7 +666,12 @@
>     uint8_t message_type;
>     bool master;
>
> -    /* start by unpacking data first so we know who is knocking at
> +    /* We later memcpy this whole structure. Make sure
> +       that all the parameters are initialized, especially
> +       the pointers */
> +    memset(&rem_info,0, sizeof(rem_info));
> +
> +   /* start by unpacking data first so we know who is knocking at
>        our door */
>     BTL_VERBOSE(("unpacking %d of %d\n", cnt, OPAL_UINT8));
>     rc = opal_dss.unpack(buffer, &message_type, &cnt, OPAL_UINT8);
> @@ -857,6 +857,7 @@
>                to CONNECTING, and then reply with our QP
>                information */
>             if (master) {
> +               assert(rem_info.rem_qps != NULL);
>                 rc = reply_start_connect(ib_endpoint, &rem_info);
>             } else {
>                 rc = oob_module_start_connect(ib_endpoint->endpoint_local_cpc,
> @@ -879,6 +880,7 @@
>             break;
>
>         case MCA_BTL_IB_CONNECTING :
> +           assert(rem_info.rem_qps != NULL);
>             set_remote_info(ib_endpoint, &rem_info);
>             if (OMPI_SUCCESS != (rc = qp_connect_all(ib_endpoint))) {
>                 BTL_ERROR(("endpoint connect error: %d", rc));
> _______________________________________________
> svn-full mailing list
> svn-full_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full
>

-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 timattox_at_[hidden] || tmattox_at_[hidden]
    I'm a bright... http://www.the-brights.net/