Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25431
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-11-04 19:18:54


+1 -- we shouldn't be using assert().

On Nov 4, 2011, at 2:15 PM, Tim Mattox wrote:

> I doubt you want to be using assert() here or anywhere in the
> OMPI library. If the rem_info.rem_qps pointer is NULL, I would
> think you want to do something other than just die.
> And, in a production build, if it was to ever be null, things would
> eventually crash right? The ompi library should strive to never crash...
> maybe print an error message and abort, but flat out crashing
> seems wrong.
>
> I've been away from active OMPI development for awhile now so I
> might be remembering this incorrectly, but isn't there some
> policy/SOP on not using assert in the OMPI library code?
>
> Please excuse me while I return to being a silent/lurking ghost
> of an OMPI developer.
>
> On Thu, Nov 3, 2011 at 8:15 PM, <cyeoh_at_[hidden]> wrote:
>> Author: cyeoh
>> Date: 2011-11-03 20:15:08 EDT (Thu, 03 Nov 2011)
>> New Revision: 25431
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/25431
>>
>> Log:
>> Removes pointless memmove which because of a previous memcpy will always
>> have identical source and destination pointers. See #2871
>> Plugs a couple of minor memory leaks related to remote qp info
>>
>>
>> Text files modified:
>> trunk/ompi/mca/btl/openib/btl_openib_endpoint.c | 3 +++
>> trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c | 22 ++++++++++++----------
>> 2 files changed, 15 insertions(+), 10 deletions(-)
>>
>> Modified: trunk/ompi/mca/btl/openib/btl_openib_endpoint.c
>> ==============================================================================
>> --- trunk/ompi/mca/btl/openib/btl_openib_endpoint.c (original)
>> +++ trunk/ompi/mca/btl/openib/btl_openib_endpoint.c 2011-11-03 20:15:08 EDT (Thu, 03 Nov 2011)
>> @@ -452,6 +452,9 @@
>> free(endpoint->qps);
>> endpoint->qps = NULL;
>>
>> + free(endpoint->rem_info.rem_qps);
>> + free(endpoint->rem_info.rem_srqs);
>> +
>> /* unregister xrc recv qp */
>> #if HAVE_XRC
>> if (0 != endpoint->xrc_recv_qp_num) {
>>
>> Modified: trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c
>> ==============================================================================
>> --- trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c (original)
>> +++ trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c 2011-11-03 20:15:08 EDT (Thu, 03 Nov 2011)
>> @@ -272,19 +272,14 @@
>> static int set_remote_info(mca_btl_base_endpoint_t* endpoint,
>> mca_btl_openib_rem_info_t* rem_info)
>> {
>> + /* Free up the memory pointed to by rem_qps before overwriting the pointer
>> + in the following memcpy */
>> + free(endpoint->rem_info.rem_qps);
>> +
>> /* copy the rem_info stuff */
>> memcpy(&((mca_btl_openib_endpoint_t*) endpoint)->rem_info,
>> rem_info, sizeof(mca_btl_openib_rem_info_t));
>>
>> - /* copy over the rem qp info */
>> - /* per #2871, changed this from memcpy() to memmove() to handle
>> - * the case of overlapping (or same) src/dest addresses.
>> - * However, we still *should* figure out why the src and dest
>> - * addresses are sometimes the same. */
>> - memmove(endpoint->rem_info.rem_qps,
>> - rem_info->rem_qps, sizeof(mca_btl_openib_rem_qp_info_t) *
>> - mca_btl_openib_component.num_qps);
>> -
>> BTL_VERBOSE(("Setting QP info, LID = %d", endpoint->rem_info.rem_lid));
>> return OMPI_SUCCESS;
>> }
>> @@ -671,7 +666,12 @@
>> uint8_t message_type;
>> bool master;
>>
>> - /* start by unpacking data first so we know who is knocking at
>> + /* We later memcpy this whole structure. Make sure
>> + that all the parameters are initialized, especially
>> + the pointers */
>> + memset(&rem_info,0, sizeof(rem_info));
>> +
>> + /* start by unpacking data first so we know who is knocking at
>> our door */
>> BTL_VERBOSE(("unpacking %d of %d\n", cnt, OPAL_UINT8));
>> rc = opal_dss.unpack(buffer, &message_type, &cnt, OPAL_UINT8);
>> @@ -857,6 +857,7 @@
>> to CONNECTING, and then reply with our QP
>> information */
>> if (master) {
>> + assert(rem_info.rem_qps != NULL);
>> rc = reply_start_connect(ib_endpoint, &rem_info);
>> } else {
>> rc = oob_module_start_connect(ib_endpoint->endpoint_local_cpc,
>> @@ -879,6 +880,7 @@
>> break;
>>
>> case MCA_BTL_IB_CONNECTING :
>> + assert(rem_info.rem_qps != NULL);
>> set_remote_info(ib_endpoint, &rem_info);
>> if (OMPI_SUCCESS != (rc = qp_connect_all(ib_endpoint))) {
>> BTL_ERROR(("endpoint connect error: %d", rc));
>> _______________________________________________
>> svn-full mailing list
>> svn-full_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full
>>
>
>
>
> --
> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
> timattox_at_[hidden] || tmattox_at_[hidden]
> I'm a bright... http://www.the-brights.net/
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/