Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25431
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-11-04 19:55:40


FWIW: assert is used throughout the codebase, including orte and opal. I see no reason why it shouldn't be used here as well.

On Nov 4, 2011, at 5:45 PM, Barrett, Brian W wrote:

> Why not? I use asserts all the time in OMPI code. A quick grep in
> ompi/mca says I'm not alone. There are a whole bunch of places where I
> "know" a fact, such as a pointer never being NULL or consistency checks
> between two values. These don't need to run in production; I've
> theoretically tested the crap out of the code before then. But they are
> really useful when developing and debugging later.
>
> Just my $0.02.
>
> Brian
>
> On 11/4/11 5:18 PM, "Jeff Squyres" <jsquyres_at_[hidden]> wrote:
>
>> +1 -- we shouldn't be using assert().
>>
>>
>> On Nov 4, 2011, at 2:15 PM, Tim Mattox wrote:
>>
>>> I doubt you want to be using assert() here or anywhere in the
>>> OMPI library. If the rem_info.rem_qps pointer is NULL, I would
>>> think you want to do something other than just die.
>>> And, in a production build, if it was to ever be null, things would
>>> eventually crash right? The ompi library should strive to never
>>> crash...
>>> maybe print an error message and abort, but flat out crashing
>>> seems wrong.
>>>
>>> I've been away from active OMPI development for awhile now so I
>>> might be remembering this incorrectly, but isn't there some
>>> policy/SOP on not using assert in the OMPI library code?
>>>
>>> Please excuse me while I return to being a silent/lurking ghost
>>> of an OMPI developer.
>>>
>>> On Thu, Nov 3, 2011 at 8:15 PM, <cyeoh_at_[hidden]> wrote:
>>>> Author: cyeoh
>>>> Date: 2011-11-03 20:15:08 EDT (Thu, 03 Nov 2011)
>>>> New Revision: 25431
>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/25431
>>>>
>>>> Log:
>>>> Removes pointless memmove which because of a previous memcpy will
>>>> always
>>>> have identical source and destination pointers. See #2871
>>>> Plugs a couple of minor memory leaks related to remote qp info
>>>>
>>>>
>>>> Text files modified:
>>>> trunk/ompi/mca/btl/openib/btl_openib_endpoint.c | 3
>>>> +++
>>>> trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c | 22
>>>> ++++++++++++----------
>>>> 2 files changed, 15 insertions(+), 10 deletions(-)
>>>>
>>>> Modified: trunk/ompi/mca/btl/openib/btl_openib_endpoint.c
>>>>
>>>> ========================================================================
>>>> ======
>>>> --- trunk/ompi/mca/btl/openib/btl_openib_endpoint.c (original)
>>>> +++ trunk/ompi/mca/btl/openib/btl_openib_endpoint.c 2011-11-03
>>>> 20:15:08 EDT (Thu, 03 Nov 2011)
>>>> @@ -452,6 +452,9 @@
>>>> free(endpoint->qps);
>>>> endpoint->qps = NULL;
>>>>
>>>> + free(endpoint->rem_info.rem_qps);
>>>> + free(endpoint->rem_info.rem_srqs);
>>>> +
>>>> /* unregister xrc recv qp */
>>>> #if HAVE_XRC
>>>> if (0 != endpoint->xrc_recv_qp_num) {
>>>>
>>>> Modified: trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c
>>>>
>>>> ========================================================================
>>>> ======
>>>> --- trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c
>>>> (original)
>>>> +++ trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c
>>>> 2011-11-03 20:15:08 EDT (Thu, 03 Nov 2011)
>>>> @@ -272,19 +272,14 @@
>>>> static int set_remote_info(mca_btl_base_endpoint_t* endpoint,
>>>> mca_btl_openib_rem_info_t* rem_info)
>>>> {
>>>> + /* Free up the memory pointed to by rem_qps before overwriting
>>>> the pointer
>>>> + in the following memcpy */
>>>> + free(endpoint->rem_info.rem_qps);
>>>> +
>>>> /* copy the rem_info stuff */
>>>> memcpy(&((mca_btl_openib_endpoint_t*) endpoint)->rem_info,
>>>> rem_info, sizeof(mca_btl_openib_rem_info_t));
>>>>
>>>> - /* copy over the rem qp info */
>>>> - /* per #2871, changed this from memcpy() to memmove() to handle
>>>> - * the case of overlapping (or same) src/dest addresses.
>>>> - * However, we still *should* figure out why the src and dest
>>>> - * addresses are sometimes the same. */
>>>> - memmove(endpoint->rem_info.rem_qps,
>>>> - rem_info->rem_qps, sizeof(mca_btl_openib_rem_qp_info_t) *
>>>> - mca_btl_openib_component.num_qps);
>>>> -
>>>> BTL_VERBOSE(("Setting QP info, LID = %d",
>>>> endpoint->rem_info.rem_lid));
>>>> return OMPI_SUCCESS;
>>>> }
>>>> @@ -671,7 +666,12 @@
>>>> uint8_t message_type;
>>>> bool master;
>>>>
>>>> - /* start by unpacking data first so we know who is knocking at
>>>> + /* We later memcpy this whole structure. Make sure
>>>> + that all the parameters are initialized, especially
>>>> + the pointers */
>>>> + memset(&rem_info,0, sizeof(rem_info));
>>>> +
>>>> + /* start by unpacking data first so we know who is knocking at
>>>> our door */
>>>> BTL_VERBOSE(("unpacking %d of %d\n", cnt, OPAL_UINT8));
>>>> rc = opal_dss.unpack(buffer, &message_type, &cnt, OPAL_UINT8);
>>>> @@ -857,6 +857,7 @@
>>>> to CONNECTING, and then reply with our QP
>>>> information */
>>>> if (master) {
>>>> + assert(rem_info.rem_qps != NULL);
>>>> rc = reply_start_connect(ib_endpoint, &rem_info);
>>>> } else {
>>>> rc =
>>>> oob_module_start_connect(ib_endpoint->endpoint_local_cpc,
>>>> @@ -879,6 +880,7 @@
>>>> break;
>>>>
>>>> case MCA_BTL_IB_CONNECTING :
>>>> + assert(rem_info.rem_qps != NULL);
>>>> set_remote_info(ib_endpoint, &rem_info);
>>>> if (OMPI_SUCCESS != (rc = qp_connect_all(ib_endpoint))) {
>>>> BTL_ERROR(("endpoint connect error: %d", rc));
>>>> _______________________________________________
>>>> svn-full mailing list
>>>> svn-full_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full
>>>>
>>>
>>>
>>>
>>> --
>>> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
>>> timattox_at_[hidden] || tmattox_at_[hidden]
>>> I'm a bright... http://www.the-brights.net/
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>
>
> --
> Brian W. Barrett
> Dept. 1423: Scalable System Software
> Sandia National Laboratories
>
>
>
>
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel