
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] RFC: Add an __attribute__((destructor)) function to opal
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-07-18 14:10:20


On Jul 18, 2014, at 10:24 AM, George Bosilca <bosilca_at_[hidden]> wrote:

> 1. If I remember correctly, this topic has already been raised in the Forum. And the decision was to maintain the current behavior (tools and MPI init/fini are independent/disconnected).
>
> 2. Having to manually set a global flag in order to correctly finalize a library is HORRIBLE by any reasonable CS standards.

As I said in my original note, we don't have to set a global flag. All you have to do is decrement the already-existing reference counter that tracks how many times we called init_util, indicating that you are done with it so it can go ahead and truly finalize on next invocation. This is a typical symmetrical operation. All we are doing is correctly communicating to the library that we don't want it to actually tear things down at this time.
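[Editor's note: a minimal sketch of the reference-counting pattern described above. The names (`util_init`, `util_finalize`, `util_refcount`) are illustrative stand-ins, not Open MPI's actual `opal_init_util`/`opal_finalize_util` internals: each caller that inits must symmetrically finalize, and real teardown happens only when the last reference is released.]

```c
#include <stdbool.h>

/* Illustrative reference-counted init/finalize pair.  MPI_T_init_thread
 * and MPI_Init would each bump the counter; MPI_T_finalize and
 * MPI_Finalize each decrement it.  Only the final decrement actually
 * tears the library down. */
static int  util_refcount  = 0;
static bool util_torn_down = false;

static void do_real_init(void) { /* allocate subsystems */ }
static void do_real_fini(void) { util_torn_down = true; }

int util_init(void)
{
    if (util_refcount++ == 0) {
        do_real_init();          /* first caller: truly initialize */
    }
    return util_refcount;
}

int util_finalize(void)
{
    if (--util_refcount == 0) {
        do_real_fini();          /* last caller out: truly tear down */
    }
    return util_refcount;
}
```

With this shape, the sequence init (tool), init (MPI), finalize (MPI) leaves the library alive; only the matching tool-side finalize drops the count to zero and triggers teardown.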

>
> 3. Let's not go in shadowy corners of the MPI_T usage, and stay mainstream. Here is a partial snippet of the most usual way the tool interface is supposed to be used.
>
> MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
> ...
> MPI_Init(&argc, &argv);
> MPI_Finalize();
>
> With the proposed patch, we clean up all OPAL memory as soon as we reach the MPI_Finalize (aka. without the call to MPI_T_finalize).

Are you referring to Nathan's patch? In that case, your statement isn't correct - the destructor only gets run at the end of the user's program, and thus the OPAL memory will not be cleaned up until that time.

> All MPI_T calls after MPI_Finalize will trigger a segfault.
>
> George.
>
>
>
> On Thu, Jul 17, 2014 at 10:55 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> As I said, I don't know which solution is the one to follow - they both have significant "ick" factors, though I wouldn't go so far as to characterize either of them as "horrible". Not being "clean" after calling MPI_Finalize seems just as strange.
>
> Nathan and I did discuss the init-after-finalize issue, and he intends to raise it with the Forum as it doesn't seem a logical thing to do. So that issue may go away. Still leaves us pondering the right solution, and hopefully coming up with something better than either of the ones we have so far.
>
>
> On Jul 17, 2014, at 7:48 PM, George Bosilca <bosilca_at_[hidden]> wrote:
>
>> I think Case #1 is only a partial solution, as it only solves the example attached to the ticket. Based on my reading of the tool chapter, calling MPI_T_init after MPI_Finalize is legit, and this case is not covered by the patch. But this is not the major issue I have with this patch. From a coding perspective, it makes the initialization of OPAL horribly unnatural, requiring any other layer using OPAL to perform horrible gymnastics just to tear it down correctly (setting opal_init_util_init_extra to the right value).
>>
>> George.
>>
>>
>>
>> On Wed, Jul 16, 2014 at 11:29 AM, Pritchard, Howard r <howardp_at_[hidden]> wrote:
>> Hi Folks,
>>
>> I vote for solution #1. It doesn't change current behavior, and it doesn't open the door to becoming dependent on the availability of the ctor/dtor feature in future toolchains.
>>
>> Howard
>>
>>
>> -----Original Message-----
>> From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Nathan Hjelm
>> Sent: Wednesday, July 16, 2014 9:08 AM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] RFC: Add an __attribute__((destructor)) function to opal
>>
>> On Wed, Jul 16, 2014 at 07:59:14AM -0700, Ralph Castain wrote:
>> > I discussed this over IM with Nathan to try and get a better understanding of the options. Basically, we have two approaches available to us:
>> >
>> > 1. my solution resolves the segv problem and eliminates leaks so long as the user calls MPI_Init/Finalize after calling the MPI_T init/finalize functions. This method will still leak memory if the user doesn't use MPI after calling the MPI_T functions, but does mean that all memory used by MPI will be released upon MPI_Finalize. So if the user program continues beyond MPI, they won't be carrying the MPI memory footprint with them. This continues our current behavior.
>> >
>> > 2. the destructor method, which releases the MPI memory footprint upon final program termination instead of at MPI_Finalize. This also solves the segv and leak problems, and ensures that someone calling only the MPI_T init/finalize functions will be valgrind-clean, but means that a user program that runs beyond MPI will carry the MPI memory footprint with them. This is a change in our current behavior.
>>
>> Correct. Though the only thing we will carry around until termination is the memory associated with opal/mca/if, opal/mca/event, opal_net, opal_malloc, opal_show_help, opal_output, opal_dss, opal_datatype, and opal_class. Not sure how much memory this is.
>>
>> -Nathan
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/07/15172.php
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/07/15193.php
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/07/15194.php
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/07/15199.php