Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: fix leak of bml endpoints
From: Nathan Hjelm (hjelmn_at_[hidden])
Date: 2014-05-15 13:52:27


On Thu, May 15, 2014 at 11:44:05AM -0600, Nathan Hjelm wrote:
> On Thu, May 15, 2014 at 01:33:31PM -0400, George Bosilca wrote:
> > The solution you propose here is definitely not OK. It 1) is ugly and 2) breaks the separation barrier that we hold dear.
>
> Which is why I asked :)
>
> > Regarding your other suggestion, I don’t see any reason not to call the delete_proc on MPI_COMM_WORLD as the last action we do before tearing down everything else.
>
> I spoke too soon. It looks like we *are* calling del_procs but I am not
> seeing the call reach the bml.... I will try and track this down.

/bml/btl/ .. I see what is happening. The proc reference counts are all
larger than 1 when we call del_procs:

[1,2]<stderr>:Deleting proc 0x7b83190 with reference count 5
[1,1]<stderr>:Deleting proc 0x7b83180 with reference count 5
[1,2]<stderr>:Deleting proc 0x7b832b0 with reference count 5
[1,1]<stderr>:Deleting proc 0x7b832a0 with reference count 7
[1,2]<stderr>:Deleting proc 0x7b83360 with reference count 7
[1,1]<stderr>:Deleting proc 0x7b833a0 with reference count 5
[1,0]<stderr>:Deleting proc 0x7b83190 with reference count 7
[1,0]<stderr>:Deleting proc 0x7b83300 with reference count 5
[1,0]<stderr>:Deleting proc 0x7b833b0 with reference count 5

I will track that down.
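
To illustrate why a reference count above 1 matters here, below is a toy sketch of reference-counted teardown. It is not Open MPI's object system: toy_proc_t, toy_proc_new, toy_proc_retain, and toy_proc_release are hypothetical names made up for this example. It assumes, as the log above suggests, that cleanup only happens in a destructor that runs when the count reaches zero, so a single release at del_procs time on a proc with count 5 leaves everything in place.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted proc object. */
typedef struct toy_proc {
    int refcount;        /* number of outstanding references */
    int *destruct_count; /* incremented when the destructor runs */
} toy_proc_t;

/* Create a proc holding one reference (the creator's). */
static toy_proc_t *toy_proc_new(int *destruct_count)
{
    toy_proc_t *p = malloc(sizeof(*p));
    if (p == NULL) {
        return NULL;
    }
    p->refcount = 1;
    p->destruct_count = destruct_count;
    return p;
}

/* Take an additional reference. */
static void toy_proc_retain(toy_proc_t *p)
{
    p->refcount++;
}

/* Drop one reference and return the remaining count. The destructor
 * (where endpoint cleanup would live in this toy model) runs only
 * when the count reaches zero. */
static int toy_proc_release(toy_proc_t *p)
{
    assert(p->refcount > 0);
    int remaining = --p->refcount;
    if (remaining == 0) {
        (*p->destruct_count)++;
        free(p);
    }
    return remaining;
}
```

In this model, a proc that was retained four extra times has count 5; one release brings it to 4 and the destructor never fires. Cleanup only happens once every holder has dropped its reference.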

-Nathan


