I just committed r15557 </trac/ompi/changeset/15557>, which adds a finalize
flow to the mpool. The openib BTL should now be able to release
all of its resources in the normal way.
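
To illustrate why flushing the mpool first matters, here is a minimal sketch in plain C. It does not use libibverbs and is not the code from the changeset: mock_pd_t, mock_reg(), mock_dealloc_pd(), mock_mpool_flush() and mock_finalize() are all hypothetical names, and returning EBUSY is an assumption that stands in for "ibv_dealloc_pd() fails while memory is still registered".

```c
#include <assert.h>
#include <errno.h>

/* Mock protection domain: counts registrations still attached to it. */
typedef struct {
    int live_regs;  /* registrations not yet released */
    int released;   /* 1 once the PD itself has been released */
} mock_pd_t;

/* Stand-in for registering one memory region against the PD. */
void mock_reg(mock_pd_t *pd)
{
    assert(!pd->released);
    pd->live_regs++;
}

/* Stand-in for ibv_dealloc_pd(): refuses while registrations remain.
 * EBUSY is an assumed error code for this sketch. */
int mock_dealloc_pd(mock_pd_t *pd)
{
    if (pd->live_regs > 0) {
        return EBUSY;
    }
    pd->released = 1;
    return 0;
}

/* Hypothetical mpool flush: drops every remaining registration and
 * returns how many were dropped. */
int mock_mpool_flush(mock_pd_t *pd)
{
    int dropped = pd->live_regs;
    pd->live_regs = 0;
    return dropped;
}

/* Finalize order in this sketch: flush the mpool, then release the PD. */
int mock_finalize(mock_pd_t *pd)
{
    (void) mock_mpool_flush(pd);
    return mock_dealloc_pd(pd);
}
```

With two registrations outstanding, mock_dealloc_pd() alone returns the error, while mock_finalize() succeeds because the flush runs first.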
Pavel Shamis (Pasha) wrote:
> Jeff Squyres wrote:
>> Background: Pasha added a call in the openib BTL finalize function
>> that will only succeed if all registered memory has been released
>> (ibv_dealloc_pd()). Since the test app didn't call MPI_FREE_MEM,
>> there was some memory that was still registered, and therefore the
>> call in finalize failed. We treated this as a fatal error. Last
>> night's MTT runs turned up several apps that exhibited this fatal error.
>> While we're examining this problem, Pasha has removed the call to
>> ibv_dealloc_pd() in the trunk openib BTL finalize.
>> I examined one of the tests that were failing last night in MTT:
>> onesided/t.f90. This test has an MPI_ALLOC_MEM with no corresponding
>> MPI_FREE_MEM. To investigate this problem, I restored the call to
>> ibv_dealloc_pd() and re-ran the t.f90 test -- the problem still
>> occurs. Good.
>> However, once I got the right MPI_FREE_MEM call in t.f90, the test
>> started passing. I.e., ibv_dealloc_pd(hca->ib_pd) succeeds because
>> all registered memory has been released. Hence, the test itself was incorrect.
>> However, I don't think we should *error* if we fail to ibv_dealloc_pd
>> (hca->ib_pd); it's a user error, but it's not catastrophic unless
>> we're trying to do an HCA restart scenario. Specifically: during a
>> normal MPI_FINALIZE, who cares?
>> I think we should do the following:
>> 1. If we're not doing an HCA restart/checkpoint and we fail to
>> ibv_dealloc_pd(), just move on (i.e., it's not a warning/error unless
>> we *want* a warning, such as if an MCA parameter
>> btl_openib_warn_if_finalize_fail is enabled, or somesuch).
>> 2. If we *are* doing an HCA restart/checkpoint and ibv_dealloc_pd()
>> fails, then we have to gracefully fail to notify upper layers that
>> Bad Things happened (I suspect that we need mpool finalize
>> implemented to properly implement checkpointing for RDMA networks).
>> 3. Add a new MCA parameter named mpi_show_mpi_alloc_mem_leaks that,
>> when enabled, shows a warning in ompi_mpi_finalize() if there is
>> still memory allocated by MPI_ALLOC_MEM that was not freed by
>> MPI_FREE_MEM (this MCA parameter will parallel the already-existing
>> mpi_show_handle_leaks MCA param which displays warnings if the app
>> creates MPI objects but does not free them).
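
Items 1 and 2 above amount to a small decision table, which can be sketched as follows. This is only an illustration under stated assumptions: finalize_result_t and finalize_policy() are hypothetical names, not Open MPI symbols, and the warn_on_fail flag merely stands in for the proposed btl_openib_warn_if_finalize_fail parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical outcome codes for this sketch only. */
typedef enum {
    FINALIZE_OK,      /* PD released, or the failure was silently ignored */
    FINALIZE_WARNED,  /* failure ignored, but a warning was printed */
    FINALIZE_ERROR    /* failure must be reported to the upper layers */
} finalize_result_t;

/*
 * dealloc_rc:   return value of the PD release call (0 means success).
 * restarting:   true for an HCA restart/checkpoint, false for a
 *               normal MPI_FINALIZE.
 * warn_on_fail: stands in for the proposed MCA parameter.
 */
finalize_result_t finalize_policy(int dealloc_rc, bool restarting,
                                  bool warn_on_fail)
{
    /* Assumption: the release call reports errno-style, non-negative codes. */
    assert(dealloc_rc >= 0);

    if (0 == dealloc_rc) {
        return FINALIZE_OK;
    }
    if (restarting) {
        /* Item 2: a restart cannot proceed with a PD that is still busy. */
        return FINALIZE_ERROR;
    }
    if (warn_on_fail) {
        /* Item 1, with the warning explicitly requested. */
        fprintf(stderr, "warning: PD release failed during finalize (rc=%d)\n",
                dealloc_rc);
        return FINALIZE_WARNED;
    }
    /* Item 1, default: just move on. */
    return FINALIZE_OK;
}
```

Keeping the policy in one pure function means the finalize path and the restart path can share it and differ only in the flags they pass.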
>> My points:
>> - leaked MPI_ALLOC_MEM memory should be reported by the MPI layer,
>> not a BTL or mpool
>> - failing to ibv_dealloc_pd() during MPI_FINALIZE should only trigger
>> a warning if the user wants to see it
>> - failing to ibv_dealloc_pd() during an HCA restart or checkpoint
>> should gracefully fail upwards
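
The first point, reporting leaked MPI_ALLOC_MEM memory from the MPI layer, could be sketched roughly as below. This is plain C built on malloc()/free(), not the Open MPI implementation: alloc_tracker_t, tracked_alloc(), tracked_free() and report_leaks() are hypothetical names, and the show_leaks argument stands in for the proposed mpi_show_mpi_alloc_mem_leaks parameter.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical tracker: counts allocations that have not been freed. */
typedef struct {
    size_t live_allocs;
} alloc_tracker_t;

/* Stand-in for the MPI_ALLOC_MEM path: allocate and count. */
void *tracked_alloc(alloc_tracker_t *t, size_t size)
{
    void *p = malloc(size);
    if (NULL != p) {
        t->live_allocs++;
    }
    return p;
}

/* Stand-in for the MPI_FREE_MEM path: free and uncount. */
void tracked_free(alloc_tracker_t *t, void *p)
{
    if (NULL == p) {
        return;
    }
    assert(t->live_allocs > 0);
    free(p);
    t->live_allocs--;
}

/* Called at finalize time. Returns the number of leaked allocations and
 * prints a warning only when the user has asked to see it. */
size_t report_leaks(const alloc_tracker_t *t, int show_leaks)
{
    if (show_leaks && t->live_allocs > 0) {
        fprintf(stderr, "warning: %lu allocation(s) were never freed\n",
                (unsigned long) t->live_allocs);
    }
    return t->live_allocs;
}
```

Because the count lives above the BTL and mpool, the report does not depend on which network is in use.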
> In addition, I will add code that flushes all user data from the mpool
> so that normal IB finalization can proceed.