Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] memory manager RFC
From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2008-06-04 15:26:14


Brian states
> This will
> allow users to turn ptmalloc2 support on/off at application link time
> instead of MPI compile time.
>
Where I assume "MPI compile time" means when the MPI *implementation* is
compiled.

So what about LD_PRELOAD? Can the user defer the decision to use
ptmalloc until application launch?
If so, this raises the question of an mpirun option to "enable
leave_pinned, placing libompi-malloc.so in LD_PRELOAD if required".
Can/will/should such an option exist?

-Paul

Brian W. Barrett wrote:
> Hi all -
>
> Sorry this is so late, but it took a couple of iterations with a couple of
> people to get right from a technology standpoint. All mistakes in this
> proposal are my fault.
>
> What: Fix the ptmalloc2 problem
> How: Remove it from the default path
> When: This weekend? For the 1.3 branch
>
> The problem: On Linux today, we by default build a copy of ptmalloc2 into
> libopen-pal.so so that RDMA networks can get better bandwidth using
> leave_pinned. Normally users don't use or need leave_pinned, but we need
> to have it available for benchmarks and the few apps that gain by having
> the more independent-ish progress. However, by having it there, we're
> screwing with the memory manager, which has a number of bad side effects.
> First, it can cause numerous crashes if the user is providing his/her own
> allocator. Second, there is growing evidence that the ptmalloc2 in Open
> MPI has an evil corner case we can't pin down that causes explosive
> growth in memory utilization.
>
> The proposal: Remove ptmalloc2 from libopen-pal.so and make it a
> standalone library (for forward compatibility, currently called
> libompi-malloc.so), which the user can explicitly link in. This will
> allow users to turn ptmalloc2 support on/off at application link time
> instead of MPI compile time. Given the limited number of leave_pinned
> users, this seems to be a good compromise for the near-term between
> greater stability for most users and fast performance for power users.
> The mallopt() solution, in which free() never gives memory back to the
> OS (but does reuse it) and which works well for benchmarks, will still
> be available at all times.
>
> The work: Some autoconf magic to move most (but not all -- in particular
> the munmap() support) of the ptmalloc2 component into its own library.
> This is extremely low risk, and actually lowers the risk of Open MPI
> breaking by removing code from the critical path. There will also be a
> small number of enhancements to the mpool base code to better detect
> situations where leave_pinned is used but we can't sense memory being
> given back to the OS.
>
> I'd like this for 1.3, as we're running into more and more situations
> where this code isn't working. Also, the lone supporter of the ptmalloc2
> code (me) doesn't want to do it anymore and removing the code from the
> critical path will lower the workload of me (i.e., the retired guy who's
> doing this for fun).
>
> If you have objections, please let me know before Friday. I'd like to
> commit these changes to the trunk this weekend.
>
> Brian
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
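
For readers unfamiliar with the mallopt() approach Brian mentions, here is a minimal sketch of what "free() never gives memory back to the OS (but does reuse it)" can look like. It assumes glibc on Linux, where mallopt(), M_TRIM_THRESHOLD and M_MMAP_MAX are declared in <malloc.h>. The helper name keep_freed_memory_in_process() is hypothetical and is not taken from the Open MPI sources; the proposal does not say which options Open MPI actually sets.

```c
/* Sketch only: assumes glibc on Linux, where <malloc.h> provides mallopt(). */
#include <malloc.h>

/*
 * Hypothetical helper (not an Open MPI function).
 * Asks glibc malloc to keep freed memory inside the process:
 *   - M_TRIM_THRESHOLD = -1 disables trimming of the top of the heap
 *   - M_MMAP_MAX = 0 stops malloc from serving large requests with mmap(),
 *     so free() never has to munmap() them
 * Returns 1 if both settings were accepted, 0 otherwise.
 */
int keep_freed_memory_in_process(void)
{
    int ok_trim = mallopt(M_TRIM_THRESHOLD, -1);
    int ok_mmap = mallopt(M_MMAP_MAX, 0);
    return ok_trim == 1 && ok_mmap == 1;
}
```

An application would call such a helper once, early in startup and before any large allocations. Freed memory then stays mapped in the process and is reused by later malloc() calls, which is why this suits benchmarks but can inflate the footprint of long-running applications.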

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group                 
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900