Open MPI User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-08-06 19:53:48


Unless there's something weird going on in the Solaris kernel, the
only memory that we should be leaking after MPI processes exit would
be shared memory files that are [somehow] not getting removed properly.

Right?
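
A quick way to check that on a node, assuming the usual /tmp location for
the per-job session directories (the exact names vary with TMPDIR, user,
and Open MPI version), would be something like:

% ls -ld /tmp/openmpi-sessions-*
% ls -lR /tmp/openmpi-sessions-*    # the sm backing files live under here
% ipcs -am                          # plus any stray SysV shared memory

If nothing is left behind after the jobs exit, the lost memory is coming
from somewhere else.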

On Aug 6, 2007, at 8:15 AM, Ralph H Castain wrote:

> Hmmm...just to clarify, as I think there may be some confusion here.
>
> Orte-clean will kill any outstanding Open MPI daemons (which should kill
> their local apps) and will clean up their associated temporary file
> systems. If you are having problems with zombied processes or stale
> daemons, then this will hopefully help (it isn't perfect, but it helps).
>
> However, orte-clean will not do anything about releasing memory that has
> been "leaked" by Open MPI. We don't have any tools for doing that, I'm
> afraid.
>
>
> On 8/6/07 8:08 AM, "Don Kerr" <Don.Kerr_at_[hidden]> wrote:
>
>> Glenn,
>>
>> With CT7 there is a utility which can be used to clean up leftover
>> cruft from stale MPI processes:
>>
>> % man -M /opt/SUNWhpc/man -s 1 orte-clean
>>
>> Warning: this will kill currently running jobs as well. Using "-v" for
>> verbose output is recommended.
>>
>> I would be curious if this helps.
>>
>> -DON
>> p.s. orte-clean does not exist in the ompi v1.2 branch; it is in the
>> trunk, but I think there is currently an issue with it.
>>
>> Ralph H Castain wrote:
>>
>>>
>>> On 8/5/07 6:35 PM, "Glenn Carver" <Glenn.Carver_at_[hidden]> wrote:
>>>
>>>
>>>
>>>> I'd appreciate some advice and help on this one. We're having serious
>>>> problems running parallel applications on our cluster. After each
>>>> batch job finishes, we lose a certain amount of available memory.
>>>> Additional jobs cause free memory to gradually go down until the
>>>> machine starts swapping and becomes unusable or hangs. Taking the
>>>> machine to single-user mode doesn't restore the memory; only a reboot
>>>> returns all available memory. This happens on all our nodes.
>>>>
>>>> We've been doing some testing to try to pin the problem down, although
>>>> we still don't fully know where it is coming from. We have ruled out
>>>> our applications (Fortran codes); we see the same behaviour with
>>>> Intel's IMB. We know it's not a network issue, as a parallel job
>>>> running solely on the 4 cores of each node produces the same effect.
>>>> All nodes have been brought up to the very latest OS patches and we
>>>> still see the same problem.
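>>>>
>>>> (The drop in free memory shows up with the usual Solaris tools, e.g.
>>>>
>>>> % swap -s
>>>> % vmstat 5 3
>>>> % echo ::memstat | mdb -k   # needs root
>>>>
>>>> but none of them point at a particular culprit.)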
>>>>
>>>> Details: we're running Solaris 10/06, Sun Studio 12, ClusterTools 7
>>>> (Open MPI 1.2.1) and Sun Grid Engine 6.1. Hardware is Sun X4100/X4200.
>>>> Kernel version: SunOS 5.10 Generic_125101-10 on all nodes.
>>>>
>>>> I read in the release notes that a number of memory leaks were fixed
>>>> for the 1.2.1 release, but none have been noticed since, so I'm not
>>>> sure where the problem might be.
>>>>
>>>>
>>>
>>> I'm not sure where that claim came from, but it is certainly not true
>>> that we haven't noticed any leaks since 1.2.1. We know we have quite a
>>> few memory leaks in the code base, many of which are small in
>>> themselves but can add up depending upon exactly what the application
>>> does (i.e., which code paths it travels). Running a simple hello_world
>>> app under valgrind will show significant unreleased memory.
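>>>
>>> For example (on a machine where valgrind is available; hello.c here is
>>> just the usual MPI_Init/printf/MPI_Finalize program):
>>>
>>> % mpicc hello.c -o hello
>>> % mpirun -np 2 valgrind --leak-check=full ./hello
>>>
>>> The leak summary at exit shows a fair amount of memory that is never
>>> released.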
>>>
>>> I doubt you will see much, if any, improvement in 1.2.4. There have
>>> probably been a few patches applied, but a comprehensive effort to
>>> eradicate the problem has not been made. It is something we are trying
>>> to clean up over time, but it hasn't been a crash priority, as most
>>> OSes do a fairly good job of cleaning up when the app completes.
>>>
>>>
>>>
>>>> My next move is to try the very latest release (probably the 1.2.4
>>>> pre-release). As CT7 is built with Sun Studio 11 rather than 12, which
>>>> we're using, I might also try downgrading. At the moment we're
>>>> rebooting our cluster nodes every day to keep things going, so any
>>>> suggestions are appreciated.
>>>>
>>>> Thanks, Glenn
>>>>
>>>>
>>>>
>>>>
>>>> $ ompi_info
>>>> Open MPI: 1.2.1r14096-ct7b030r1838
>>>> Open MPI SVN revision: 0
>>>> Open RTE: 1.2.1r14096-ct7b030r1838
>>>> Open RTE SVN revision: 0
>>>> OPAL: 1.2.1r14096-ct7b030r1838
>>>> OPAL SVN revision: 0
>>>> Prefix: /opt/SUNWhpc/HPC7.0
>>>> Configured architecture: i386-pc-solaris2.10
>>>> Configured by: root
>>>> Configured on: Fri Mar 30 13:40:12 EDT 2007
>>>> Configure host: burpen-csx10-0
>>>> Built by: root
>>>> Built on: Fri Mar 30 13:57:25 EDT 2007
>>>> Built host: burpen-csx10-0
>>>> C bindings: yes
>>>> C++ bindings: yes
>>>> Fortran77 bindings: yes (all)
>>>> Fortran90 bindings: yes
>>>> Fortran90 bindings size: trivial
>>>> C compiler: cc
>>>> C compiler absolute: /ws/ompi-tools/SUNWspro/SOS11/bin/cc
>>>> C++ compiler: CC
>>>> C++ compiler absolute: /ws/ompi-tools/SUNWspro/SOS11/bin/CC
>>>> Fortran77 compiler: f77
>>>> Fortran77 compiler abs: /ws/ompi-tools/SUNWspro/SOS11/bin/f77
>>>> Fortran90 compiler: f95
>>>> Fortran90 compiler abs: /ws/ompi-tools/SUNWspro/SOS11/bin/f95
>>>> C profiling: yes
>>>> C++ profiling: yes
>>>> Fortran77 profiling: yes
>>>> Fortran90 profiling: yes
>>>> C++ exceptions: yes
>>>> Thread support: no
>>>> Internal debug support: no
>>>> MPI parameter check: runtime
>>>> Memory profiling support: no
>>>> Memory debugging support: no
>>>> libltdl support: yes
>>>> Heterogeneous support: yes
>>>> mpirun default --prefix: yes
>>>> MCA backtrace: printstack (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA paffinity: solaris (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA timer: solaris (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
>>>> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
>>>> MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA coll: self (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA io: romio (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA mpool: udapl (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA rcache: rb (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.1)
>>>> MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.1)
>>>> MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
>>>> MCA btl: udapl (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.1)
>>>> MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.1)
>>>> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
>>>> MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.1)
>>>> MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.1)
>>>> MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.1)
>>>> MCA sds: env (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.1)
>>>> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.1)

-- 
Jeff Squyres
Cisco Systems