Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] /dev/shm
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-11-19 10:14:38


Hi Ray

Are the jobs that leave files behind terminating normally or aborting?
Are there any warnings/error messages out of mpirun?

Just trying to determine if this is an abnormal termination issue or a
bug in OMPI itself.

Ralph

On Nov 19, 2008, at 8:05 AM, Ray Muno wrote:

> Thought I would revisit this one.
>
> We are still having issues with this. It is not clear to me what is
> leaving the user files behind in /dev/shm.
>
> This is not something users are doing directly; they are just
> compiling their code with mpif90 (from Open MPI), using various
> compilers. The compilers in use are PGI, Intel, Sun Studio, and
> PathScale.
>
> It looks like every job run leaves something behind in /dev/shm and
> it slowly fills up. We are just clearing these out at this point.
>
>
> Jeff Squyres wrote:
>> That is odd. Is your user's app crashing or being forcibly
>> killed? The ORTE daemon that is silently launched in v1.2 jobs
>> should ensure that files under /tmp/openmpi-sessions-
>> <userid>@<hostname> are removed.
>> On Nov 10, 2008, at 2:14 PM, Ray Muno wrote:
>>> Brock Palen wrote:
>>>> On most systems, /dev/shm is limited to half the physical RAM.
>>>> Was the user somehow filling up /dev/shm so there was no space?
>>>
>>> The problem is there is a large collection of stale files left in
>>> there by the users that have run on that node (Rocks based cluster).
>>>
>>> I am trying to determine why they are left behind.
>
>
>
> --
>
> Ray Muno
> University of Minnesota
> Aerospace Engineering and Mechanics
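To help narrow down which jobs leave files behind in /dev/shm, it may be useful to list what is there by owner and age before clearing it out. The following is a minimal sketch, not an Open MPI tool: the function name `find_stale_files` and the 24-hour threshold are assumptions made for illustration, and it only reports files. An old file may still belong to a running job, so nothing is deleted.

```python
import os
import time


def find_stale_files(root="/dev/shm", min_age_seconds=24 * 3600, now=None):
    """Return sorted (path, uid, age_seconds) tuples for non-directory
    entries under root whose modification time is at least
    min_age_seconds in the past."""
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            age = now - st.st_mtime
            if age >= min_age_seconds:
                stale.append((path, st.st_uid, age))
    return sorted(stale)


if __name__ == "__main__":
    # Report only; deciding what is safe to remove is left to the admin.
    for path, uid, age in find_stale_files():
        print(f"{age / 3600:8.1f} h  uid={uid}  {path}")
```

Running this on a node right after a job finishes, and comparing the reported uid values against the users who ran there, should show whether normally terminating jobs also leave files behind.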