
Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] shm unlinking
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-04-14 09:07:40

Difficult to follow your thread here, but I think you're wondering about post-job cleanup?

Torque runs an epilogue script on all nodes included in the allocation. It is advisable to always have the epilogue script clean out the tmp directories, assuming single-user use of allocated nodes. If multiple users share nodes, then you can't do that.
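
For illustration, here is a rough sketch of the kind of cleanup an epilogue could do. Epilogues are normally shell scripts; this is written in C only as a sketch, and the function name remove_stale_files, the prefix argument, and the owner filter are all hypothetical, not part of Torque or Open MPI. Filtering on the job owner's uid is one way to be less destructive, though on a shared node it would still hit that user's other running jobs.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical epilogue helper (an assumption for illustration only).
 * Removes regular files in `dir` whose names start with `prefix` and
 * that are owned by `owner`. Returns the number of files removed, or
 * -1 if the directory cannot be opened.
 */
int remove_stale_files(const char *dir, const char *prefix, uid_t owner)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int removed = 0;
    size_t plen = strlen(prefix);
    struct dirent *ent;

    while ((ent = readdir(d)) != NULL) {
        /* Skip anything that does not carry the expected prefix. */
        if (strncmp(ent->d_name, prefix, plen) != 0)
            continue;

        char path[4096];
        int n = snprintf(path, sizeof path, "%s/%s", dir, ent->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;               /* path too long, leave it alone */

        /* lstat() so that symlinks are never followed. */
        struct stat st;
        if (lstat(path, &st) != 0)
            continue;
        if (!S_ISREG(st.st_mode) || st.st_uid != owner)
            continue;               /* only the owner's regular files */

        if (unlink(path) == 0)
            removed++;
    }

    closedir(d);
    return removed;
}
```

Something like remove_stale_files("/dev/shm", "psm_shm", job_uid) would then target the orphaned files mentioned below, where job_uid is whatever uid the epilogue is told the job ran as.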

OMPI uses the MOMs to start its daemons, and all MPI procs are children of those daemons. So Torque knows what nodes were used, even if it doesn't know about the specific MPI procs. Not that this matters - like I said, the epilogue gets run on all nodes, used or not.

I'm pretty sure the unlink change is in the 1.4 series - not sure precisely when it was made.
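
In case it helps to see why unlinking early is safe, here is a minimal sketch of the general POSIX pattern: map a backing file, remove its name right away, and keep using the mapping. This is not the Open MPI code; the function name and the /tmp path template are made up for the example. In a real multi-process setup every process would have to map the file before the name is removed.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical demo, not taken from Open MPI.
 * Returns 0 if the mapping is still usable after the backing file's
 * name has been removed, -1 on any failure.
 */
int unlink_after_mmap_demo(void)
{
    char path[] = "/tmp/unlink_demo_XXXXXX";   /* assumed location */
    const size_t len = 4096;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    /* Give the backing file a size so it can be mapped. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    char *seg = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping survives the close */
    if (seg == MAP_FAILED) {
        unlink(path);
        return -1;
    }

    /* Remove the name immediately after attaching. If the process
     * dies from here on, no file is left behind. */
    if (unlink(path) != 0) {
        munmap(seg, len);
        return -1;
    }

    struct stat st;
    int name_gone = (stat(path, &st) != 0);

    /* The memory stays valid until the last mapping goes away. */
    strcpy(seg, "still mapped");
    int usable = (strcmp(seg, "still mapped") == 0);

    munmap(seg, len);
    return (name_gone && usable) ? 0 : -1;
}
```

The remaining window for an orphaned file is between creating the file and the unlink() call, which is much shorter than waiting until MPI_FINALIZE.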

On Apr 14, 2011, at 3:39 AM, Rushton Martin wrote:

> I notice from the developer archives that there was some discussion
> about when psm_shm... files ought to be unlinked from /dev/shm:
> ---------------------------------
> Subject: Re: [OMPI devel] System V Shared Memory for Open MPI: Request
> for Community Input and Testing
> From: Barrett, Brian W (bwbarre_at_[hidden])
> Date: 2010-06-11 12:53:50
> <snip>
> On Jun 11, 2010, at 5:10 AM, Jeff Squyres wrote:
>> On Jun 11, 2010, at 5:43 AM, Paul H. Hargrove wrote:
>>>> Interesting. Do you think this behavior of the linux kernel would
>>>> change if the file was unlink()ed after attach ?
>>> As Jeff pointed out, the file IS unlinked by Open MPI, presumably to
>>> ensure it is not left behind in case of abnormal termination.
>> I have to admit that I lied. :-(
>> Sam and I were talking on the phone yesterday about the shm_open()
>> stuff and to my chagrin, I discovered that the mmap'ed files are *not*
>> unlinked in OMPI until MPI_FINALIZE. I'm not actually sure why; I could
>> have sworn that we unlinked them after everyone mmap'ed them...
> The idea was one large memory segment for all processes and it wasn't
> unlinked after complete attach so that we could have spawned procs also
> use shmem (which never worked, of course). So I think we could unlink
> during init at this point..
> Brian
> -----------------------------------
> The following reply is also relevant:
> -----------------------------------
> Subject: Re: [OMPI devel] System V Shared Memory for Open MPI: Request
> for Community Input and Testing
> From: Jeff Squyres (jsquyres_at_[hidden])
> Date: 2010-06-11 13:06:37
> <snip>
> I could have sworn that we decided that long ago and added the unlink.
> Probably we *did* reach that conclusion long ago, but never actually got
> around to adding the unlink. Sam and I are still in that code area now;
> we might as well add the unlink while we're there.
> -----------------------------------
> Has this change been implemented, and if so in which version? We are
> seeing this behaviour with occasional orphaned files left behind, not a
> good situation on a diskless node. I noted the comment by Paul
> Hargrove:
> -----------------------------------
> While it is not "fair" for Open MPI to be lazy about its temporary
> resources in the case of normal termination, there will probably always
> be small windows of vulnerability to leakage if one dies in just the
> wrong case (eg a failed assertion between the shmat() and the
> shmctl(IPC_RMID)). On the bright side, it is worth noting that a
> properly maintained batch environment should include an epilogue that
> scrubs /tmp, /var/tmp, /usr/tmp, and any other shared writable
> location. Similarly, to prevent a very simple/obvious DOS it should be
> destroying any SysV IPC objects left over by the job.
> -----------------------------------
> I have not yet been able to get evidence of Torque actually running the
> epilogue scripts on sister nodes, something I am pursuing in the
> torqueusers mailing list. Briefly, OMPI seems to start the processes on
> remote nodes itself without running Torque's MOM, but it is the MOM that
> runs pro- and epilogues. Obviously this is not directly an OMPI
> problem, until as here there is an assumption made.
> Martin Rushton
> HPC System Manager, Weapons Technologies
> Tel: 01959 514777, Mobile: 07939 219057
> email: jmrushton_at_[hidden]
> QinetiQ - Delivering customer-focused solutions