
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] SM init failures
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-04-01 18:29:06


So everyone hates SYSV. Ok. :-)

Given that some of the problems we've been having with mmap have been
due to filesystem issues, should we just unlink() the file once all
processes have mapped it? I believe we didn't do that originally for
two reasons:

- leave it around for debugging purposes
- possibly supporting MPI-2 dynamics someday

We still don't support the sm BTL for dynamics, so why not unlink()?
(I'm probably forgetting something obvious...?)
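
For illustration only, here is a minimal sketch of the unlink()-after-map
idea in plain POSIX C. It is not the sm BTL code: the function name, the
/tmp path template, the segment size, and the pipe handshake that stands
in for "all processes have mapped it" are all assumptions made up for
this example. The parent creates and maps the backing file, a peer
attaches by name, and only after the peer reports that it has mapped does
the parent remove the name. The mappings stay valid after the unlink.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SEG_SIZE 4096   /* hypothetical segment size */

/* Returns the value the peer wrote into the segment, or -1 on error. */
int unlink_after_map_demo(void)
{
    char path[] = "/tmp/sm_demo_XXXXXX";   /* hypothetical backing file */
    int ready[2];                          /* peer -> parent: "I have mapped" */
    int go[2];                             /* parent -> peer: "name is gone" */
    char c = 'x';

    int fd = mkstemp(path);
    if (fd < 0) return -1;
    if (ftruncate(fd, SEG_SIZE) != 0 || pipe(ready) != 0 || pipe(go) != 0) {
        unlink(path);
        close(fd);
        return -1;
    }

    int *seg = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                             /* the mapping outlives the fd */
    if (seg == MAP_FAILED) { unlink(path); return -1; }

    pid_t pid = fork();
    if (pid < 0) { unlink(path); munmap(seg, SEG_SIZE); return -1; }

    if (pid == 0) {
        /* Peer: attach by name, as an independent process would. */
        close(ready[0]);
        close(go[1]);
        int cfd = open(path, O_RDWR);
        if (cfd < 0) _exit(1);
        int *cseg = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE,
                         MAP_SHARED, cfd, 0);
        close(cfd);
        if (cseg == MAP_FAILED) _exit(1);
        if (write(ready[1], &c, 1) != 1) _exit(1);  /* report: mapped */
        if (read(go[0], &c, 1) != 1) _exit(1);      /* wait for the unlink */
        cseg[0] = 42;                      /* write through the mapping */
        _exit(0);
    }

    close(ready[1]);
    close(go[0]);

    /* Wait until the peer has mapped, then drop the name either way. */
    int ok = (read(ready[0], &c, 1) == 1);
    unlink(path);
    assert(access(path, F_OK) != 0);       /* the name no longer exists */
    if (ok) ok = (write(go[1], &c, 1) == 1);
    close(ready[0]);
    close(go[1]);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid) ok = 0;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) ok = 0;

    int result = ok ? seg[0] : -1;         /* read what the peer wrote */
    munmap(seg, SEG_SIZE);
    return result;
}
```

Once the name is gone, the kernel frees the backing storage when the last
mapping goes away, even if every process is killed with kill -9. The cost
is the one noted above: no later process can attach by name, which is why
this only works while the sm BTL is not used for dynamics.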

On Apr 1, 2009, at 5:12 PM, Ashley Pittman wrote:

> On Tue, 2009-03-31 at 11:00 -0400, Jeff Squyres wrote:
> > On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote:
> > > System V shared memory used to be the main way to do shared memory
> > > on MPICH and from my (little) experience, this was truly painful:
> > > - Cleanup issues: does shmctl(IPC_RMID) solve _all_ cases? (even
> > >   kill -9?)
> > Indeed. The one saving grace here is that the cleanup issues
> > apparently can be solved on Linux with a special flag that indicates
> > "automatically remove this shmem when all processes attaching to it
> > have died." That was really the impetus for [re-]investigating sysv
> > shm. I, too, remember the sysv pain because we used it in LAM,
> > too...
>
> Unless there is something newer than IPC_RMID that I haven't heard of,
> this is far from a complete solution. Setting RMID causes the segment
> to be deleted when the attach count becomes zero, so it handles the
> kill -9 case. The downside is that once it's been set, no further
> processes can attach to the memory, so you have to leave a window
> during init in which any crash will leave the memory behind.
>
> I've always been of the opinion that mmapping shared files was a much
> more advanced solution.
>
> Ashley Pittman.

-- 
Jeff Squyres
Cisco Systems