
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] SM init failures
From: Sylvain Jeaugey (sylvain.jeaugey_at_[hidden])
Date: 2009-03-31 03:45:42

Sorry to continue off-topic, but going to System V shm would feel to me
like going back in time.

System V shared memory used to be the main way to do shared memory in
MPICH, and from my (little) experience it was truly painful:
  - Cleanup issues: does shmctl(IPC_RMID) solve _all_ cases? (even kill -9?)
  - Naming issues: shm segments are identified by 32-bit keys, potentially
causing conflicts between applications, or between layers of the same
application, on one node
  - Space issues: the total shm size on a system is bounded by
/proc/sys/kernel/shmmax, requiring admin configuration and causing conflicts
between MPI applications running on the same node

Mmap'ed files can have a descriptive name like <component : MPI>-<Layer
: Opal>-<Jobid>-<Rank>, preventing naming issues. On Linux, they can be
allocated in /dev/shm to avoid filesystem traffic, and the available space
is not constrained by shmmax.


On Mon, 30 Mar 2009, Tim Mattox wrote:

> I've been lurking on this conversation, and I am again left with the impression
> that the underlying shared memory configuration based on sharing a file
> is flawed. Why not use a System V shared memory segment without a
> backing file as I described in ticket #1320?
> On Mon, Mar 30, 2009 at 1:34 PM, George Bosilca <bosilca_at_[hidden]> wrote:
>> Then it looks like the safest solution is to use either the ftruncate or
>> the lseek method and then touch the first byte of all memory pages.
>> Unfortunately, I see two problems with this. First, there is a clear
>> performance hit on the startup time. And second, we will have to find a
>> pretty smart way to do this or we will completely break the memory affinity
>> stuff.
>>  george.
>> On Mar 30, 2009, at 13:24 , Iain Bason wrote:
>>> On Mar 30, 2009, at 12:05 PM, Jeff Squyres wrote:
>>>> But don't we need the whole area to be zero filled?
>>> It will be zero-filled on demand using the lseek/touch method.  However,
>>> the OS may not reserve space for the skipped pages or disk blocks.  Thus one
>>> could still get out of memory or file system full errors at arbitrary
>>> points.  Presumably one could also get segfaults from an mmap'ed segment
>>> whose pages couldn't be allocated when the demand came.
>>> Iain
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
> --
> Tim Mattox, Ph.D. -
> tmattox_at_[hidden] || timattox_at_[hidden]
> I'm a bright...