
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
From: Samuel K. Gutierrez (samuel_at_[hidden])
Date: 2010-06-10 10:59:52

On Jun 10, 2010, at 1:47 AM, Sylvain Jeaugey wrote:

> On Wed, 9 Jun 2010, Jeff Squyres wrote:
>> On Jun 9, 2010, at 3:26 PM, Samuel K. Gutierrez wrote:
>>> System V shared memory cleanup is a concern only if a process dies
>>> in between shmat and shmctl IPC_RMID. Shared memory segment cleanup
>>> should happen automagically in most cases, including abnormal
>>> process termination.
>> Umm... right. Duh. I knew that.
>> Really.
>> So -- we're good!
>> Let's open the discussion of making sysv the default on systems
>> that support the IPC_RMID behavior (which, AFAIK, is only Linux)...
> I'm sorry, but I think System V has many disadvantages over mmap.
> 1. As discussed before, cleaning is not as easy as for a file. It is
> a good thing to remove the shm segment after creation, but since
> problems often happen during shmget/shmat, there's still a high risk
> of leaving things behind.
> 2. There are limits in the kernel you need to grow (kernel.shmall,
> kernel.shmmax).

I agree that this is a disadvantage, but changing shmall and shmmax
limits is *only* as painful as having a system admin change a few
settings (okay, it's painful ;-) ).
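
(For the record, on Linux that usually means bumping a couple of sysctl
values, e.g. in /etc/sysctl.conf -- the numbers below are placeholders for
illustration, not a recommendation:

  kernel.shmmax = 68719476736
  kernel.shmall = 4294967296

shmmax is in bytes and shmall is in pages, so the right values depend on
the machine.)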

> On most Linux distributions, shmmax is 32MB, which does not permit
> the sysv mechanism to work. Mmapped files are unlimited.

Not necessarily true. If a user *really* wanted to use sysv and their
system's shmmax limit was 32MB, they could just add -mca
mpool_sm_min_size 33550000 and everything would work properly. I do
understand, however, that this may not be ideal and may have
performance implications.
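
For example, something along these lines (./my_app is just a stand-in for
whatever you run -- the point is simply to keep the requested backing size
just under the 32MB cap):

  mpirun -np 4 -mca mpool_sm_min_size 33550000 ./my_app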

Based on this, I'm leaning towards the default behavior that we
currently have in the trunk:

- sysv disabled by default
- use mmap, unless sysv is explicitly requested by the user

> 3. Each shm segment is identified by a 32 bit integer. This
> namespace is small (and non-intuitive, as opposed to a file name),
> and the probability of a collision is not zero, especially when you
> start creating multiple shared memory segments (for collectives, one-
> sided operations, ...).

I'm not sure if collisions are a problem. I'm using
shmget(IPC_PRIVATE), so I'm guessing once I've asked for more than ~
2^16 keys, things will fail.
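
In case it helps make the cleanup discussion above concrete, here is
roughly the pattern the sysv path follows -- a stripped-down sketch, not
the actual component code:

  /* Stripped-down sketch of the sysv setup -- illustration only. */
  #include <stdio.h>
  #include <sys/ipc.h>
  #include <sys/shm.h>

  int main(void)
  {
      /* IPC_PRIVATE: the kernel picks the key, so jobs can't collide on one. */
      int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
      if (shmid < 0) { perror("shmget"); return 1; }

      /* The window Sylvain mentions: dying anywhere between the shmget above
       * and the IPC_RMID below leaves the segment behind. */
      void *addr = shmat(shmid, NULL, 0);
      if ((void *)-1 == addr) { perror("shmat"); return 1; }

      /* Mark for removal right away.  On Linux the segment stays attachable
       * and is destroyed once the last attachment goes away, even on abnormal
       * termination -- that's the Linux-only behavior Jeff refers to above. */
      if (shmctl(shmid, IPC_RMID, NULL) < 0) { perror("shmctl"); return 1; }

      /* ... hand addr to the rest of the sm machinery ... */

      shmdt(addr);
      return 0;
  }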

> So, I'm a bit reluctant to work with System V mechanisms again. I
> don't think there is a *real* reason for System V to be faster than
> mmap, since it should just be memory. I'd rather find out why mmap
> is slower.

Jeff and I talked, and we are going to hack something together that
uses shm_open and friends and incorporates more sophisticated fallback
mechanisms if a particular component fails initialization. Once we
are done with that work, would you be willing to conduct another
similar performance study that incorporates all sm mechanisms?
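
The general shape of what we have in mind is something like the following
(again, just a sketch of the pattern, not what will actually land in the
trunk):

  /* Sketch of a shm_open-backed segment -- illustration only. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <unistd.h>

  int main(void)
  {
      const char *name = "/ompi_sm_sketch";  /* name made up for the example */
      size_t size = 4096;

      int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
      if (fd < 0) { perror("shm_open"); return 1; }
      if (ftruncate(fd, (off_t)size) < 0) { perror("ftruncate"); return 1; }

      void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      if (MAP_FAILED == addr) { perror("mmap"); return 1; }

      /* Once everyone who needs the segment has it open, the name can be
       * unlinked; the object lives until the last mapping goes away, so we
       * get mmap-style cleanup without leaving a file in a filesystem. */
      shm_unlink(name);
      close(fd);

      /* ... use addr as the shared-memory backing ... */

      munmap(addr, size);
      return 0;
  }

(Needs -lrt on older glibc.)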


Samuel K. Gutierrez
Los Alamos National Laboratory
> Sylvain