On Jun 10, 2010, at 1:47 AM, Sylvain Jeaugey wrote:

On Wed, 9 Jun 2010, Jeff Squyres wrote:

On Jun 9, 2010, at 3:26 PM, Samuel K. Gutierrez wrote:

System V shared memory cleanup is a concern only if a process dies in
between shmat and shmctl IPC_RMID.  Shared memory segment cleanup
should happen automagically in most cases, including abnormal process termination.

Umm... right.  Duh.  I knew that.


So -- we're good!

Let's open the discussion of making sysv the default on systems that support the IPC_RMID behavior (which, AFAIK, is only Linux)...
I'm sorry, but I think System V has many disadvantages over mmap.

1. As discussed before, cleanup is not as easy as for a file. It is a good thing to remove the shm segment after creation, but since problems often happen during shmget/shmat, there's still a high risk of leaving things behind.

2. There are limits in the kernel you need to grow (kernel.shmall, kernel.shmmax).

I agree that this is a disadvantage, but changing shmall and shmmax limits is *only* as painful as having a system admin change a few settings (okay, it's painful ;-) ).

On most Linux distributions, shmmax is 32MB, which does not permit the sysv mechanism to work. Mmapped files are unlimited.

Not necessarily true.  If a user *really* wanted to use sysv and their system's shmmax limit was 32MB, they could just add -mca mpool_sm_min_size 33550000 and everything would work properly.  I do understand, however, that this may not be ideal and may have performance implications.
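
Whether a given system hits the limit discussed here can be checked before choosing a mechanism. The sketch below is an assumption-laden illustration, not something Open MPI does: `read_kernel_limit` is a hypothetical helper that parses a single decimal value from a file such as `/proc/sys/kernel/shmmax` on Linux.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one unsigned decimal value from `path`.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int read_kernel_limit(const char *path, unsigned long long *out)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }

    char buf[64];
    if (fgets(buf, sizeof buf, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    /* strtoull is used because shmmax can exceed the range of long
     * on systems where the limit has been raised. */
    errno = 0;
    char *end = NULL;
    unsigned long long value = strtoull(buf, &end, 10);
    if (errno != 0 || end == buf) {
        return -1;
    }

    *out = value;
    return 0;
}
```

On Linux, `read_kernel_limit("/proc/sys/kernel/shmmax", &max)` would give the largest single segment size in bytes. The same helper works for `/proc/sys/kernel/shmall`, which is counted in pages rather than bytes.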

Based on this, I'm leaning towards the default behavior that we currently have in the trunk:

- sysv disabled by default
- use mmap, unless sysv is explicitly requested by the user

3. Each shm segment is identified by a 32-bit integer. This namespace is small (and non-intuitive, as opposed to a file name), and the probability of a collision is not zero, especially when you start creating multiple shared memory segments (for collectives, one-sided operations, ...).

I'm not sure if collisions are a problem.  I'm using shmget(IPC_PRIVATE), so I'm guessing once I've asked for more than ~ 2^16 keys, things will fail.

So, I'm a bit reluctant to work with System V mechanisms again. I don't think there is a *real* reason for System V to be faster than mmap, since it should just be memory. I'd rather find out why mmap is slower.
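
For comparison, the file-backed mmap approach mentioned in this thread can be sketched as follows. This is a minimal single-process illustration, not the Open MPI sm implementation; `map_backing_file` is a hypothetical helper name.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a backing file from `path_template`
 * (which must end in "XXXXXX", as mkstemp requires), size it, and
 * map it MAP_SHARED. Returns the mapping, or NULL on failure. */
void *map_backing_file(char *path_template, size_t size)
{
    int fd = mkstemp(path_template);
    if (fd < 0) {
        return NULL;
    }

    /* Grow the empty file to the requested size before mapping it. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        unlink(path_template);
        return NULL;
    }

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping stays valid after close and unlink. In a real
     * multi-process setup the unlink would have to wait until every
     * peer has opened the file; here it is done right away. */
    close(fd);
    unlink(path_template);

    return addr == MAP_FAILED ? NULL : addr;
}
```

The name here is an ordinary path, so a leftover file can be found and removed with normal file tools, which is the cleanup advantage raised in point 1.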

Jeff and I talked, and we are going to hack something together that uses shm_open and friends and incorporates more sophisticated fallback mechanisms if a particular component fails initialization.  Once we are done with that work, would you be willing to conduct another similar performance study that incorporates all sm mechanisms?


Samuel K. Gutierrez
Los Alamos National Laboratory
