Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-06-09 13:57:24


Iiinnnnteresting.

This, of course, raises the question of whether we should use sysv shmem or not. It seems like the order of preference should be:

- sysv
- mmap in a tmpfs
- mmap in a "regular" (but not networked) fs

The big downer, of course, is the whole "what happens if the job crashes?" issue. With mmap, an rm -rf will clean up any leftover files (although looking for them in /dev/shm might be a bit non-obvious). With sysv, you have to use the ipc* commands (ipcs to find them, ipcrm to remove them) to look for and whack any orphan shmem segments.
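For the mmap case, one way to avoid leftover files is to unlink the backing file as soon as it has been mapped. The following is a minimal single-process sketch of that idea, not what the sm component currently does; the function name map_unlinked_file and the path template are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a backing file from a mkstemp-style template
 * (must be writable and end in "XXXXXX"), map it shared, then unlink it.
 * Returns the mapped address, or NULL on failure. */
static void *map_unlinked_file(char *path_template, size_t size)
{
    int fd = mkstemp(path_template);
    if (fd == -1) {
        perror("mkstemp");
        return NULL;
    }

    /* Grow the empty file to the requested size before mapping it. */
    if (ftruncate(fd, (off_t) size) == -1) {
        perror("ftruncate");
        close(fd);
        unlink(path_template);
        return NULL;
    }

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (addr == MAP_FAILED) {
        perror("mmap");
        unlink(path_template);
        return NULL;
    }

    /* Remove the name now. The kernel keeps the pages alive until the last
     * mapping goes away, so nothing is left on disk if the process dies.
     * Assumption: in a real multi-process job, every peer would have to
     * open and map the file before this unlink happens. */
    unlink(path_template);
    return addr;
}
```

A caller would pass something like a writable buffer holding "/dev/shm/sketch_XXXXXX" and later release the region with munmap(). The catch is the window between file creation and the unlink: a crash in that window still leaves a file behind.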

Right now, the orted/hnp won't clean up any leftover sysv segments. This seems like something we should fix.

But even with that, if the orted/hnp is killed, sysv segments can get left over. Hrm.
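One commonly used way to narrow the orphan window for sysv is to mark the segment for removal right after attaching, so the kernel destroys it when the last attachment goes away, whether the process exits cleanly or is killed. The following is a minimal single-process sketch under that assumption, not the component's actual code; the function name create_self_cleaning_segment is made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: create a private System V segment, attach it, and
 * immediately mark it for removal. Returns the attached address (and the
 * segment id through shmid_out, if non-NULL), or NULL on failure. */
static void *create_self_cleaning_segment(size_t size, int *shmid_out)
{
    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | IPC_EXCL | 0600);
    if (shmid == -1) {
        perror("shmget");
        return NULL;
    }

    void *addr = shmat(shmid, NULL, 0);
    if (addr == (void *) -1) {
        perror("shmat");
        shmctl(shmid, IPC_RMID, NULL); /* do not leave an orphan behind */
        return NULL;
    }

    /* Mark for destruction. The segment stays usable through existing
     * attachments and is freed once the last one detaches or exits.
     * Assumption: peers must attach before this call; whether a marked
     * segment can still be attached afterwards is system-specific
     * (Linux allows it, other systems may not). */
    if (shmctl(shmid, IPC_RMID, NULL) == -1) {
        perror("shmctl(IPC_RMID)");
        shmdt(addr);
        return NULL;
    }

    if (shmid_out != NULL) {
        *shmid_out = shmid;
    }
    return addr;
}
```

This does not remove the need for orted/hnp cleanup: a crash between shmget() and the IPC_RMID call still leaves a segment that has to be found with ipcs and removed with ipcrm.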

On Jun 9, 2010, at 11:58 AM, Sylvain Jeaugey wrote:

> As stated at the conf call, I did some performance testing on a 32-core
> node.
>
> So, here is a graph showing 500 timings of an allreduce operation (repeated
> 15,000 times for good timing) with sysv, mmap on /dev/shm and mmap on
> /tmp.
>
> What it shows:
> - sysv has the best performance;
> - having the mmap file in /dev/shm is very close to sysv. We only have
> +0.1 us for a complete allreduce operation, but it seems stable. The noise
> is identical to sysv (must be OS noise);
> - having the mmap file in /tmp (ext3) decreases performance (+0.4 us
> compared to /dev/shm) and seems prone to some "other" noise.
>
> Warning: the graph does not start at 0.
>
> Sylvain
>
> On Tue, 27 Apr 2010, Samuel K. Gutierrez wrote:
>
> > Hi,
> >
> > With Jeff and Ralph's help, I have completed a System V shared memory
> > component for Open MPI. I have conducted some preliminary tests on our
> > systems, but would like to get test results from a broader audience.
> >
> > As it stands, mmap is the default, but System V shared memory can be activated
> > using: -mca mpi_common_sm sysv
> >
> > Repository:
> > http://bitbucket.org/samuelkgutierrez/ompi_sysv_sm
> >
> > Input is greatly appreciated!
> >
> > --
> > Samuel K. Gutierrez
> > Los Alamos National Laboratory
> >
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >
>
> <sm-compared.png>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/