Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Very poor performance with btl sm on twin nehalem servers with Mellanox ConnectX installed
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-05-17 17:02:42


How's this?

    http://www.open-mpi.org/faq/?category=sm#poor-sm-btl-performance

What's the advantage of /dev/shm? (I don't know anything about /dev/shm)
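
One way to see whether this applies on a given node is to check which filesystem the session directory lives on. On Linux, /dev/shm is normally a tmpfs mount, so files there are backed by memory rather than by a disk or a network server. The following is only a minimal sketch, not anything from Open MPI itself: fs_type_for, is_network_fs and the NETWORK_FS set are hypothetical names made up for illustration, and it assumes a Linux-style /proc/mounts table.

```python
import os

# Illustrative, non-exhaustive set of network filesystem types (an assumption).
NETWORK_FS = {"nfs", "nfs4", "cifs", "lustre", "gpfs"}


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount containing `path`.

    `mounts_text` is text in /proc/mounts format, one mount per line:
        device mountpoint fstype options dump pass
    Escaped characters in mount points (e.g. \\040 for a space) are
    not decoded in this sketch.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fstype = os.path.normpath(fields[1]), fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest matching mount point; on a tie the
            # later entry wins, since it shadows the earlier mount.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fstype
    return best_type


def is_network_fs(path, mounts_text):
    """True if `path` sits on one of the filesystem types in NETWORK_FS."""
    return fs_type_for(path, mounts_text) in NETWORK_FS
```

On a Linux node this could be called as fs_type_for("/tmp", open("/proc/mounts").read()). If the answer is a network filesystem, the fix reported in this thread was to move the session directory with the orte_tmpdir_base MCA parameter, for example "mpirun --mca orte_tmpdir_base /dev/shm ...".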

On May 17, 2010, at 4:08 AM, Sylvain Jeaugey wrote:

> I agree with Paul that a FAQ update on this subject would be great.
> /dev/shm seems like a good place to put the temporary files (when
> available, of course).
>
> Putting files in /dev/shm also showed better performance on our systems,
> even with /tmp on a local disk.
>
> Sylvain
>
> On Sun, 16 May 2010, Paul H. Hargrove wrote:
>
> > If I google "ompi sm btl performance" the top match is
> > http://www.open-mpi.org/faq/?category=sm
> >
> > I scanned the entire page from top to bottom and don't see any questions of
> > the form
> > Why is SM performance slower than ...?
> >
> > The words "NFS", "network", "file system" or "filesystem" appear nowhere on
> > the page. The closest I could find is
> >> 7. Where is the file that sm will mmap in?
> >>
> >> The file will be in the OMPI session directory, which is typically
> >> something like /tmp/openmpi-sessions-myusername_at_mynodename* . The file
> >> itself will have the name shared_mem_pool.mynodename. For example, the full
> >> path could be
> >> /tmp/openmpi-sessions-myusername_at_node0_0/1543/1/shared_mem_pool.node0.
> >>
> >> To place the session directory in a non-default location, use the MCA
> >> parameter orte_tmpdir_base.
> > which says nothing about where one should or should not place the session
> > directory.
> >
> > Not having read the entire FAQ from start to end, I will not contradict
> > Ralph's claim that the "your SM performance might suck if you put the session
> > directory on a remote filesystem" FAQ entry does exist, but I will assert
> > that I did not find it in the SM section of the FAQ. I tried google on "ompi
> > session directory" and "ompi orte_tmpdir_base" and still didn't find whatever
> > entry Ralph is talking about. So, I think the average user with no clue
> > about the relationship between the SM BTL and the session directory would
> > need some help finding it. Therefore, I still feel an FAQ entry in the SM
> > category is warranted, even if it just references whatever entry Ralph is
> > referring to.
> >
> > -Paul
> >
> > Ralph Castain wrote:
> >> We have had a FAQ on this for a long time...problem is, nobody reads it :-/
> >>
> >> Glad you found the problem!
> >>
> >> On May 14, 2010, at 3:15 PM, Paul H. Hargrove wrote:
> >>
> >>
> >>> Oskar Enoksson wrote:
> >>>
> >>>> Christopher Samuel wrote:
> >>>>
> >>>>> On 13/05/10 20:56, Oskar Enoksson wrote:
> >>>>>
> >>>>>
> >>>>>> The problem is that I get very bad performance unless I
> >>>>>> explicitly exclude the "sm" btl and I can't figure out why.
> >>>>>>
> >>>>> Recently someone reported issues which were traced back to
> >>>>> the fact that the files that sm uses for mmap() were in a
> >>>>> /tmp which was NFS mounted; changing the location where their
> >>>>> files were kept to another directory with the orte_tmpdir_base
> >>>>> MCA parameter fixed that issue for them.
> >>>>>
> >>>>> Could it be similar for yourself ?
> >>>>>
> >>>>> cheers,
> >>>>> Chris
> >>>>>
> >>>> That was exactly right. As you guessed, these are diskless nodes that
> >>>> mount the root filesystem over NFS.
> >>>>
> >>>> Setting orte_tmpdir_base to /dev/shm and btl_sm_num_fifos=9 and then
> >>>> running mpi_stress on eight cores measures speeds of 1650MB/s for
> >>>> 1MB messages and 1600MB/s for 10kB messages.
> >>>>
> >>>> Thanks!
> >>>> /Oskar
> >>>>
> >>>> _______________________________________________
> >>>> devel mailing list
> >>>> devel_at_[hidden]
> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>>>
> >>> Sounds like a new FAQ entry is warranted.
> >>>
> >>> -Paul
> >>>
> >>> --
> >>> Paul H. Hargrove PHHargrove_at_[hidden]
> >>> Future Technologies Group
> >>> HPC Research Department Tel: +1-510-495-2352
> >>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> >>>
> >>
> >
>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/