Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Very poor performance with btl sm on twin nehalem servers with Mellanox ConnectX installed
From: Sylvain Jeaugey (sylvain.jeaugey_at_[hidden])
Date: 2010-05-17 04:08:59


I agree with Paul that a FAQ update on this subject would be great.
/dev/shm seems like a good place to put the temporary files (when
available, of course).

Putting files in /dev/shm also showed better performance on our systems,
even with /tmp on a local disk.
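
For anyone who wants to check their own nodes, here is a rough sketch of the idea in Python: look up which filesystem a candidate directory lives on, skip anything that is network-mounted, and pass the winner to mpirun through orte_tmpdir_base. The helper names, the list of network filesystem types, and the ./mpi_stress program path are all illustrative assumptions, not part of Open MPI. The parser also ignores the octal escapes that /proc/mounts uses for spaces in mount points.

```python
import os

# Filesystem types treated as "remote" here. Illustrative, not exhaustive.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "lustre", "gpfs"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts_text):
    """Return the fs type of the longest mount point containing path."""
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for mount_point, fs_type in parse_mounts(mounts_text):
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # ">=" lets a later entry for the same mount point win,
            # since later mounts shadow earlier ones.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def pick_tmpdir_base(candidates, mounts_text):
    """Return the first candidate that is not on a network filesystem."""
    for candidate in candidates:
        fs_type = fs_type_for(candidate, mounts_text)
        if fs_type is not None and fs_type not in NETWORK_FS_TYPES:
            return candidate
    return None


def mpirun_args(tmpdir_base, program, nprocs):
    """Build an mpirun command line that sets orte_tmpdir_base."""
    return ["mpirun", "--mca", "orte_tmpdir_base", tmpdir_base,
            "-np", str(nprocs), program]


# Made-up mount table resembling a diskless node with an NFS root.
SAMPLE_MOUNTS = (
    "server:/export/root / nfs rw 0 0\n"
    "tmpfs /dev/shm tmpfs rw 0 0\n"
)

base = pick_tmpdir_base(["/tmp", "/dev/shm"], SAMPLE_MOUNTS)
if base is not None:
    print(" ".join(mpirun_args(base, "./mpi_stress", 8)))
```

On a real node you would read /proc/mounts instead of the sample string, and probably also check that the chosen directory exists and is writable before using it.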

Sylvain

On Sun, 16 May 2010, Paul H. Hargrove wrote:

> If I google "ompi sm btl performance" the top match is
> http://www.open-mpi.org/faq/?category=sm
>
> I scanned the entire page from top to bottom and don't see any questions of
> the form
> Why is SM performance slower than ...?
>
> The words "NFS", "network", "file system" or "filesystem" appear nowhere on
> the page. The closest I could find is
>> 7. Where is the file that sm will mmap in?
>>
>> The file will be in the OMPI session directory, which is typically
>> something like /tmp/openmpi-sessions-myusername_at_mynodename* . The file
>> itself will have the name shared_mem_pool.mynodename. For example, the full
>> path could be
>> /tmp/openmpi-sessions-myusername_at_node0_0/1543/1/shared_mem_pool.node0.
>>
>> To place the session directory in a non-default location, use the MCA
>> parameter orte_tmpdir_base.
> which says nothing about where one should or should not place the session
> directory.
>
> Not having read the entire FAQ from start to end, I will not contradict
> Ralph's claim that the "your SM performance might suck if you put the session
> directory on a remote filesystem" FAQ entry does exist, but I will assert
> that I did not find it in the SM section of the FAQ. I tried google on "ompi
> session directory" and "ompi orte_tmpdir_base" and still didn't find whatever
> entry Ralph is talking about. So, I think the average user with no clue
> about the relationship between the SM BTL and the session directory would
> need some help finding it. Therefore, I still feel an FAQ entry in the SM
> category is warranted, even if it just references whatever entry Ralph is
> referring to.
>
> -Paul
>
> Ralph Castain wrote:
>> We have had a FAQ on this for a long time...problem is, nobody reads it :-/
>>
>> Glad you found the problem!
>>
>> On May 14, 2010, at 3:15 PM, Paul H. Hargrove wrote:
>>
>>
>>> Oskar Enoksson wrote:
>>>
>>>> Christopher Samuel wrote:
>>>>
>>>>> On 13/05/10 20:56, Oskar Enoksson wrote:
>>>>>
>>>>>
>>>>>> The problem is that I get very bad performance unless I
>>>>>> explicitly exclude the "sm" btl and I can't figure out why.
>>>>>>
>>>>> Recently someone reported issues which were traced back to
>>>>> the fact that the files that sm uses for mmap() were in a
>>>>> /tmp which was NFS mounted; changing the location where their
>>>>> files were kept to another directory with the orte_tmpdir_base
>>>>> MCA parameter fixed that issue for them.
>>>>>
>>>>> Could it be similar for yourself ?
>>>>>
>>>>> cheers,
>>>>> Chris
>>>>>
>>>> That was exactly right. As you guessed, these are diskless nodes that
>>>> mount the root filesystem over NFS.
>>>>
>>>> Setting orte_tmpdir_base to /dev/shm and btl_sm_num_fifos=9 and then
>>>> running mpi_stress on eight cores measures speeds of 1650MB/s for
>>>> 1MB messages and 1600MB/s for 10kB messages.
>>>>
>>>> Thanks!
>>>> /Oskar
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>> Sounds like a new FAQ entry is warranted.
>>>
>>> -Paul
>>>
>>> --
>>> Paul H. Hargrove PHHargrove_at_[hidden]
>>> Future Technologies Group
>>> HPC Research Department Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>>
>>
>
>
> --
> Paul H. Hargrove PHHargrove_at_[hidden]
> Future Technologies Group Tel: +1-510-495-2352
> HPC Research Department Fax: +1-510-486-6900
> Lawrence Berkeley National Laboratory