Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Very poor performance with btl sm on twin nehalem servers with Mellanox ConnectX installed
From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2010-05-14 17:15:53


Oskar Enoksson wrote:
> Christopher Samuel wrote:
>
>> Subject: Re: [OMPI devel] Very poor performance with btl sm on twin
>> nehalem servers with Mellanox ConnectX installed
>> To: devel_at_[hidden]
>> Message-ID:
>> <D45958078CD65C429557B4C5F492B6A60770E51F_at_[hidden]>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> On 13/05/10 20:56, Oskar Enoksson wrote:
>>
>>
>>
>>> The problem is that I get very bad performance unless I
>>> explicitly exclude the "sm" btl and I can't figure out why.
>>>
>>>
>> Recently someone reported issues which were traced back to
>> the fact that the files that sm uses for mmap() were in a
>> /tmp which was NFS mounted; changing the location where their
>> files were kept to another directory with the orte_tmpdir_base
>> MCA parameter fixed that issue for them.
>>
>> Could it be similar for yourself?
>>
>> cheers,
>> Chris
>>
>>
> That was exactly right. As you guessed, these are diskless nodes that
> mount the root filesystem over NFS.
>
> Setting orte_tmpdir_base to /dev/shm and btl_sm_num_fifos=9, then
> running mpi_stress on eight cores, gives 1650 MB/s for 1 MB messages
> and 1600 MB/s for 10 kB messages.
>
> Thanks!
> /Oskar
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>

Sounds like a new FAQ entry is warranted.
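For anyone who finds this thread later, here is a minimal sketch of the
workaround discussed above. It assumes a local tmpfs is mounted at /dev/shm
on the compute nodes and uses the parameter values Oskar reported; the
mpi_stress invocation and process count are just illustrative:

    # Put the sm btl's mmap backing files on local tmpfs instead of the
    # NFS-mounted /tmp, and use the FIFO count from this thread:
    mpirun --mca orte_tmpdir_base /dev/shm \
           --mca btl_sm_num_fifos 9 \
           -np 8 ./mpi_stress

    # The same MCA parameters can also be set via environment variables:
    export OMPI_MCA_orte_tmpdir_base=/dev/shm
    export OMPI_MCA_btl_sm_num_fifos=9
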

-Paul

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900