Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Very poor performance with btl sm on twin nehalem servers with Mellanox ConnectX installed
From: Oskar Enoksson (enok_at_[hidden])
Date: 2010-05-14 17:15:57


Christopher Samuel wrote:
>
> On 13/05/10 20:56, Oskar Enoksson wrote:
>
>> The problem is that I get very bad performance unless I
>> explicitly exclude the "sm" btl and I can't figure out why.
>>
> Recently someone reported issues which were traced back to
> the fact that the files that sm uses for mmap() were in a
> /tmp which was NFS mounted; changing the location where their
> files were kept to another directory with the orte_tmpdir_base
> MCA parameter fixed that issue for them.
>
> Could it be similar for yourself?
>
> cheers,
> Chris
>
That was exactly right. As you guessed, these are diskless nodes that
mount the root filesystem over NFS, so /tmp was NFS-backed.

Setting orte_tmpdir_base to /dev/shm and btl_sm_num_fifos to 9 and then
running mpi_stress on eight cores gives about 1650 MB/s for 1 MB
messages and 1600 MB/s for 10 kB messages.
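
For the archives, the resulting invocation is along these lines (the
process count and the mpi_stress path here are illustrative, not my
exact command line):

    mpirun -np 8 \
        --mca orte_tmpdir_base /dev/shm \
        --mca btl_sm_num_fifos 9 \
        ./mpi_stress

The same MCA parameters can also be put in openmpi-mca-params.conf (or
$HOME/.openmpi/mca-params.conf) so they apply to every job without
having to remember the flags.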

Thanks!
/Oskar