Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] poor btl sm latency
From: Jeffrey Squyres (jsquyres_at_[hidden])
Date: 2012-03-02 08:59:41


Hah! I just saw your ticket about how --with-hwloc=/path/to/install is broken in 1.5.5. So -- let me go look into that...

On Mar 2, 2012, at 8:58 AM, Jeffrey Squyres wrote:

> Ok. Good that there's no oversubscription bug, at least. :-)
>
> Did you see my off-list mail to you yesterday about building with an external copy of hwloc 1.4 to see if that helps?
>
>
> On Mar 2, 2012, at 8:26 AM, Matthias Jurenz wrote:
>
>> To exclude a possible bug within the LSF component, I rebuilt Open MPI without
>> support for LSF (--without-lsf).
>>
>> -> It makes no difference - the latency is still bad: ~1.1us.
>>
>> Matthias
>>
>> On Friday 02 March 2012 13:50:13 Matthias Jurenz wrote:
>>> SORRY, it was obviously a big mistake by me. :-(
>>>
>>> Open MPI 1.5.5 was built with LSF support, so when starting an LSF job it's
>>> necessary to request at least as many tasks/cores as the subsequent mpirun
>>> command uses. That was not the case - I forgot bsub's '-n' option to
>>> specify the number of tasks, so only *one* task/core was requested.
>>>
>>> Open MPI 1.4.5 was built *without* LSF support, so the supposed misbehavior
>>> could not happen with it.
>>>
>>> In short, there is no bug in Open MPI 1.5.x regarding the detection of
>>> oversubscription. Sorry for any confusion!
>>>
>>> Matthias
>>>
>>> On Tuesday 28 February 2012 13:36:56 Matthias Jurenz wrote:
>>>> When using Open MPI v1.4.5 I get ~1.1us. That's the same result as I get
>>>> with Open MPI v1.5.x using mpi_yield_when_idle=0.
>>>> So I think there is a bug in Open MPI (v1.5.4 and v1.5.5rc2) regarding
>>>> the automatic performance mode selection.
>>>>
>>>> When enabling the degraded performance mode for Open MPI 1.4.5
>>>> (mpi_yield_when_idle=1) I get ~1.8us latencies.
>>>>
>>>> Matthias
>>>>
>>>> On Tuesday 28 February 2012 06:20:28 Christopher Samuel wrote:
>>>>> On 13/02/12 22:11, Matthias Jurenz wrote:
>>>>>> Do you have any idea? Please help!
>>>>>
>>>>> Do you see the same bad latency in the old branch (1.4.5) ?
>>>>>
>>>>> cheers,
>>>>> Chris
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/