Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] kernel 2.6.23 vs 2.6.24 - communication/wait times
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-04-06 15:54:27


Sorry for the delay; I just replied on the users list. I think the first thing to do is to establish baseline networking performance and see whether it is out of whack. If the underlying network is bad, then MPI performance will also be bad.
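
As a rough illustration of what "baseline" could mean here, the following is a minimal sketch of a TCP ping-pong test that involves no MPI at all. It is only an assumption about how one might measure this, not a replacement for a dedicated network benchmark. The function names (`echo_server`, `pingpong`, `loopback_baseline`), the message sizes, and the iteration count are all made up for the example. It runs over loopback by default; pointing `pingpong` at an echo server on another node would exercise the Ethernet path instead.

```python
import socket
import threading
import time


def echo_server(listener):
    """Accept one connection and echo every byte back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            conn.sendall(data)


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock (TCP may deliver a message in pieces)."""
    chunks = []
    remaining = nbytes
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def pingpong(host, port, msg_size=64, iterations=1000):
    """Return the mean round-trip time in seconds for msg_size-byte messages."""
    payload = b"x" * msg_size
    with socket.create_connection((host, port)) as sock:
        # Disable Nagle's algorithm so small messages are sent immediately.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(iterations):
            sock.sendall(payload)
            recv_exact(sock, msg_size)
        elapsed = time.perf_counter() - start
    return elapsed / iterations


def loopback_baseline(msg_size=64, iterations=1000):
    """Start a local echo server on a free port and time round trips to it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    server.start()
    try:
        return pingpong("127.0.0.1", port, msg_size, iterations)
    finally:
        server.join(timeout=5)
        listener.close()


if __name__ == "__main__":
    # Hypothetical message sizes; choose ones that match the application.
    for size in (64, 1024, 16384):
        rtt = loopback_baseline(size)
        print(f"{size:6d} bytes: mean round trip {rtt * 1e6:.1f} us")
```

Running the same script on the 2.6.23 and 2.6.24 kernels and comparing the numbers would show whether the slowdown is already visible below the MPI layer.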

On Apr 6, 2010, at 11:51 AM, Oliver Geisler wrote:

> On 4/6/2010 10:11 AM, Rainer Keller wrote:
> > Hello Oliver,
> > Hmm, this is really a teaser...
> > I haven't seen such drastic behavior, and haven't read of any on the list.
> >
> > One thing that might interfere, however, is process binding.
> > Could you make sure that processes are not bound to cores (default in 1.4.1)
> > by running with mpirun --bind-to-none?
> >
>
> I have tried version 1.4.1. With the default settings I watched processes
> switching from core to core in "top" (with "f" + "j"). Then I tried
> --bind-to-core and, explicitly, --bind-to-none. All gave the same result:
> ~20% CPU wait and much longer overall computation times.
>
> Thanks for the idea ...
> Every input is helpful.
>
> Oli
>
>
> > Just an idea...
> >
> > Regards,
> > Rainer
> >
> > On Tuesday 06 April 2010 10:07:35 am Oliver Geisler wrote:
> >> Hello Devel-List,
> >>
> >> I am at a bit of a loss on this matter. I already posted to the
> >> users list; in case you don't read it, I am posting here as well.
> >>
> >> This is the original posting:
> >>
> >> http://www.open-mpi.org/community/lists/users/2010/03/12474.php
> >>
> >> Short:
> >> Switching from kernel 2.6.23 to 2.6.24 (and up), using Open MPI 1.2.7-rc2
> >> (outdated, I know, but it is in Debian stable, and 1.4.1 gives the same
> >> results), increases communication times between processes (essentially
> >> between one master and several slave processes). This happens regardless
> >> of whether the processes are local only or communicate over Ethernet.
> >>
> >> Has anybody seen such behavior?
> >>
> >> Any ideas what I should test for?
> >>
> >> What additional information should I provide for you?
> >>
> >> Thanks for your time
> >>
> >> oli
> >>
> >
>
>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/