Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] kernel 2.6.23 vs 2.6.24 - communication/wait times
From: Oliver Geisler (openmpi_at_[hidden])
Date: 2010-04-06 11:51:44


On 4/6/2010 10:11 AM, Rainer Keller wrote:
> Hello Oliver,
> Hmm, this is a real brain-teaser...
> I haven't seen such drastic behavior, and haven't read of any such reports on the list.
>
> One thing that might interfere, however, is process binding.
> Could you make sure that processes are not bound to cores (the default in 1.4.1)
> by running with mpirun --bind-to-none?
>

I have tried version 1.4.1. With the default settings I watched
processes switching from core to core in "top" (enabling the
last-used-CPU column with "f" + "j"). Then I tried --bind-to-core and
explicitly --bind-to-none, all with the same result: ~20% CPU wait
time and much longer overall computation times.
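For reference, the runs looked roughly like this (the program name is
a placeholder, not our actual binary):

  mpirun -np 8 ./solver                    # default settings
  mpirun -np 8 --bind-to-core ./solver     # pin each rank to one core
  mpirun -np 8 --bind-to-none ./solver     # explicitly disable binding

If I read the 1.4.1 man page correctly, adding --report-bindings should
also print the binding each rank actually got, which might help rule
out a binding problem.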

Thanks for the idea ...
Every bit of input is helpful.
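
In case it helps to reproduce this, the loop below is a minimal sketch
of the kind of master/slave round trip we are timing (hypothetical
code, not our application; message size and iteration count are made
up):

  /* pingpong.c - minimal round-trip timing sketch */
  #include <mpi.h>
  #include <stdio.h>
  #include <string.h>

  int main(int argc, char **argv)
  {
      char buf[1024];
      int rank, i, iters = 1000;
      double t0, t1;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      memset(buf, 0, sizeof(buf));

      MPI_Barrier(MPI_COMM_WORLD);
      t0 = MPI_Wtime();
      for (i = 0; i < iters; i++) {
          if (rank == 0) {
              /* master: send to the slave, wait for the echo */
              MPI_Send(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
              MPI_Recv(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
          } else if (rank == 1) {
              /* slave: receive and echo back */
              MPI_Recv(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
              MPI_Send(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD);
          }
      }
      t1 = MPI_Wtime();

      if (rank == 0)
          printf("avg round trip: %g us\n", (t1 - t0) / iters * 1e6);

      MPI_Finalize();
      return 0;
  }

Running this with "mpirun -np 2 ./pingpong" under 2.6.23 and then under
2.6.24 should show whether the slowdown is visible even for a bare
send/receive loop.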

Oli

> Just an idea...
>
> Regards,
> Rainer
>
> On Tuesday 06 April 2010 10:07:35 am Oliver Geisler wrote:
>> Hello Devel-List,
>>
>> I am a bit at a loss with this issue. I already posted to the users
>> list; in case you don't read it, I am posting here as well.
>>
>> This is the original posting:
>>
>> http://www.open-mpi.org/community/lists/users/2010/03/12474.php
>>
>> Short:
>> Switching from kernel 2.6.23 to 2.6.24 (and up), using openmpi 1.2.7-rc2
>> (outdated, I know, but it is what Debian stable ships, and I see the same
>> results with 1.4.1), increases the communication times between processes
>> (essentially between one master and several slave processes). This happens
>> regardless of whether the processes run locally only or communicate over
>> Ethernet.
>>
>> Has anybody witnessed such behavior?
>>
>> Any ideas about what I should test for?
>>
>> What additional information should I provide for you?
>>
>> Thanks for your time
>>
>> oli
>>
>
