
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Hyper-thread architecture effect on MPI jobs
From: Saygin Arkan (saygenius_at_[hidden])
Date: 2010-08-12 08:23:17


Hi again,

I think the problem is solved. Thanks to Gus, I tried
mpirun -mca mpi_paffinity_alone 1
when running the program, and from a quick search it seems this binds
each process to a specific core
(correct me if I'm wrong).
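For context, a sketch of what such an invocation might look like for this thread's 18-rank run (the hostfile name "hosts" and program name "correlate" are placeholders, not taken from the thread):

```shell
# mpi_paffinity_alone=1 asks Open MPI (1.x-era MCA parameter) to bind
# each MPI rank to its own processor instead of letting the OS scheduler
# migrate ranks between cores (or hardware threads).
CMD="mpirun -mca mpi_paffinity_alone 1 -np 18 -hostfile hosts ./correlate"
echo "$CMD"
```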
I've run over 20 tests, and it now works fine.
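A quick way to confirm which nodes have hyper-threading enabled (the situation described for os228/os229 below) is to compare the "siblings" and "cpu cores" fields in /proc/cpuinfo on Linux/x86; this snippet is a sketch, assuming those fields are present:

```shell
# On x86 Linux, 'siblings' is the number of logical CPUs per package and
# 'cpu cores' the number of physical cores; siblings > cores means
# SMT/hyper-threading is on (e.g. siblings: 8, cpu cores: 4 on the i7s).
siblings=$(awk -F: '/^siblings/ { gsub(/ /, "", $2); print $2; exit }' /proc/cpuinfo)
cores=$(awk -F: '/^cpu cores/ { gsub(/ /, "", $2); print $2; exit }' /proc/cpuinfo)
if [ "${siblings:-0}" -gt "${cores:-0}" ]; then
    echo "hyper-threading enabled: $cores cores, $siblings logical CPUs"
else
    echo "no hyper-threading detected"
fi
```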

Thanks a lot,

Saygin

On Thu, Aug 12, 2010 at 11:39 AM, Saygin Arkan <saygenius_at_[hidden]> wrote:

> Hi Gus,
>
> 1 - First of all, turning off hyper-threading is not an option. And the
> machines give pretty good results if I can find a way to arrange the cores.
>
> 2 - Actually Eugene (in one of her messages in this thread) had suggested
> arranging the slots. I did, and posted the results; the cores are still
> assigned randomly, nothing changed.
> But I haven't checked the -loadbalance option; -byslot or -bynode is not
> going to help.
>
> 3 - Could you give me a bit more detail on how affinity works, or what it
> actually does?
>
> Thanks a lot for your suggestions
>
> Saygin
>
>
> On Wed, Aug 11, 2010 at 6:18 PM, Gus Correa <gus_at_[hidden]> wrote:
>
>> Hi Saygin
>>
>> You could:
>>
>> 1) turn off hyperthreading (on BIOS), or
>>
>> 2) use the mpirun options (you didn't send your mpirun command)
>> to distribute the processes across the nodes, cores, etc.
>> "man mpirun" is a good resource, see the explanations about
>> the -byslot, -bynode, -loadbalance options.
>>
>> 3) In addition, you can use the mca parameters to set processor affinity
>> in the mpirun command line "mpirun -mca mpi_paffinity_alone 1 ..."
>> I don't know how this will play in a hyperthreaded machine,
>> but it works fine in our dual processor quad-core computers
>> (not hyperthreaded).
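>> For example, a hostfile along these lines (the slot counts here are just
>> an illustration, capping each node at its physical core count) would
>> keep Open MPI from scheduling two ranks onto the two hardware threads
>> of one core:
>>
>>   os221 slots=3
>>   os222 slots=3
>>   os223 slots=3
>>   os224 slots=3
>>   os228 slots=4
>>   os229 slots=4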
>>
>> Depending on your code, hyperthreading may not help performance anyway.
>>
>> I hope this helps,
>> Gus Correa
>>
>> Saygin Arkan wrote:
>>
>>> Hello,
>>>
>>> I'm running mpi jobs in non-homogeneous cluster. 4 of my machines have
>>> the following properties, os221, os222, os223, os224:
>>>
>>> vendor_id : GenuineIntel
>>> cpu family : 6
>>> model : 23
>>> model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
>>> stepping : 7
>>> cache size : 3072 KB
>>> physical id : 0
>>> siblings : 4
>>> core id : 3
>>> cpu cores : 4
>>> fpu : yes
>>> fpu_exception : yes
>>> cpuid level : 10
>>> wp : yes
>>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
>>> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
>>> nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx smx
>>> est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
>>> bogomips : 4999.40
>>> clflush size : 64
>>> cache_alignment : 64
>>> address sizes : 36 bits physical, 48 bits virtual
>>>
>>> and the problematic, hyper-threaded 2 machines are as follows, os228 and
>>> os229:
>>>
>>> vendor_id : GenuineIntel
>>> cpu family : 6
>>> model : 26
>>> model name : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
>>> stepping : 5
>>> cache size : 8192 KB
>>> physical id : 0
>>> siblings : 8
>>> core id : 3
>>> cpu cores : 4
>>> fpu : yes
>>> fpu_exception : yes
>>> cpuid level : 11
>>> wp : yes
>>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
>>> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
>>> nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl
>>> vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm ida
>>> bogomips : 5396.88
>>> clflush size : 64
>>> cache_alignment : 64
>>> address sizes : 36 bits physical, 48 bits virtual
>>>
>>>
>>> The problem is: those 2 machines appear to have 8 cores (virtual; the
>>> actual core count is 4).
>>> When I submit an MPI job, I measure the comparison times across the
>>> cluster, and I got strange results.
>>>
>>> I'm running the job on 6 nodes, 3 cores per node. Sometimes (in roughly
>>> 1/3 of the tests) os228 or os229 returns strange results: 2 cores are
>>> slow (slower than the first 4 nodes) but the 3rd core is extremely fast.
>>>
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - RANK(0) Printing
>>> Times...
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os221 RANK(1)
>>> :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os222 RANK(2)
>>> :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os224 RANK(3)
>>> :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os228 RANK(4)
>>> :37 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os229 RANK(5)
>>> :34 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os223 RANK(6)
>>> :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os221 RANK(7)
>>> :39 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os222 RANK(8)
>>> :37 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os224 RANK(9)
>>> :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os228 RANK(10)
>>> :*48 sec*
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os229 RANK(11)
>>> :35 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os223 RANK(12)
>>> :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os221 RANK(13)
>>> :37 sec
>>> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os222 RANK(14)
>>> :37 sec
>>> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os224 RANK(15)
>>> :38 sec
>>> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os228 RANK(16)
>>> :*43 sec*
>>> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os229 RANK(17)
>>> :35 sec
>>> TOTAL CORRELATION TIME: 48 sec
>>>
>>>
>>> or another test:
>>>
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - RANK(0) Printing
>>> Times...
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os221 RANK(1)
>>> :170 sec
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os222 RANK(2)
>>> :161 sec
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os224 RANK(3)
>>> :158 sec
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os228 RANK(4)
>>> :142 sec
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os229 RANK(5)
>>> :*256 sec*
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os223 RANK(6)
>>> :156 sec
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os221 RANK(7)
>>> :162 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os222 RANK(8)
>>> :159 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os224 RANK(9)
>>> :168 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os228 RANK(10)
>>> :141 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os229 RANK(11)
>>> :136 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os223 RANK(12)
>>> :173 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os221 RANK(13)
>>> :164 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os222 RANK(14)
>>> :171 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os224 RANK(15)
>>> :156 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os228 RANK(16)
>>> :136 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os229 RANK(17)
>>> :*250 sec*
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - TOTAL CORRELATION
>>> TIME: 256 sec
>>>
>>>
>>> Do you have any idea why this is happening?
>>> I assume it hands 2 jobs to 2 logical cores on os229, but those 2 are
>>> actually the same physical core.
>>> If so, how can I fix it? The longest rank determines the overall time:
>>> a 100 sec delay is too much for a 250 sec comparison, which might
>>> otherwise have finished in around 160 sec.
>>>
>>>
>>>
>>> --
>>> Saygin
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>>
>
>
>
> --
> Saygin
>
>
>

-- 
Saygin