Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?
From: Gus Correa (gus_at_[hidden])
Date: 2010-05-07 10:14:34


Hi Jeff, John, Tim

I had asked the same question that John and Tim did,
but it got lost 100 emails ago on this thread.
Here I've been disabling/enabling HT in the BIOS,
as per Douglas Guptill's suggestion.

Jeff: Thank you very much for the wizardry details.
Very helpful for the list subscriber community, I'd guess.
Another reason to install hwloc.

Cheers,
Gus Correa

Jeff Squyres wrote:
> On May 7, 2010, at 1:30 AM, John Hearns wrote:
>
>>> Indeed. I have seen some people leave HT enabled in the BIOS just so that they have the software option of turning the extra hardware threads off via Linux -- then you can run with HT and without it and see what it does to your specific codes.
>> I may have missed this on the thread, but how do you do that?
>> The Nehalem systems I have came delivered with HT enabled in the BIOS
>> - I know it is not a real pain to reboot and configure, but it would
>> be a lot easier to leave it on and switch off in software - also if you
>> wanted to do back-to-back testing of performance with/without HT.
>
> What we have done is disable one of the 2 hardware threads as follows:
>
> - download and install hwloc (it's very small/simple to install).
> 1.0rc5 is the current release candidate; the final 1.0 is *very* near
> release, and rc5 is very stable.
> - run lstopo and look at the physical numbering of the
> hardware threads in each core.
> - here's an example output from v1.0rc5 lstopo
> (this is not from a Nehalem machine, but the same things apply):
>
> -----
> # lstopo
> Machine (3945MB)
>   Socket #0
>     L2 #0 (2048KB) + L1 #0 (16KB) + Core #0
>       PU #0 (phys=0)
>       PU #1 (phys=4)
>     L2 #1 (2048KB) + L1 #1 (16KB) + Core #1
>       PU #2 (phys=2)
>       PU #3 (phys=6)
>   Socket #1
>     L2 #2 (2048KB) + L1 #2 (16KB) + Core #2
>       PU #4 (phys=1)
>       PU #5 (phys=5)
>     L2 #3 (2048KB) + L1 #3 (16KB) + Core #3
>       PU #6 (phys=3)
>       PU #7 (phys=7)
> #
> -----
>
> - you want to disable the 2nd PU (processing unit) -- i.e., hardware thread -- on each core.
> - Do this by echoing 0 to /sys/devices/system/cpu/cpuX/online, where X is each *phys* value.
> - For example:
>
> -----
> # echo 0 > /sys/devices/system/cpu/cpu4/online
> # echo 0 > /sys/devices/system/cpu/cpu5/online
> # echo 0 > /sys/devices/system/cpu/cpu6/online
> # echo 0 > /sys/devices/system/cpu/cpu7/online
> # lstopo
> Machine (3945MB)
>   Socket #0
>     L2 #0 (2048KB) + L1 #0 (16KB) + Core #0 + PU #0 (phys=0)
>     L2 #1 (2048KB) + L1 #1 (16KB) + Core #1 + PU #1 (phys=2)
>   Socket #1
>     L2 #2 (2048KB) + L1 #2 (16KB) + Core #2 + PU #2 (phys=1)
>     L2 #3 (2048KB) + L1 #3 (16KB) + Core #3 + PU #3 (phys=3)
> #
> -----
>
> Granted, this doesn't actually disable hyperthreading. But it does stop Linux from scheduling anything on the 2nd hardware thread of each core, which is pretty much the same thing for the purposes of this conversation.
>
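
For anyone who would rather not read the phys values off lstopo by hand, the steps Jeff describes could probably be scripted along these lines. This is only a sketch, not something from the thread: it assumes the kernel exposes topology/thread_siblings_list under each cpuX directory, and it assumes that keeping the first-listed sibling of each core is what you want. The function names and the optional directory argument (there so the logic can be tried against a fake tree) are made up for illustration.

```shell
#!/bin/sh
# Sketch only: take the "second" hardware thread of each core offline.
# Assumption: each cpuX/topology/thread_siblings_list holds a CPU list
# such as "0,4" or "0-1", with the lowest-numbered sibling first.

# Print the CPU numbers that are NOT the first sibling on their core.
# $1 is a hypothetical override of the sysfs CPU directory (for testing).
list_secondary_threads() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$f" ] || continue
        # Extract X from .../cpuX/topology/thread_siblings_list
        cpu=${f#"$root"/cpu}
        cpu=${cpu%%/*}
        # First entry of the sibling list, e.g. "0" from "0,4" or "0-1"
        first=$(sed 's/[,-].*//' "$f")
        if [ "$cpu" != "$first" ]; then
            echo "$cpu"
        fi
    done | sort -n
}

# Echo 0 into cpuX/online for every secondary thread found above.
offline_secondary_threads() {
    root=${1:-/sys/devices/system/cpu}
    for cpu in $(list_secondary_threads "$root"); do
        echo 0 > "$root/cpu$cpu/online"
    done
}
```

On the example machine above, list_secondary_threads should report 4, 5, 6 and 7, i.e. the same CPUs that were taken offline by hand. Run as root, calling offline_secondary_threads with no argument would then act on the real /sys tree; echoing 1 into the same online files should bring the threads back without a reboot.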