Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-05-04 22:06:32


I'd actually be a little surprised if HT was the problem. I run with HT enabled on my Nehalem boxen all the time. It's pretty surprising that Open MPI is causing a hard lockup of your system; user-level processes shouldn't be able to do that.

Notes:

1. With HT enabled, as you noted, Linux will just see 2x as many cores as you actually have. Depending on your desired workload, this may or may not help you. But that shouldn't affect the correctness of running your MPI application.

2. To confirm: yes, TCP will be quite a bit slower than sm, the shared-memory BTL (but again, that depends on how much MPI traffic you're sending).

3. Yes, you can disable the 2nd thread on each core via Linux, but you need root-level access to do it.
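
To make notes 1 and 3 concrete, here is a minimal sketch (not anything Open MPI ships) that counts logical processors versus physical cores from /proc/cpuinfo-style text. The function name summarize_cpuinfo is made up for illustration, and it assumes the x86 Linux layout in which each processor entry carries "physical id" and "core id" fields.

```python
def summarize_cpuinfo(text):
    """Return (logical_processors, physical_cores) for /proc/cpuinfo-style text.

    Assumption: entries are separated by blank lines and carry
    'physical id' and 'core id' fields, as on typical x86 Linux.
    If those fields are missing, every entry collapses into one core.
    """
    logical = 0
    cores = set()
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue  # not a processor entry
        logical += 1
        # One (socket, core) pair identifies one physical core.
        cores.add((fields.get("physical id"), fields.get("core id")))
    return logical, len(cores)
```

On a Linux machine you would feed it the contents of /proc/cpuinfo; if the first number is twice the second, HT is on. As far as I know, the root-level way to take a sibling thread offline is the kernel's CPU hotplug interface (writing 0 to /sys/devices/system/cpu/cpuN/online), but check your distribution's documentation before relying on that.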

Some questions:

- is the /tmp directory on your local disk?
- are there any revealing messages in /var/log/messages (or equivalent) about failures when the machine hangs?
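
For the /tmp question, a rough way to check is to look up which filesystem backs the path. The sketch below parses /proc/mounts-style text; fs_type_for is a hypothetical helper, and it ignores the octal escapes that /proc/mounts uses for unusual characters in mount points.

```python
def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is expected in /proc/mounts format:
    device, mount point, fstype, options, ...
    The longest matching mount point wins; on ties the later line wins.
    """
    target = path.rstrip("/") + "/"
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip malformed lines
        mount_point, fstype = parts[1], parts[2]
        prefix = mount_point.rstrip("/") + "/"
        if target.startswith(prefix) and len(mount_point) >= len(best_point):
            best_point, best_type = mount_point, fstype
    return best_type
```

Calling it with "/tmp" and the contents of /proc/mounts should give something like ext3 for a local disk, while nfs or another network filesystem type would suggest /tmp is not local.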

On May 4, 2010, at 8:35 PM, Gus Correa wrote:

> Hi Douglas
>
> Yes, very helpful indeed!
>
> The machine here is a two-way quad-core, and /proc/cpuinfo shows 16
> processors, twice as many as the physical cores,
> just like you see on yours.
> So, HT is turned on for sure.
>
> The security guard opened the office door for me,
> and I could reboot that machine.
> It's called Spinoza. Maybe that's why it is locked.
> Now the door is locked again, so I will have to wait until tomorrow
> to play around with the BIOS settings.
>
> I will remember the BIOS double negative that you pointed out:
> "When Disabled only one thread per core is enabled"
> Ain't that English funny?
> So far, I can't get no satisfaction.
> Hence, let's see if Ralph's suggestion works.
> Never get no hyperthreading turned on,
> and you ain't have no problems with Open MPI. :)
>
> Many thanks!
> Have a great Halifax Spring time!
>
> Cheers,
> Gus
>
> Douglas Guptill wrote:
> > On Tue, May 04, 2010 at 05:34:40PM -0600, Ralph Castain wrote:
> >> On May 4, 2010, at 4:51 PM, Gus Correa wrote:
> >>
> >>> Hi Ralph
> >>>
> >>> Ralph Castain wrote:
> >>>> One possibility is that the sm btl might not like that you have hyperthreading enabled.
> >>> I remember that hyperthreading was discussed months ago,
> >>> in the previous incarnation of this problem/thread/discussion on "Nehalem vs. Open MPI".
> >>> (It sounds like one of those supreme court cases ... )
> >>>
> >>> I don't really administer that machine,
> >>> or any machine with hyperthreading,
> >>> so I am not very familiar with the HT nitty-gritty.
> >>> How do I turn off hyperthreading?
> >>> Is it a BIOS or a Linux thing?
> >>> I may try that.
> >> I believe it can be turned off via an admin-level cmd, but I'm not certain about it
> >
> > The challenge was too great to resist, so I yielded, and rebooted my
> > Nehalem (Core i7 920 @ 2.67 GHz) to confirm my thoughts on the issue.
> >
> > Entering the BIOS setup by pressing "DEL", and "right-arrowing" over
> > to "Advanced", then "down arrow" to "CPU configuration", I found a
> > setting called "Intel (R) HT Technology". The help dialogue says
> > "When Disabled only one thread per core is enabled".
> >
> > Mine is "Enabled", and I see 8 cpus. The Core i7, to my
> > understanding, is a 4 core chip.
> >
> > Hope that helps,
> > Douglas.

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/