Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?
From: Eugene Loh (eugene.loh_at_[hidden])
Date: 2010-05-04 16:38:54


Gus Correa wrote:

> Dear Open MPI experts
>
> I need your help to get Open MPI right on a standalone
> machine with Nehalem processors.
>
> How to tweak the mca parameters to avoid problems
> with Nehalem (and perhaps AMD processors also),
> where MPI programs hang, was discussed here before.
>
> However, I lost track of the details, how to work around the problem,
> and if it was fully fixed already perhaps.

Yes, perhaps the problem you're seeing is not what you remember being
discussed.

Perhaps you're thinking of
https://svn.open-mpi.org/trac/ompi/ticket/2043 . It's presumably fixed.

> I am now facing the problem directly on a single Nehalem box.
>
> I installed OpenMPI 1.4.1 from source,
> and compiled the test hello_c.c with mpicc.
> Then I tried to run it with:
>
> 1) mpirun -np 4 a.out
> It ran OK (but seemed to be slow).
>
> 2) mpirun -np 16 a.out
> It hung, and brought the machine to a halt.
>
> Any words of wisdom are appreciated.
>
> More info:
>
> * OpenMPI 1.4.1 installed from source (tarball from your site).
> * Compilers are gcc/g++/gfortran 4.4.3-4.
> * OS is Fedora Core 12.
> * The machine is a Dell box with Intel Xeon 5540 (quad core)
> processors on a two-way motherboard and 48GB of RAM.
> * /proc/cpuinfo indicates that hyperthreading is turned on.
> (I can see 16 "processors".)
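
For reference, the 16 "processors" are consistent with 2 sockets x 4 cores x 2 hyperthreads, so there are likely only 8 physical cores. Here is a rough sketch for checking that from /proc/cpuinfo-style text. It assumes the "physical id" and "core id" fields are present, which is usual on x86 Linux but not guaranteed everywhere. The helper names and the synthetic sample are made up for illustration and are not output from the machine above.

```python
def count_cpus(cpuinfo_text):
    """Return (logical_processors, physical_cores) from /proc/cpuinfo-style text."""
    logical = 0
    cores = set()
    # Each logical processor is described by one blank-line-separated block.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        # Hyperthreads on the same core share (physical id, core id).
        if "physical id" in fields and "core id" in fields:
            cores.add((fields["physical id"], fields["core id"]))
    return logical, len(cores)


def sample_cpuinfo(sockets=2, cores_per_socket=4, threads_per_core=2):
    """Build synthetic cpuinfo text (hypothetical layout, for illustration only)."""
    blocks = []
    n = 0
    for s in range(sockets):
        for c in range(cores_per_socket):
            for _ in range(threads_per_core):
                blocks.append(
                    f"processor\t: {n}\nphysical id\t: {s}\ncore id\t\t: {c}"
                )
                n += 1
    return "\n\n".join(blocks) + "\n"


print(count_cpus(sample_cpuinfo()))  # prints (16, 8)
```

On a real Linux box you could pass open("/proc/cpuinfo").read() instead of the synthetic sample. If the second number comes out as 0, the core fields are simply missing and the count tells you nothing.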
>
> **
>
> What should I do?
>
> Use -mca btl ^sm ?
> Use -mca btl -mca btl_sm_num_fifos=some_number ? (Which number?)
> Use Both?
> Do something else?
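
On the syntax of those options: mpirun takes each MCA setting as "--mca <name> <value>", with the name and value as separate arguments. Below is a small dry-run sketch that only builds and prints the two candidate command lines without launching anything. The helper name mpirun_command is made up for illustration. The FIFO count of nprocs - 1 is an assumption used as a placeholder here, not a value confirmed anywhere in this message.

```python
import shlex


def mpirun_command(nprocs, program, mca=None):
    """Build an mpirun argument list; each MCA setting becomes '--mca <name> <value>'."""
    args = ["mpirun"]
    for name, value in (mca or {}).items():
        args += ["--mca", name, str(value)]
    args += ["-np", str(nprocs), program]
    return args


nprocs = 16
candidates = [
    # Exclude the shared-memory BTL entirely.
    mpirun_command(nprocs, "./a.out", {"btl": "^sm"}),
    # Keep the sm BTL but raise the number of FIFOs (value is an assumption).
    mpirun_command(nprocs, "./a.out", {"btl_sm_num_fifos": nprocs - 1}),
]

for args in candidates:
    # shlex.join (Python 3.8+) quotes arguments such as ^sm for the shell.
    print(shlex.join(args))
```

Excluding sm on a single machine would presumably push on-node traffic through another transport such as TCP loopback, so it is more of a diagnostic than something to leave on if performance matters.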