Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-05-04 16:45:38


I would certainly try it with -mca btl ^sm and see if that solves the problem.
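
As a sketch of what that comparison might look like, assuming the same a.out and the 16 processes from the original post: the helper below only prints the command lines rather than launching anything, and build_cmd is a hypothetical name used for illustration, not part of Open MPI. The ^sm value excludes the shared-memory BTL, so on a standalone box the processes would typically fall back to the remaining BTLs (tcp and self).

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints an mpirun command line, does not run it.
# Arguments: process count, program, then any extra mpirun options.
build_cmd() {
    np=$1
    prog=$2
    shift 2
    if [ "$#" -gt 0 ]; then
        echo "mpirun -np $np $* $prog"
    else
        echo "mpirun -np $np $prog"
    fi
}

# Baseline from the original post (the run that hung).
build_cmd 16 ./a.out

# Same run with the shared-memory BTL excluded, as suggested above.
# The quotes are harmless in sh/bash and protect ^ in shells that treat it specially.
build_cmd 16 ./a.out -mca btl '^sm'
```

If the second form runs cleanly where the first hangs, that would point at the sm BTL rather than at the application.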

On May 4, 2010, at 2:38 PM, Eugene Loh wrote:

> Gus Correa wrote:
>
>> Dear Open MPI experts
>>
>> I need your help to get Open MPI right on a standalone
>> machine with Nehalem processors.
>>
>> How to tweak the mca parameters to avoid problems
>> with Nehalem (and perhaps AMD processors also),
>> where MPI programs hang, was discussed here before.
>>
>> However, I lost track of the details: how to work around the problem,
>> and whether it has perhaps been fully fixed already.
>
> Yes, perhaps the problem you're seeing is not what you remember being discussed.
>
> Perhaps you're thinking of https://svn.open-mpi.org/trac/ompi/ticket/2043 . It's presumably fixed.
>
>> I am now facing the problem directly on a single Nehalem box.
>>
>> I installed OpenMPI 1.4.1 from source,
>> and compiled the test hello_c.c with mpicc.
>> Then I tried to run it with:
>>
>> 1) mpirun -np 4 a.out
>> It ran OK (but seemed to be slow).
>>
>> 2) mpirun -np 16 a.out
>> It hung, and brought the machine to a halt.
>>
>> Any words of wisdom are appreciated.
>>
>> More info:
>>
>> * OpenMPI 1.4.1 installed from source (tarball from your site).
>> * Compilers are gcc/g++/gfortran 4.4.3-4.
>> * OS is Fedora Core 12.
>> * The machine is a Dell box with Intel Xeon 5540 (quad core)
>> processors on a two-way motherboard and 48GB of RAM.
>> * /proc/cpuinfo indicates that hyperthreading is turned on.
>> (I can see 16 "processors".)
>>
>> **
>>
>> What should I do?
>>
>> Use -mca btl ^sm ?
>> Use -mca btl_sm_num_fifos some_number ? (Which number?)
>> Use Both?
>> Do something else?
>
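
On the processor count mentioned above: the 16 "processors" in /proc/cpuinfo are hardware threads, while two quad-core Xeon 5540s provide 8 physical cores. A rough sketch for checking both numbers on the box follows. It assumes the x86 Linux field names ("processor", "physical id", "core id"), and the function names logical_cpus and physical_cores are made up for illustration.

```shell
#!/bin/sh
# Count logical processors and physical cores from a cpuinfo-style file.
# Defaults to /proc/cpuinfo (x86 Linux field names assumed).

logical_cpus() {
    # One "processor" line per logical CPU (hardware thread).
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

physical_cores() {
    # Hyperthread siblings share the same "physical id"/"core id" pair,
    # so the number of unique pairs is the number of physical cores.
    awk -F': *' '
        /^physical id/ { pkg = $2 }
        /^core id/     { print pkg "/" $2 }
    ' "${1:-/proc/cpuinfo}" | sort -u | wc -l | tr -d ' '
}

# Report for the local machine when /proc/cpuinfo is available.
if [ -r /proc/cpuinfo ]; then
    echo "logical processors: $(logical_cpus)"
    echo "physical cores:     $(physical_cores)"
fi
```

Comparing a run at the physical core count with the -np 16 run might help separate a hang caused by the sm BTL from slowness caused by placing two processes on each core.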