Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mpirun only works when -np <4
From: Gus Correa (gus_at_[hidden])
Date: 2009-12-08 14:54:49

Hi Matthew

More guesses/questions than anything else:

1) Is there any additional load on this machine?
We have had problems like that (on different machines) when
users started listening to streaming video, running Matlab
calculations, etc., while the MPI programs were running.
This tends to oversubscribe the cores and may lead to crashes.

2) RAM:
Can you monitor the RAM usage through "top"?
(I presume you are on Linux.)
It may show unexpected memory leaks, if they exist.

In "top", type "1" (one) to see all cores, and type "f" then "j"
to see the core number associated with each process.
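If you'd rather not watch the interactive display, here is a
non-interactive sketch (it assumes the procps "top" and "free"
found on most Linux distributions; "mpirun" is just an example
process name):

```shell
# One-shot snapshots instead of the interactive "top" display
free -m                                # total vs. used RAM and swap, in MB
top -b -n 1 | head -n 15               # single batch-mode snapshot of busiest processes
ps -C mpirun -o pid,rss,pcpu || true   # RSS (KB) and %CPU of a running mpirun, if any
```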

3) Do the programs work right with other MPI flavors (e.g. MPICH2)?
If they fail there as well, then the problem is not specific to Open MPI.

4) Any possibility that the MPI versions/flavors of mpicc and
mpirun that you are using to compile and launch the program are not the
same?
5) Are you setting processor affinity on mpiexec?

mpiexec -mca mpi_paffinity_alone 1 -np ... bla, bla ...

Context switching across the cores may also cause trouble, I suppose.

6) Which Linux are you using (uname -a)?

On other mailing lists I read reports that only quite recent kernels
support all the Intel Nehalem processor features well.
I don't have a Nehalem, so I can't help here,
but the information may be useful
for other list subscribers trying to help you.
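For reference, the commands (the release string shown is only what
I would expect Ubuntu 9.10 to ship; yours may differ):

```shell
uname -a     # full kernel identification string
uname -r     # kernel release only, e.g. 2.6.31-xx-generic on Ubuntu 9.10
```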


As for the programs, some programs require specific setup
(and even specific compilation) when the number of MPI processes
changes.
It may help if you send us links to the program sites.

Bayesian statistics is not totally out of our business,
but phylogenetic trees are not really my league,
so please forgive any bad guesses:
would the program need specific compilation or a different
set of input parameters to run correctly on a different
number of processors?
Do the programs mix MPI (message passing) with OpenMP (threads)?
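If either program does mix the two, each MPI rank may spawn several
OpenMP threads, so on 8 physical cores even -np 4 can oversubscribe.
A sketch to rule that out (the executable name and input file are
placeholders, not the programs' real command lines):

```shell
# Force one OpenMP thread per MPI rank to rule out thread oversubscription
export OMP_NUM_THREADS=1
mpiexec -np 8 ./mb input.nex   # placeholder invocation
```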

I found this MrBayes, which seems to do the above:

As for the ABySS, what is it, where can it be found?
It doesn't look like a deep ocean circulation model, as the name suggests.

My $0.02
Gus Correa
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA

Matthew MacManes wrote:
> Hi All,
> I am having a problem running a couple of programs, ABySS and MrBayes in parallel. I am using Linux Ubuntu 9.10 with a dual socket (Xeon 5520) machine. There are 8 physical cores, or 16 with hyperthreading enabled.
> I use Open MPI version 1.3.4, plus a few other packages downloaded via "apt-get install <program name>"
> 1st of all, let me say that when I specify that -np is less than 4 processors (1, 2, or 3), both programs seem to work as expected. Also, the non-MPI version of each of them works fine. Thus, I am pretty sure that this is a problem with MPI rather than with the program code or something else.
> What happens is simply that the program hangs. There are no error messages, and there is no clue from anything else (system working fine otherwise: no RAM issues, etc.). It does not hang at the same place every time; sometimes it hangs in the very beginning, sometimes near the middle.
> Could this be an issue with hyperthreading? A conflict with something? I can give you all more info if that would be helpful in troubleshooting. I'm not sure if there are any diagnostics for mpirun, so that would be helpful to know about if there were.
> Thanks. Matt
> _________________________________
> Matthew MacManes
> PhD Candidate
> University of California- Berkeley
> Museum of Vertebrate Zoology
> Phone: 510-495-5833
> Lab Website:
> Personal Website:
> _______________________________________________
> users mailing list
> users_at_[hidden]