Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mpirun only works when -np <4 (Gus Correa)
From: Matthew MacManes (macmanes_at_[hidden])
Date: 2009-12-08 19:19:42

Hi Gus,

Thanks for your ideas. I have a few questions, and I will try to answer yours in the hope of solving this!

Should I worry about setting things like --num-cores and --bind-to-cores? This, I think, gets at your question about processor affinity. Am I right? I could not quite figure out the -mca mpi_paffinity_alone stuff...
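
For reference, `-mca <name> <value>` is how mpirun/mpiexec sets an Open MPI run-time (MCA) parameter, and the same parameter can also be supplied through an environment variable named `OMPI_MCA_<name>`. The sketch below is only an illustration: it assumes a Linux machine with /proc mounted, the helper name `allowed_cpus`, the process count and `./my_mpi_program` are placeholders that do not come from this thread, and the mpirun lines are left as comments so the script runs without MPI.

```shell
#!/bin/sh
# Report which cores a process is allowed to run on (Linux only).
# Without any binding this is normally the full range of cores.
allowed_cpus() {
    # $1 is a PID; default to the current shell (assumed helper, not an Open MPI tool).
    pid=${1:-$$}
    awk '/^Cpus_allowed_list:/ {print $2}' "/proc/$pid/status"
}

echo "cores this shell may use: $(allowed_cpus)"

# Two ways to request processor affinity (placeholders, not executed here):
#   mpirun -mca mpi_paffinity_alone 1 -np 8 ./my_mpi_program
#   OMPI_MCA_mpi_paffinity_alone=1 mpirun -np 8 ./my_mpi_program
```

Calling `allowed_cpus <pid>` on a running MPI process (top shows the PIDs) is one way to check whether a binding took effect: a bound process should list a narrower set of cores than the shell does.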

1. Additional load: nope. Nothing else; most of the time not even Firefox.
2. RAM: no problems apparent when monitoring through top. Interestingly, I did wonder about oversubscription, so I tried the option --nooversubscription, but this gave me an error message.
3. I have not tried other MPI flavors. I've been speaking to the authors of the programs, and they are both using Open MPI.
4. I don't think that this is a problem, as I'm specifying --with-mpi=/usr/bin/... when I compile the programs. Is there any other way to be sure that this is not a problem?
5. I had not been setting it, and you could see some shuffling when monitoring the load on specific processors. I have tried to use --bind-to-cores to deal with this. I don't understand how to use the -mca options you asked about.
6. I am using Ubuntu 9.10, with gcc 4.4.1 and g++ 4.4.1.
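
On point 4, one rough way to check that the compiler wrapper and the launcher belong to the same MPI installation is to resolve where each command on the PATH really points. This is a sketch under assumptions rather than a definitive test: it assumes a GNU/Linux system where `readlink -f` is available, and the helper name `resolve_tool` is made up for the example.

```shell
#!/bin/sh
# Print the fully resolved path of a command, or "missing" if it is not on PATH.
# resolve_tool is an illustrative helper, not part of Open MPI.
resolve_tool() {
    path=$(command -v "$1" 2>/dev/null) || { echo "missing"; return 0; }
    readlink -f "$path"
}

# Show where each MPI wrapper and launcher ends up after following symlinks.
for tool in mpicc mpicxx mpirun mpiexec; do
    echo "$tool -> $(resolve_tool "$tool")"
done
```

If the resolved paths point into different installations, or the resolved file names indicate different MPI flavors, the build and the launch are probably not using the same MPI. With Open MPI, `mpicc --showme`, `mpirun --version` and `ompi_info` report further details about the installation in use.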

MrBayes is a program for Bayesian phylogenetics:
ABySS is a program for assembly of DNA sequence data:

> Do the programs mix MPI (message passing) with OpenMP (threads)?
I'm honestly not sure what this means.
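
For context, MPI runs several separate processes that exchange messages, while OpenMP runs several threads inside a single process; a program that does both may start more threads than there are cores when launched with a large -np. One rough way to see what a binary uses is to list the shared libraries it links against. The sketch below assumes GNU/Linux with `ldd`, GCC's OpenMP runtime (`libgomp`) and an MPI library whose file name begins with `libmpi`; the function name `parallel_libs` and the script name are illustrative, and a statically linked binary will show nothing either way.

```shell
#!/bin/sh
# List MPI- and OpenMP-related shared libraries that a binary links against.
# parallel_libs is an illustrative helper name.
parallel_libs() {
    ldd "$1" 2>/dev/null | grep -E -o 'lib(mpi|gomp)[^ ]*' | sort -u
}

# Usage: sh check_libs.sh /path/to/program
# The default of /bin/sh is only there so the script runs without arguments.
target=${1:-/bin/sh}
libs=$(parallel_libs "$target")
if [ -n "$libs" ]; then
    echo "$target links against:"
    echo "$libs"
else
    echo "$target: no libmpi or libgomp found among its shared libraries"
fi
```

Seeing both a `libmpi` and a `libgomp` entry for the same program would suggest that it mixes the two models; seeing only `libmpi` suggests plain MPI.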

Thanks for all your help!


> Hi Matthew
> More guesses/questions than anything else:
> 1) Is there any additional load on this machine?
> We had problems like that (on different machines) when
> users start listening to streaming video, doing Matlab calculations,
> etc, while the MPI programs are running.
> This tends to oversubscribe the cores, and may lead to crashes.
> 2) RAM:
> Can you monitor the RAM usage through "top"?
> (I presume you are on Linux.)
> It may show unexpected memory leaks, if they exist.
> On "top", type "1" (one) to see all cores, type "f" then "j"
> to see the core number associated to each process.
> 3) Do the programs work right with other MPI flavors (e.g. MPICH2)?
> If not, then it is not OpenMPI's fault.
> 4) Any possibility that the MPI versions/flavors of mpicc and
> mpirun that you are using to compile and launch the program are not the
> same?
> 5) Are you setting processor affinity on mpiexec?
> mpiexec -mca mpi_paffinity_alone 1 -np ... bla, bla ...
> Context switching across the cores may also cause trouble, I suppose.
> 6) Which Linux are you using (uname -a)?
> On other mailing lists I read reports that only quite recent kernels
> support all the Intel Nehalem processor features well.
> I don't have Nehalem, I can't help here,
> but the information may be useful
> for other list subscribers to help you.
> ***
> As for the programs, some programs require specific setup,
> (and even specific compilation) when the number of MPI processes
> varies.
> It may help if you tell us a link to the program sites.
> Bayesian statistics is not totally out of our business,
> but phylogenetic trees are not really my league,
> hence forgive me any bad guesses, please,
> but would it need specific compilation or a different
> set of input parameters to run correctly on a different
> number of processors?
> Do the programs mix MPI (message passing) with OpenMP (threads)?
> I found this MrBayes, which seems to do the above:
> As for the ABySS, what is it, where can it be found?
> Doesn't look like a deep ocean circulation model, as the name suggests.
> My $0.02
> Gus Correa