Hi Matthew
More guesses/questions than anything else:
1) Is there any additional load on this machine?
We have had problems like that (on different machines) when
users started streaming video, running Matlab calculations,
etc., while the MPI programs were running.
This tends to oversubscribe the cores, and may lead to crashes.
2) RAM:
Can you monitor the RAM usage through "top"?
(I presume you are on Linux.)
It may show unexpected memory leaks, if they exist.
In "top", type "1" (one) to see all cores, and type "f" then "j"
to see the core number associated with each process.
3) Do the programs work right with other MPI flavors (e.g. MPICH2)?
If not, then it is not OpenMPI's fault.
4) Any possibility that the MPI versions/flavors of mpicc and
mpirun that you are using to compile and launch the program are not the
same?
5) Are you setting processor affinity on mpiexec?
mpiexec -mca mpi_paffinity_alone 1 -np ... bla, bla ...
Context switching across the cores may also cause trouble, I suppose.
6) Which Linux are you using (uname -a)?
On other mailing lists I read reports that only quite recent kernels
support all the Intel Nehalem processor features well.
I don't have Nehalem, so I can't help here,
but the information may be useful
for other list subscribers to help you.
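
In case it helps, here is a rough sketch of how items 1, 4, and 5 above could be checked from a script instead of by hand. It assumes Linux and Python 3. The function names and the "slack" threshold are made up for illustration, and comparing the directories of mpicc and mpirun is only a heuristic, not a guarantee that they come from the same MPI build.

```python
import os
import shutil


def oversubscribed(load_1min, n_cores, slack=1.0):
    """Return True when the 1-minute load average exceeds the core count.

    `slack` is a hypothetical tolerance factor, not a standard threshold.
    """
    return load_1min > n_cores * slack


def same_install(path_a, path_b):
    """Return True when both executables resolve to the same directory.

    This is only a rough hint that mpicc and mpirun belong to the same
    MPI installation. A missing executable (None) counts as a mismatch.
    """
    if path_a is None or path_b is None:
        return False
    dir_a = os.path.dirname(os.path.realpath(path_a))
    dir_b = os.path.dirname(os.path.realpath(path_b))
    return dir_a == dir_b


def report():
    # Item 1: is the machine already busy before the MPI job starts?
    n_cores = os.cpu_count() or 1
    load_1min = os.getloadavg()[0]  # available on Unix-like systems only
    print("cores:", n_cores, "1-min load average:", load_1min)
    if oversubscribed(load_1min, n_cores):
        print("warning: load average is above the number of cores")

    # Item 4: do mpicc and mpirun on the PATH sit side by side?
    mpicc = shutil.which("mpicc")
    mpirun = shutil.which("mpirun")
    print("mpicc:", mpicc)
    print("mpirun:", mpirun)
    if not same_install(mpicc, mpirun):
        print("warning: mpicc/mpirun missing or in different directories")

    # Item 5: which cores may this process run on? (Linux only)
    if hasattr(os, "sched_getaffinity"):
        print("allowed cores:", sorted(os.sched_getaffinity(0)))


if __name__ == "__main__":
    report()
```

Running it once on an idle machine and once while the MPI job is active should show whether something else is competing for the cores.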
***
As for the programs, some require specific setup
(and even specific compilation) when the number of MPI processes
varies.
It may help if you send us links to the program sites.
Bayesian statistics is not totally out of our business,
but phylogenetic trees are not really my league,
so please forgive any bad guesses:
would the programs need specific compilation or a different
set of input parameters to run correctly on a different
number of processors?
Do the programs mix MPI (message passing) with OpenMP (threads)?
I found this MrBayes, which seems to do the above:
http://mrbayes.csit.fsu.edu/
http://mrbayes.csit.fsu.edu/wiki/index.php/Main_Page
As for ABySS, what is it, and where can it be found?
It doesn't look like a deep ocean circulation model, as the name suggests.
My $0.02
Gus Correa