I'm transitioning from LAM-MPI to OpenMPI and have just compiled OMPI
1.0.2 on OS X Server 10.4.6. I'm using gcc 3.3 and XLF (both f77 and
f90), and I'm launching jobs over ssh. My cluster is all dual-2GHz+
G5 Xserves, and I'm using both Ethernet ports for communication:
one for NFS and the other for MPI.
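In case it matters: I haven't set any MCA parameters, so I assume OpenMPI sees both interfaces. My understanding (which may well be wrong) is that the TCP transport could be pinned to the MPI network with something like the following, though I haven't tried it, and the interface name is just a guess:

```shell
# Restrict OpenMPI's TCP transport to the MPI-dedicated interface.
# "en1" is a guess at the right device name; check with ifconfig.
mpirun --mca btl_tcp_if_include en1 -np 4 ./a.out
```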
I've had few problems over the past year running this configuration
with LAM-MPI (latest release), but what worked before doesn't with
OpenMPI 1.0.2. When I run any parallel job that spans multiple
machines, the processes run indefinitely. I've checked this using the
BLACS and PBLAS test routines, the HPL benchmark, and even a simple
mpi-pong program. All of them start but hang after some initial
output, consuming 100% of the CPU on every node that's launched. In
contrast, all of these programs finish in a few seconds on a single
node, with two processors, up to -np 8. When I Ctrl-C to stop the
program, OpenMPI cleanly stops all the processes, no matter how many
machines were used.
I noticed a couple of postings from the past few months that seem
related, but the symptoms didn't seem to be quite the same. Any ideas
what could be going on?
OpenMPI is a really great project, and the quality of the software
development that has gone into it is obvious. I appreciate all your
help. My config.log and ompi-info.out files are attached.
Aerospace Engineering Sciences
University of Colorado