I don't know if it's the same problem or not (and we haven't tested on Myrinet), but we have one code which frequently hangs on smallish (64 node) runs. I unfortunately haven't been able to deep dive into the problem, but the hang is in a bcast call, where peers are doing sendrecv calls. All but one pair of processes progress fine; that one pair (which seems to differ each run) never completes the sendrecv. It appears that one direction of the exchange completes, but the other never does. It's a big enough message that the rendezvous protocol is in use, and the sender thinks it has sent the request, while the receiver thinks it hasn't received the request.
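For what it's worth, that failure mode matches a lost rendezvous handshake. Below is a purely conceptual Python sketch (my own simplification using queues and threads, not Open MPI's actual code) of the RTS/CTS handshake used for large messages, showing how a dropped request-to-send leaves both sides stuck in exactly those two states:

```python
import queue
from concurrent.futures import ThreadPoolExecutor

TIMEOUT = 0.5  # stand-in for "waits forever" so the sketch terminates


def sender(rts_chan, cts_chan, data_chan, payload, drop_rts=False):
    # Rendezvous: announce the large message with a small RTS control packet.
    if not drop_rts:
        rts_chan.put(("RTS", len(payload)))
    # Either way, the sender's bookkeeping says the request went out.
    try:
        cts_chan.get(timeout=TIMEOUT)            # wait for clear-to-send
    except queue.Empty:
        return "sender stuck: thinks RTS was sent, no CTS arrived"
    data_chan.put(payload)                        # only now move the payload
    return "sender done"


def receiver(rts_chan, cts_chan, data_chan):
    try:
        _, size = rts_chan.get(timeout=TIMEOUT)   # wait for the RTS
    except queue.Empty:
        return "receiver stuck: never saw an RTS"
    cts_chan.put(("CTS", size))                   # grant clear-to-send
    return "receiver got %d bytes" % len(data_chan.get())


def exchange(drop_rts):
    rts, cts, data = queue.Queue(), queue.Queue(), queue.Queue()
    with ThreadPoolExecutor(max_workers=2) as ex:
        s = ex.submit(sender, rts, cts, data, b"hello", drop_rts)
        r = ex.submit(receiver, rts, cts, data)
        return s.result(), r.result()


print(exchange(drop_rts=False))  # both sides complete
print(exchange(drop_rts=True))   # both sides stuck, as in the hang above
```

In the dropped-RTS case the sender believes the request left while the receiver never saw it, which is the pair of inconsistent states described above.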
Unfortunately, I haven't been able to look into the problem in any detail beyond that. Other projects seem to be consuming all my time lately...
Brian W. Barrett
Scalable System Software Group
Sandia National Laboratories
From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of Lydia Heck [lydia.heck_at_[hidden]]
Sent: Sunday, May 16, 2010 11:32 AM
Subject: [OMPI users] gadget-3 locks up using openmpi and infiniband (or myrinet)
One of the big cosmology codes is Gadget-3 (Springel et al).
The code uses MPI for interprocess communications. At the ICC in Durham we use
OpenMPI and have been using it for ~3 years.
At the ICC Gadget-3 is one of the major research codes and we have been running
it since it was written and we have observed something which is very worrying:
When running over gigabit using -mca btl tcp,self,sm the code runs alright,
which is good as the largest part of our cluster is over gigabit, and as
Gadget-3 scales rather well, the penalty for running over gigabit is not severe.
We also have a myrinet cluster and on there larger runs freeze. However, as
the gigabit cluster was available, we have not really investigated this until now.
We currently have access to an infiniband cluster and we found the following:
in a specific set of blocking sendrecv calls the code seems to communicate in
pairs until, in the end, only one pair of processes is left, and there it
deadlocks. For that pair the processes have set up communications: they know
about each other's IDs and they know what datatype to communicate, but they
never communicate that data. The precise point of failure is not pinpointable,
i.e. in consecutive runs it does not freeze at the same point in the run. This
is using openmpi, and it has persisted across different versions of openmpi
(judging from our myrinet experience).
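To illustrate the "communicates in pairs" pattern: many collective and exchange implementations use a recursive-doubling schedule, where at step k rank r partners with rank r XOR 2^k. This is an assumption on my part about the pattern in play; Gadget-3's actual sendrecv loop may well differ. A small sketch that prints the pairing per step for 8 ranks:

```python
def pairing_schedule(nprocs):
    """Recursive-doubling pairing: at step k, rank r partners with r ^ (2**k)."""
    steps, dist = [], 1
    while dist < nprocs:
        pairs = sorted({tuple(sorted((r, r ^ dist))) for r in range(nprocs)})
        steps.append(pairs)
        dist *= 2
    return steps


for step, pairs in enumerate(pairing_schedule(8)):
    print("step", step, pairs)
# step 0: (0,1) (2,3) (4,5) (6,7)
# step 1: (0,2) (1,3) (4,6) (5,7)
# step 2: (0,4) (1,5) (2,6) (3,7)
```

In such a schedule every rank does a blocking sendrecv with its partner at each step, so a single pair whose exchange never completes stalls the entire collective, which matches the "one pair left" symptom.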
I should mention that communications on both the myrinet cluster and the
infiniband cluster do work properly, as runs of other codes (castep, b_eff)
complete successfully.
So my question(s) is (are): has anybody had similar experiences and/or would
anybody have an idea why this could happen and/or what we could do about it?
Dr E L Heck
University of Durham
Institute for Computational Cosmology
Department of Physics
DURHAM, DH1 3LE
Tel.: + 44 191 - 334 3628
Fax.: + 44 191 - 334 3645