On Thu, 03 May 2012, Rolf vandeVaart wrote:
> I tried your program on a single node and it worked fine.
It works fine on a single node, but deadlocks when it communicates
between nodes. Single-node communication doesn't use TCP by default.
> Yes, TCP message passing in Open MPI has been working well for some
> time.
Ok. Which version(s) of Open MPI are you using successfully? [I'm
assuming that this is in an environment which doesn't use IB.]
> 1. Can you run something like hostname successfully (mpirun -np 10
> -hostfile yourhostfile hostname)
Yes, but this only shows that the processes start and that their output
is returned; it doesn't exercise the in-band message passing at all.
> 2. If that works, then you can also run with a debug switch to see
> what connections are being made by MPI.
You can see the connections being made in the attached log:
[archimedes:29820] btl: tcp: attempting to connect() to [[60576,1],2] address 138.23.141.162 on port 2001
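One way to check, independently of MPI, whether a connect() like the one
in that log line can succeed at all is a small TCP probe. The following is
only a minimal sketch under my own assumptions: probe_tcp is a
hypothetical helper name, not anything from Open MPI, and the address and
port in the usage note are simply the ones from the log line above. It
would presumably have to be run from the connecting node while the job is
still hung, since the port is only likely to be open while the remote
process is listening.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Try a plain TCP connect to (host, port) and report what happened.

    Returns one of:
      "open"     - the connection was accepted
      "refused"  - the host answered, but nothing is listening on that port
      "timeout"  - no answer within `timeout` seconds (often a firewall drop)
      "error: …" - any other socket-level failure (e.g. no route to host)
    """
    try:
        # create_connection resolves the host and performs the connect()
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc
```

Calling probe_tcp("138.23.141.162", 2001) from archimedes would then
distinguish a refused connection from one that is silently dropped, which
is roughly the difference between "nothing listening" and "something
filtering the traffic".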
> I would suggest reading through here for some ideas and for the
> debug switch.
Thanks. I checked the FAQ, and didn't see anything that shed any
light, unfortunately.
Don Armstrong
--
Fate and Temperament are two words for one and the same concept.
-- Novalis [Hermann Hesse _Demian_]
http://www.donarmstrong.com http://rzlab.ucr.edu