As a quick follow-up to my own post, I just tried this on a few other
systems:
1) On one system, where each node has only one ethernet device, running
the code with the split "-np" arguments works fine.
2) Another system, which has IB links (as default), runs the code fine.
3) Two very similar systems, each with two ethernet devices per node
(hence the mca parameters). On both of these systems the code does
*not* work, giving the connection errors shown earlier.
I'll try a few more things tomorrow, but I have to imagine other people
have seen this. Or am I just missing a crucial mca parameter?
Thanks very much,
Yale University HPC