Open MPI Development Mailing List Archives

From: Bill Wichser (bill_at_[hidden])
Date: 2007-08-06 09:53:20

We have run across an issue, probably related more to openib than to
openmpi, but we don't know how to resolve it.

Linux kernel - 2.6.9-55.0.2.ELsmp x86_64

openmpi - it doesn't matter - 1.1.5 and 1.2.3 both fail.

When the sample code is run across IB nodes, using the IB interface, the
receive hangs whenever a system() call is issued. Removing the system()
call removes the hang. Running across the nodes over TCP removes the
hang. Running on a single node removes the hang. The hang occurs only
when using the IB interface.

So the simple solution is "don't do this" but apparently something
deeper is involved and who knows where it will pop up again.


P.S. The sample code was compiled using mpicc, built with gcc. You'll
need a test.dat file for the system("cp") command.
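
The attached sample code is not reproduced in this archive. The following is only a minimal sketch of the kind of program described above, not the original attachment. It assumes two ranks, that the system("cp") call is issued on the receiving rank just before the receive, and a hypothetical destination file name (test.copy); the actual sample may differ on all of these points.

```c
/* Hypothetical reproducer sketch; NOT the original attachment.
 * Assumptions: 2 ranks, system() is called before MPI_Recv on rank 1,
 * and the copy destination "test.copy" is an invented name. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0)
            fprintf(stderr, "run with at least 2 processes\n");
        MPI_Finalize();
        return 1;
    }

    if (rank == 0) {
        /* Sender: one integer to rank 1. */
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* system() runs the command in a child process; test.dat must
         * exist in the working directory. */
        int rc = system("cp test.dat test.copy");
        if (rc != 0)
            fprintf(stderr, "system() returned %d\n", rc);

        /* The receive that is reported to hang over IB. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

To compare transports, a program like this could be launched on two nodes with "mpirun -np 2 --mca btl openib,self ./a.out" for the IB case and "mpirun -np 2 --mca btl tcp,self ./a.out" for the TCP case.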