I have been trying to compile a molecular dynamics program with the Open MPI 1.2.5 included in OFED 1.3.  I am running Fedora Core 6; the output of uname -r is 2.6.18-1.2798.fc6.  I’ve traced the problems I’ve been having back to Open MPI, because I’m unable to run test programs such as glob on more than one node.  I currently have 2 nodes connected to an InfiniBand switch, with opensm running on node1.  The nodes can ping each other and I am able to ssh between them without a password.  My openmpi-default-hostfile includes the following:

 

node1 slots=2 max-slots=4

node2 slots=4 max-slots=4

 

When I run “mpirun -np 4 --debug-daemons ./glob” I get:

Daemon [0,0,1] checking in as pid 21341 on host node1

The program then appears to hang.  Once I press CTRL+C a couple of times, I get the contents of error.txt.
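
For reference, here is a rough sketch of how the hostfile above would spread a 4-process job across the two nodes. It assumes by-slot placement (fill each host's slots in hostfile order before moving to the next host) and ignores max-slots and oversubscription entirely. The helper names parse_hostfile and place_by_slot are made up for illustration and are not part of Open MPI.

```python
def parse_hostfile(text):
    """Parse 'host key=value ...' lines into a list of (host, slots) pairs."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, *opts = line.split()
        fields = dict(opt.split("=", 1) for opt in opts)
        # Assumption: a host with no explicit slots value counts as 1 slot.
        hosts.append((name, int(fields.get("slots", 1))))
    return hosts


def place_by_slot(hosts, nprocs):
    """Assign processes host by host, filling each host's slots first.

    Simplification: raises instead of oversubscribing when slots run out.
    """
    placement = {name: 0 for name, _ in hosts}
    remaining = nprocs
    for name, slots in hosts:
        take = min(slots, remaining)
        placement[name] = take
        remaining -= take
    if remaining:
        raise ValueError(f"not enough slots for {nprocs} processes")
    return placement


HOSTFILE = """
node1 slots=2 max-slots=4
node2 slots=4 max-slots=4
"""

placement = place_by_slot(parse_hostfile(HOSTFILE), 4)
print(placement)  # {'node1': 2, 'node2': 2}
```

Under that assumption, two of the four processes would land on node2, so a daemon would have to start on node2 as well, yet only the daemon on node1 checks in before the hang.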

 

Per the instructions in the FAQ, I’ve included the output of “ibv_devinfo”, “ifconfig”, and “ulimit -l” in the infiniband_info.txt file. The output of “ompi_info --all” is in the ompi_info.txt file.

 

I’ve been tearing my hair out over this; any help would be greatly appreciated.

 

James Rudd

JLC-Biomedical/Biotechnology Research Institute

North Carolina Central University

700 George Street

Durham, NC 27707

Phone:  (919) 530-7015

Email:  jrudd@nccu.edu

http://ariel.acc.nccu.edu/Academics/BBRI/personnel/rudd.htm