I have been trying to compile a molecular dynamics program with the Open MPI 1.2.5 included in OFED 1.3. I am running Fedora Core 6; the output of uname -r is 2.6.18-1.2798.fc6. I’ve traced my problems back to Open MPI because I’m unable to run test programs such as glob on more than one node. I currently have 2 nodes connected to an InfiniBand switch, with opensm running on node1. The nodes can ping each other, and I am able to ssh between them without a password. My openmpi-default-hostfile includes the following:
node1 slots=2 max-slots=4
node2 slots=4 max-slots=4
When I run “mpirun -np 4 --debug-daemons ./glob” I get:
Daemon [0,0,1] checking in as pid 21341 on host node1
The program then appears to hang. Once I press CTRL+C a couple of times, I get the contents of error.txt.
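If I understand the default by-slot placement correctly, the hostfile above should put two of the four processes on node1 and the other two on node2, so I would have expected a daemon on node2 to check in as well. Below is a rough Python sketch of that reasoning. It is only my own illustration, not Open MPI code: the names parse_hostfile and place_by_slot are made up, and it assumes that slots are filled host by host in hostfile order and ignores max-slots and oversubscription.

```python
# Illustration only: models by-slot placement under the assumptions stated above.
# parse_hostfile and place_by_slot are hypothetical helpers, not Open MPI APIs.

def parse_hostfile(text):
    """Return [(host, slots)] from lines like 'node1 slots=2 max-slots=4'."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        opts = dict(f.split("=", 1) for f in fields[1:])
        # Assumption: a host without an explicit slots= entry counts as 1 slot.
        hosts.append((fields[0], int(opts.get("slots", 1))))
    return hosts


def place_by_slot(hosts, nprocs):
    """Fill each host's slots in order until nprocs ranks are placed.

    Returns fewer than nprocs entries if the hostfile runs out of slots,
    because oversubscription is not modelled here.
    """
    placement = []
    for host, slots in hosts:
        for _ in range(slots):
            if len(placement) == nprocs:
                return placement
            placement.append(host)
    return placement


HOSTFILE = """\
node1 slots=2 max-slots=4
node2 slots=4 max-slots=4
"""

placement = place_by_slot(parse_hostfile(HOSTFILE), 4)
for rank, host in enumerate(placement):
    print(rank, host)
```

Under those assumptions, ranks 0 and 1 land on node1 and ranks 2 and 3 on node2, which is why I suspect the part that starts processes on node2 is where things get stuck.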
Per the instructions in the FAQ, I’ve included the output of “ibv_devinfo”, “ifconfig”, and “ulimit -l” in the infiniband_info.txt file. The output of “ompi_info --all” is in the ompi_info.txt file.
I’ve been tearing my hair out over this; any help would be greatly appreciated.
JLC-Biomedical/Biotechnology Research Institute
North Carolina Central University
700 George Street
Durham, NC 27707
Phone: (919) 530-7015