Open MPI User's Mailing List Archives

Subject: [OMPI users] ORTE_ERROR_LOG Timeout
From: Rudd, James (jrudd_at_[hidden])
Date: 2008-05-20 12:17:03


I have been trying to compile a molecular dynamics program with the Open MPI 1.2.5 included in OFED 1.3. I am running Fedora Core 6; the output of uname -r is 2.6.18-1.2798.fc6. I've traced the problems I've been having back to Open MPI, because I'm unable to run test programs such as glob on more than one node. I currently have 2 nodes connected to an InfiniBand switch, with opensm running on node1. The nodes can ping each other, and I can ssh between them without a password. My openmpi-default-hostfile includes the following:

node1 slots=2 max-slots=4
node2 slots=4 max-slots=4
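
As a quick sanity check on the hostfile above, here is a minimal Python sketch that sums the declared slots and compares them against the requested process count. It is not part of Open MPI: the names parse_hostfile and total are hypothetical, and the parser assumes only the simple "host key=value" layout shown above.

```python
# Hostfile contents copied from the post.
HOSTFILE = """\
node1 slots=2 max-slots=4
node2 slots=4 max-slots=4
"""

def parse_hostfile(text):
    """Parse 'host key=value ...' lines into {host: {key: int}}.

    Assumption: every option is an integer key=value pair, as in the
    two lines above; other hostfile syntax is not handled.
    """
    hosts = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        name, *options = line.split()
        entry = {}
        for opt in options:
            key, _, value = opt.partition("=")
            entry[key] = int(value)
        hosts[name] = entry
    return hosts

def total(hosts, key):
    """Sum one option (e.g. 'slots') over all hosts; missing counts as 0."""
    return sum(h.get(key, 0) for h in hosts.values())

hosts = parse_hostfile(HOSTFILE)
np_requested = 4  # the -np value used in the mpirun command
print("slots:", total(hosts, "slots"), "max-slots:", total(hosts, "max-slots"))
print("fits within slots:", np_requested <= total(hosts, "slots"))
```

With the two entries above, the declared slots add up to 6 and max-slots to 8, so a 4-process run is within what the hostfile allows; the hostfile totals by themselves do not explain the hang.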

When I run "mpirun -np 4 --debug-daemons ./glob" I get:
Daemon [0,0,1] checking in as pid 21341 on host node1
and the program appears to hang. Once I press CTRL+C a couple of times, I get the contents of error.txt.

Per the instructions in the FAQ, I've included the output of "ibv_devinfo", "ifconfig", and "ulimit -l" in the infiniband_info.txt file. The results of "ompi_info -all" are in the ompi_info.txt file.

I've been tearing my hair out over this; any help would be greatly appreciated.

James Rudd
JLC-Biomedical/Biotechnology Research Institute
North Carolina Central University
700 George Street
Durham, NC 27707
Phone: (919) 530-7015
Email: jrudd_at_[hidden]
http://ariel.acc.nccu.edu/Academics/BBRI/personnel/rudd.htm