I have what is probably a simple question (I hope). I have built
openmpi-1.1.1 from source using gfortran on Mac OS X 10.4.7. I can
run parallel jobs on my own machine using the mpiexec -np command. My
machinefile contains the lines:
tachyon.a04.aist.go.jp
tachyon.a04.aist.go.jp
gehirn.local
gehirn.local
(the .local name relies on zeroconf to find the address of gehirn, and
that works). When I run a parallel job on my own machine (-np 2),
everything is fine: the job runs in parallel, it is faster, and the
output is correct. When I try running with -np 4 to use an additional
dual-CPU G5 machine, my job hangs while consuming large amounts of CPU
time (runaway processes). This continues without output until I
interrupt mpiexec with ^C (which terminates the processes on all
machines). I am running the task via ssh using an ssh-agent. Might
anyone have any idea what could be wrong? I have attached my config.log
and ompi_info files (bzip2'ed) to this mail as specified in the mailing
list instructions. I am guessing this should be a simple thing, but
it is taking too much time to figure out on my own (e.g. I
couldn't find a FAQ entry or a user question/reply that answered this).
Paul Fons
Script started on Tue Sep 5 16:01:18 2006
[tachyon:exafs/feff85/zno] paulfons% mpiexec -machinefile machinefile
-np 2 hostname
tachyon.a04.aist.go.jp
tachyon.a04.aist.go.jp
[tachyon:exafs/feff85/zno] paulfons% mpiexec -machinefile machinefile
-np 2 /opt/feff/feff85/rdinp
Number of processors = 2
Feff 8.40
XANES:
name: zincite ZnO
formula: ZnO
sites: Zn1,O1
refer1: wyckoff, vol 1, ch III, p 111
refer2:
schoen:
notes1:
[tachyon:exafs/feff85/zno] paulfons% mpiexec -machinefile machinefile
-np 4 hostname
tachyon.a04.aist.go.jp
tachyon.a04.aist.go.jp
dhcp054092.a04.aist.go.jp
dhcp054092.a04.aist.go.jp
[tachyon:exafs/feff85/zno] paulfons% mpiexec -machinefile machinefile
-np 4 /opt/feff/feff85/rdinp
Number of processors = 4
Feff 8.40
XANES:
name: zincite ZnO
formula: ZnO
sites: Zn1,O1
refer1: wyckoff, vol 1, ch III, p 111
refer2:
schoen:
notes1:
^Cmpiexec: killing job...