> Which source checkout did you use? Note that the pls structures have
> likely changed between the OMPI SVN trunk and the v1.2 branch.
> Hmm -- are you saying that you tried compiling the Apple copy of the
> rsh pls and/or the OMPI SVN v1.2.3 rsh pls and neither of them worked?
Yes, I tried both of those and they gave the same bus error. If I'm
reading the stack dump right:
[Rotarran-X-5:04475] Failing at address: 0x0
[ 1] [0xbffff828, 0x00000000] (-P-)
[ 2] (orterun + 0x457) [0xbffff8b8, 0x00001d07]
it's orterun() calling through a null function pointer.
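
A minimal sketch of how a frame at 0x00000000 can arise, assuming the cause is a module struct whose function-pointer slot was never filled in (for example, because the component was built against a different struct layout than the caller expects). Every name below (demo_module_t, demo_safe_terminate, and so on) is hypothetical and is not the actual ORTE pls structure:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a launcher module: a table of function
 * pointers. This is NOT the real ORTE pls module layout. */
typedef struct {
    int (*launch)(int jobid);
    int (*terminate)(int jobid);
    int (*finalize)(void);
} demo_module_t;

static int demo_launch(int jobid) { return jobid >= 0 ? 0 : -1; }
static int demo_finalize(void) { return 0; }

/* Assumed scenario: the terminate slot was left empty. */
static const demo_module_t demo_module = {
    .launch = demo_launch,
    .terminate = NULL,
    .finalize = demo_finalize,
};

/* Call terminate only if the slot is populated. Calling it unchecked
 * would jump to address 0 and fault there. */
static int demo_safe_terminate(const demo_module_t *m, int jobid)
{
    assert(m != NULL);
    if (m->terminate == NULL) {
        fprintf(stderr, "terminate slot is NULL; refusing to call it\n");
        return -1;
    }
    return m->terminate(jobid);
}
```

Calling demo_safe_terminate(&demo_module, 1) reports the empty slot and returns -1, whereas calling demo_module.terminate(1) directly would crash with a failing address of 0x0, much like the dump above.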
> I don't rightly know why that wouldn't work -- is there a way to know
> with what compiler flags Apple built Open MPI?
I'm not sure, but I think these are the configure flags they use:
--disable-mpi-f77 --without-cs-fs
--enable-mca-no-build=ras-slurm,pls-slurm,gpr-null,sds-pipe,sds-slurm,pml-cm
--mandir=/usr/share/man --sysconfdir=/usr/share NM="nm -p"
> Can you step through
> mpirun with a debugger to see where it dies? I suspect it may not
> have any debugging symbols, so you might not, but at least you might
> be able to see which pls rsh functions are invoked...? (and more
> importantly, if something is invoked "wrong" in the pls rsh)
Adding some printf calls to the pls rsh shows that the _init and _open
routines run and return successfully. I'll see if I can figure out
which part of orterun() corresponds to "orterun + 0x457". I have not
attempted to replace orterun/mpirun/etc., only the pls pieces.
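
The printf tracing described above can be sketched roughly as follows. This is only an illustration: DEMO_TRACE, demo_trace_at, and demo_component_open are made-up names, not functions from the pls rsh source. Writing to stderr and flushing after each message helps make sure the last trace line is visible even if the process dies right afterwards:

```c
#include <assert.h>
#include <stdio.h>

/* Counts trace calls so the sketch can be checked without parsing stderr. */
static int demo_trace_count = 0;

/* Print one trace line and flush immediately, so the message is not
 * lost if the process crashes shortly after. */
static void demo_trace_at(const char *func, int line, const char *msg)
{
    assert(func != NULL && msg != NULL);
    fprintf(stderr, "[trace] %s:%d: %s\n", func, line, msg);
    fflush(stderr);
    demo_trace_count++;
}

#define DEMO_TRACE(msg) demo_trace_at(__func__, __LINE__, (msg))

/* Hypothetical component routine instrumented at entry and exit. */
static int demo_component_open(void)
{
    DEMO_TRACE("enter");
    /* real component work would go here */
    DEMO_TRACE("exit");
    return 0;
}
```

If the "enter" line appears without a matching "exit" line, the crash happened inside that routine; if both appear for every instrumented routine, the fault lies in the caller or in a routine that was not instrumented.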