Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] dropping a pls module into an Open MPI build
From: Dean Dauger, Ph. D. (d_at_[hidden])
Date: 2008-01-21 14:57:31

> Which source checkout did you use? Note that the pls structures have
> likely changed between the OMPI SVN trunk and the v1.2 branch.

> Hmm -- are you saying that you tried compiling the Apple copy of the
> rsh pls and/or the OMPI SVN v1.2.3 rsh pls and neither of them worked?

Yes, I tried both of those and they gave the same bus error. If I'm
reading the stack dump right:

[Rotarran-X-5:04475] Failing at address: 0x0
[ 1] [0xbffff828, 0x00000000] (-P-)
[ 2] (orterun + 0x457) [0xbffff8b8, 0x00001d07]

it looks like orterun() is calling through a null function pointer.
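
One way a call can land on address 0x0 is a function-pointer slot that the caller expects to be filled but the module left empty, for example when the module and the caller were built against different struct layouts. That is only a possible explanation here, not a confirmed one. The sketch below is a minimal illustration; `demo_module_t`, `demo_launch` and `call_checked` are hypothetical names, not the real ORTE pls structures or functions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical function table; NOT the real ORTE pls module struct. */
typedef int (*launch_fn_t)(int jobid);

typedef struct {
    launch_fn_t launch;     /* start a job */
    launch_fn_t terminate;  /* stop a job  */
} demo_module_t;

/* Stand-in implementation that simply echoes the job id. */
static int demo_launch(int jobid)
{
    return jobid;
}

/* Call fn if the slot is filled; return -1 if it is NULL.
 * Calling through a NULL slot without this check would jump to
 * address 0x0, which is what a "Failing at address: 0x0" report
 * with a zero pc in the top frame suggests. */
static int call_checked(launch_fn_t fn, const char *name, int jobid)
{
    if (fn == NULL) {
        fprintf(stderr, "module slot '%s' is NULL; refusing to call\n", name);
        return -1;
    }
    return fn(jobid);
}
```

With a table initialized as `{ demo_launch, NULL }`, `call_checked(m.launch, "launch", 7)` returns 7, while `call_checked(m.terminate, "terminate", 7)` prints a diagnostic and returns -1 instead of crashing.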

> I don't rightly know why that wouldn't work -- is there a way to know
> with what compiler flags Apple built Open MPI?

I'm not sure, but I think these are the configure flags they use:

--disable-mpi-f77 --without-cs-fs
--enable-mca-no-build=ras-slurm,pls-slurm,gpr-null,sds-pipe,sds-slurm,pml-cm
--mandir=/usr/share/man --sysconfdir=/usr/share NM="nm -p"

> Can you step through
> mpirun with a debugger to see where it dies? I suspect it may not
> have any debugging symbols, so you might not, but at least you might
> be able to see which pls rsh functions are invoked...? (and more
> importantly, if something is invoked "wrong" in the pls rsh)

Adding some printfs to the rsh pls shows that its _init and _open
routines run and return successfully. I'll see if I can figure out
which part of orterun() corresponds to "orterun + 0x457". I have not
attempted to replace orterun/mpirun/etc., only the pls pieces.

Thank you,