All --debug-daemons really does is keep the ssh session open after launching the remote daemon and turn on some output. Otherwise, we close that session as most systems only allow a limited number of concurrent ssh sessions to be open.

I suspect you have a system setting that kills any running job upon ssh close. It would be best if you removed that restriction. If you cannot, then you can always run your MPI jobs with --no-daemonize. This will keep the ssh session open, but without all the debug output.

That flag is just shorthand for an MCA param, so you can set it in your environment or put it in your default MCA param file.
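
A rough sketch of both options follows. The parameter name orte_no_daemonize is an assumption here, so confirm it against your install before relying on it. The OMPI_MCA_ prefix for environment variables and the $HOME/.openmpi/mca-params.conf location are the standard Open MPI conventions. MCA_CONF_DIR is a hypothetical override used only in this sketch.

```shell
# Hypothetical parameter name: confirm it on your install with
#   ompi_info --param all all | grep daemonize
PARAM_NAME="orte_no_daemonize"

# Option 1: environment. Open MPI reads MCA params from variables
# prefixed with OMPI_MCA_.
export "OMPI_MCA_${PARAM_NAME}=1"

# Option 2: per-user default param file, $HOME/.openmpi/mca-params.conf.
# MCA_CONF_DIR is only an override hook for this sketch, not an Open MPI variable.
CONF_DIR="${MCA_CONF_DIR:-$HOME/.openmpi}"
CONF_FILE="$CONF_DIR/mca-params.conf"
mkdir -p "$CONF_DIR"

# Append the setting only if the file does not already mention the param.
grep -q "^${PARAM_NAME}[ =]" "$CONF_FILE" 2>/dev/null \
    || echo "${PARAM_NAME} = 1" >> "$CONF_FILE"
```

Either option alone should be enough. The file is probably the more convenient one when the home directory is shared across the nodes, since it only has to be written once.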


On Dec 28, 2010, at 3:31 AM, Advanced Computing Group University of Padova wrote:

Yes, I've tested them.
In fact, with the --debug-daemons switch everything works fine! (And I can see that a process called orted... is started on the nodes whenever I launch a test application.)
I believe this is an environment variable problem...

On Mon, Dec 27, 2010 at 10:16 PM, David Zhang <solarbikedz@gmail.com> wrote:
Have you tested your ssh key setup, firewall, and switch settings to ensure all the nodes can talk to each other?

On Mon, Dec 27, 2010 at 1:07 AM, Advanced Computing Group University of Padova <acg.unipd@gmail.com> wrote:
I am using Open MPI 1.4.2.


On Fri, Dec 24, 2010 at 11:17 AM, Advanced Computing Group University of Padova <acg.unipd@gmail.com> wrote:
Hi,
I am building a small 16-node, Gentoo-based cluster.
I successfully installed Open MPI and successfully ran some small, simple parallel test programs on a single host, but...
I can't run a parallel program on more than one node.


The nodes are cloned (so they are identical).
The mpiuser account (and its ssh certificates) uses /home/mpiuser, which is an NFS share.
I modified .bashrc:

-------------------------
PATH=/usr/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/usr/lib64:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ;

# already present below
if [[ $- != *i* ]] ; then
        # Shell is non-interactive.  Be done now!
        return
fi
---------------------

The very strange behaviour is that using --debug-daemons lets my program run successfully.

Thank you in advance, and sorry for my bad English.





_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
David Zhang
University of California, San Diego
