Have you tested your ssh key setup, firewall, and switch settings to ensure all nodes can talk to each other?
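
A rough sketch of how the ssh part could be checked from the head node. It assumes a hostfile with one hostname per line (the file name ./hosts is hypothetical) and that orted, the Open MPI daemon, should be on the PATH of a non-interactive shell on every node. It does not test the firewall or the switch; it only shows whether key-based login works without a prompt.

```shell
#!/bin/sh
# Check that every node accepts key-based ssh logins and can find orted
# from a non-interactive shell. SSH can be overridden for a dry run.
SSH="${SSH:-ssh}"

check_nodes() {
    # $1: hostfile, one hostname per line (anything after the name is ignored)
    failed=0
    while read -r host rest || [ -n "$host" ]; do
        # Skip blank lines and comment lines.
        case "$host" in ''|'#'*) continue ;; esac
        # BatchMode=yes makes ssh fail instead of asking for a password;
        # -n keeps ssh from swallowing the rest of the hostfile on stdin.
        if "$SSH" -n -o BatchMode=yes -o ConnectTimeout=5 "$host" 'command -v orted' >/dev/null 2>&1; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done < "$1"
    return "$failed"
}

# Usage: check_nodes ./hosts
```

Any node reported as FAILED either refuses a passwordless login or does not have orted on its non-interactive PATH, so that is where I would look first.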

On Mon, Dec 27, 2010 at 1:07 AM, Advanced Computing Group University of Padova <acg.unipd@gmail.com> wrote:
I am using Open MPI 1.4.2.


On Fri, Dec 24, 2010 at 11:17 AM, Advanced Computing Group University of Padova <acg.unipd@gmail.com> wrote:
Hi,
I am building a small 16-node Gentoo-based cluster.
I successfully installed Open MPI and ran some small parallel test programs on a single host, but
I can't run a parallel program on more than one node.


The nodes are cloned, so they are identical.
The mpiuser account (and its ssh keys) lives in /home/mpiuser, which is an NFS share.
I modified .bashrc as follows:

-------------------------
PATH=/usr/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/usr/lib64:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ;

# already present below
if [[ $- != *i* ]] ; then
        # Shell is non-interactive.  Be done now!
        return
fi
---------------------

The very strange behaviour is that running with --debug-daemons lets my program run successfully.

Thank you in advance, and sorry for my bad English.








--
David Zhang
University of California, San Diego