Open MPI User's Mailing List Archives

From: Chengwen Chen (chenchengwen_at_[hidden])
Date: 2006-07-24 03:47:55


I have tried Open MPI v1.1, but the program I am using (AMBER9) can't be
compiled correctly with v1.1. So it seems that I have to keep using
openmpi-1.0.2.
I am new to Linux and really have no idea how to use a debugger. Would you
please suggest a simple way to try this?
Thank you very much!
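
A minimal sketch of the valgrind suggestion quoted below, assuming valgrind is installed on every node and the application was built with debugging symbols (-g). The host names and test.x come from this thread; the log file name vg.%p.log is only an illustrative choice. The snippet just assembles and prints the command line so it can be inspected before being run by hand.

```shell
# Sketch: build an mpirun command line that starts each MPI rank under valgrind.
# NP, HOSTS and APP are placeholders taken from this thread; adjust as needed.
NP=2
HOSTS="wolf45,wolf46"
APP="/tmp/test.x"

# valgrind goes between mpirun and the application, so every rank is checked
# by its default memcheck tool. %p in --log-file expands to the process ID,
# giving one log file per rank (the name vg.%p.log is an assumption).
CMD="mpirun -np $NP --host $HOSTS valgrind --log-file=vg.%p.log $APP"

# Print the command instead of running it, so it can be reviewed first.
echo "$CMD"
```

Pasting the printed line into the shell on wolf45 would run it; the resulting vg.*.log files could then be searched for "Invalid read" or "Invalid write" reports.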

On 7/6/06, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
>
> Ick. This isn't a helpful error message, is it? :-)
>
> Can you try upgrading to the recently-released v1.1 and see if the error
> is still occurring?
>
> Have you tried running your application through a memory-checking debugger
> such as valgrind, perchance?
>
>
> ------------------------------
> *From:* users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] *On
> Behalf Of* Chengwen Chen
> *Sent:* Wednesday, July 05, 2006 3:32 AM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] error in running openmpi on remote node
>
>
>
> Thank you very much. This problem was solved when I changed the shell on
> the remote node to bash, because I had set LD_LIBRARY_PATH in the .bashrc
> file while the default shell was the C shell.
>
> Although it works with my test program test.x, some errors occurred when
> I ran another program. BTW, I ran this program successfully on a single PC
> with 2 np.
>
> Any suggestions? Thank you
>
> [say_at_wolf45 tmp]$ mpirun -np 2 --host wolf45,wolf46
> /usr/local/amber9/exe/sander.MPI -O -i /tmp/amber9mintest.in -o
> /tmp/amber9mintest.out -c /tmp/amber9mintest.inpcrd -p
> /tmp/amber9mintest.prmtop -r /tmp/amber9mintest.rst
> [wolf46.chem.cuhk.edu.hk:06002] *** An error occurred in MPI_Barrier
> [wolf46.chem.cuhk.edu.hk:06002] *** on communicator MPI_COMM_WORLD
> [wolf46.chem.cuhk.edu.hk:06002] *** MPI_ERR_INTERN: internal error
> [wolf46.chem.cuhk.edu.hk:06002] *** MPI_ERRORS_ARE_FATAL (goodbye)
> 1 process killed (possibly by Open MPI)
>
> On 7/4/06, Brian Barrett <brbarret_at_[hidden]> wrote:
> >
> > On Jul 4, 2006, at 1:53 AM, Chengwen Chen wrote:
> >
> > > Dear openmpi users,
> > >
> > > I am using openmpi-1.0.2 on Red Hat Linux. I can successfully run
> > > mpirun on a single PC with 2 np, but it fails on a remote node. Can
> > > you give me some advice? Thank you very much in advance.
> > >
> > > [say_at_wolf45 tmp]$ mpirun -np 2 /tmp/test.x
> > >
> > > [say_at_wolf45 tmp]$ mpirun -np 2 --host wolf45,wolf46 /tmp/test.x
> > > say_at_wolf46's password:
> > > orted: Command not found.
> > > [wolf45:11357] ERROR: A daemon on node wolf46 failed to start as
> > > expected.
> > > [wolf45:11357] ERROR: There may be more information available from
> > > [wolf45:11357] ERROR: the remote shell (see above).
> > > [wolf45:11357] ERROR: The daemon exited unexpectedly with status 1.
> >
> > Kefeng is correct that you should set up your ssh keys so that you
> > aren't prompted for a password, but that isn't the cause of your
> > failure. The problem appears to be that orted (one of the Open MPI
> > commands) is not in your path on the remote node. You should take a
> > look at one of the other FAQ sections on the setup required for Open
> > MPI in an rsh/ssh type environment.
> >
> > http://www.open-mpi.org/faq/?category=running
> >
> >
> > Hope this helps,
> >
> > Brian
> >
> > --
> > Brian Barrett
> > Open MPI developer
> > http://www.open-mpi.org/
> >
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
>
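
For the "orted: Command not found" failure discussed above, a minimal sketch of how one might check whether a command resolves in the PATH that a shell actually sees. The helper name check_cmd is hypothetical and not part of Open MPI; the remote commands in the comments assume the host name wolf46 from this thread and passwordless ssh.

```shell
# check_cmd NAME: report whether NAME resolves in the current PATH.
# (check_cmd is a hypothetical helper, not an Open MPI tool.)
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found in PATH"
        return 1
    fi
}

# Local check on the node where mpirun is started. "|| true" keeps the
# script going when orted is missing, so the message is still printed.
check_cmd orted || true

# The remote side must be checked through a non-interactive ssh command,
# because that is the environment mpirun gets when it starts orted there:
#   ssh wolf46 which orted
#   ssh wolf46 printenv PATH
#   ssh wolf46 printenv LD_LIBRARY_PATH
```

If the remote "which orted" finds nothing, the PATH and LD_LIBRARY_PATH settings probably live in a startup file that the remote login shell does not read, as with .bashrc under the C shell earlier in this thread.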