Have you tried "mpirun -np 4 --hostfile hosts hostname" to verify that
Open MPI itself is working?
Can you log in remotely (e.g. with ssh) from each node to every other node?
If any node has more than one network device, are you using the Open MPI
options to specify which device to use?
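
In case it helps, here is a rough sketch of those three checks as a small
shell helper. It only prints the commands so you can review them before
running anything. Several things here are my assumptions, not facts about
your setup: that ssh is the launch method, that the hostfile has one host
name per line (optionally followed by "slots=N"), and that "eth0" is the
interface you want. The function names are made up for illustration.
btl_tcp_if_include is the MCA parameter that restricts the TCP transport to
the named interface(s).

```shell
#!/bin/sh
# Sketch only: prints the checks, does not run them.
# Assumptions: ssh is the launcher, one host per hostfile line.

# Print the host names in a hostfile, skipping comments and blank lines.
list_hosts() {
    sed -e 's/#.*//' "$1" | awk 'NF { print $1 }'
}

# Print one ssh check for every ordered pair of distinct hosts.
print_ssh_checks() {
    hosts=$(list_hosts "$1")
    for src in $hosts; do
        for dst in $hosts; do
            [ "$src" = "$dst" ] && continue
            echo "ssh $src ssh $dst hostname"
        done
    done
}

# Print the mpirun sanity check, optionally pinned to one TCP interface.
print_mpirun_check() {
    hostfile=$1
    iface=${2:-}
    if [ -n "$iface" ]; then
        echo "mpirun -np 4 --hostfile $hostfile --mca btl_tcp_if_include $iface hostname"
    else
        echo "mpirun -np 4 --hostfile $hostfile hostname"
    fi
}

# Example (hypothetical file and interface names):
#   print_ssh_checks hosts
#   print_mpirun_check hosts eth0
```

Each printed ssh line should return a host name without asking for a
password; if one of them prompts or hangs, that pair of nodes is the first
place to look.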
> Message: 5
> Date: Wed, 9 Apr 2008 14:15:34 +0200 (CEST)
> From: "danesh.d_at_[hidden]" <danesh.d_at_[hidden]>
> Subject: [OMPI users] Ang: Re: submitted job stops
> To: <users_at_[hidden]>
> Actually my program is a very simple MPI "Hello World" program which
> just prints the rank of each process and then terminates. When I run
> my program on a single-processor machine with e.g. 4 processes
> (oversubscribing) it shows:
> Hello world from processor with rank 0
> Hello world from processor with rank 3
> Hello world from processor with rank 1
> Hello world from processor with rank 2
> but when I use my remote machines everything just stops when
> I run the program.
> No, I do not use any queuing system. I simply run it like this:
> mpirun -np 4 --hostfile hosts ./hw
> and then it just stops until I terminate it manually. As I said,
> I monitored all machines (master + 2 slaves) and found that
> on all machines the "orted" daemon starts when I run the program, but
> after a few seconds the daemon terminates. What can be the reason?
> >----Original message----
> >From: reuti_at_[hidden]
> >Date: 09-04-2008 13:26
> >To: "Open MPI Users"<users_at_[hidden]>
> >Subject: Re: [OMPI users] submitted job stops
> >On 08.04.2008 at 21:58, Danesh Daroui wrote:
> >> I had posted a message about my problem and tried all the suggested
> >> solutions, but the problem is still not solved. The problem is that
> >> I have installed Open MPI on three machines (1 master + 2 slaves).
> >> When I submit a job to the master I can see that the "orted" daemon is
> >> launched on all machines (by running "top" on all machines), but all
> >> "orted" daemons terminate after a few seconds and nothing happens.
> >> At first I thought it was because the remote machines could not launch
> >> "orted", but now I am sure that it can be run on all machines without
> >> a problem. Any suggestions?
> >The question is rather: is your MPI program running successfully, or is
> >there simply no output from mpiexec/mpirun? And by "submit", do you mean
> >that you use a queuing system?
> >-- Reuti
> >users mailing list
> End of users Digest, Vol 863, Issue 1