Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] submitted job stops
From: Danesh Daroui (Danesh.D_at_[hidden])
Date: 2008-04-10 13:31:35


Thanks, Reuti. It works now. I disabled the firewall on all machines,
since Open MPI uses a random port each time.

Thanks again!

Danesh
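
The firewall diagnosis above can be checked without Open MPI at all. What
follows is a minimal sketch, not something posted in the thread: a small
Python helper that tries a plain TCP connection from one node to another.
The function names (port_reachable, probe) and the host:port command-line
format are assumptions made for illustration. A listener must already be
running on the target port for a successful result to mean anything. A
False result only says the connection did not succeed, so it does not by
itself distinguish a firewall from a missing listener.

```python
import socket
import sys


def port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (typical of a firewall that
        # silently drops packets) and host name resolution failures.
        return False


def probe(targets, timeout=2.0):
    """Map each (host, port) pair to whether a TCP connect succeeded."""
    return {(h, p): port_reachable(h, p, timeout) for h, p in targets}


if __name__ == "__main__":
    # Hypothetical usage: python probe.py node1:5000 node2:5000
    pairs = []
    for arg in sys.argv[1:]:
        host, _, port = arg.rpartition(":")
        pairs.append((host, int(port)))
    if not pairs:
        print("usage: probe.py host:port [host:port ...]")
    for (host, port), ok in probe(pairs).items():
        print(f"{host}:{port} {'reachable' if ok else 'NOT reachable'}")
```

If SSH works between the nodes but a probe like this fails on other ports,
that points to port filtering, which matches what was seen here: "orted"
could be started on the remote machines, yet the job still hung.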

Reuti wrote:
> Hi,
>
> On 09.04.2008 at 22:17, Danesh Daroui wrote:
>
>> Mark Kosmowski wrote:
>>
>>> Danesh:
>>>
>>> Have you tried "mpirun -np 4 --hostfile hosts hostname" to verify
>>> that ompi is working?
>>>
>>>
>> When I run "mpirun -np 4 --hostfile hosts hostname", the same thing
>> happens and it just hangs. Could that be a clue?
>>
>>
>>> Can you remote access from each node to each other node?
>>>
>>>
>> Yes, all nodes can reach each other via SSH and can log in
>> without being prompted for a password.
>>
>>
>>> If any node has more than 1 network device, are you using the ompi
>>> options to specify which device to use?
>>>
>>>
>> Each node has one network interface which works properly.
>>
>
> Do you have a firewall on the machines that blocks certain ports?
>
> -- Reuti
>
>
>
>> Regards,
>>
>> Danesh
>>
>>
>>
>>> Good luck,
>>>
>>> Mark
>>>
>>>
>>>
>>>> Message: 5
>>>> Date: Wed, 9 Apr 2008 14:15:34 +0200 (CEST)
>>>> From: "danesh.d_at_[hidden]" <danesh.d_at_[hidden]>
>>>> Subject: [OMPI users] Ang: Re: submitted job stops
>>>> To: <users_at_[hidden]>
>>>>
>>>>
>>>> Actually, my program is a very simple MPI "Hello World" program that
>>>> just prints the rank of each process and then terminates. When I run
>>>> it on a single-processor machine with, e.g., 4 processes
>>>> (oversubscribing), it shows:
>>>>
>>>> Hello world from processor with rank 0
>>>> Hello world from processor with rank 3
>>>> Hello world from processor with rank 1
>>>> Hello world from processor with rank 2
>>>>
>>>> but when I use my remote machines, everything just stops when
>>>> I run the program.
>>>>
>>>> No, I do not use any queuing system. I simply run it like this:
>>>>
>>>> mpirun -np 4 --hostfile hosts ./hw
>>>>
>>>> and then it just stops until I terminate it manually. As I said,
>>>> I monitored all machines (master + 2 slaves) and found that on all
>>>> machines the "orted" daemon starts when I run the program, but
>>>> after a few seconds the daemon terminates. What could be the reason?
>>>>
>>>> Thanks,
>>>>
>>>> Danesh
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> ----Original message----
>>>>> From: reuti_at_[hidden]
>>>>> Date: 09-04-2008 13:26
>>>>> To: "Open MPI Users"<users_at_[hidden]>
>>>>> Subject: Re: [OMPI users] submitted job stops
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 08.04.2008 at 21:58, Danesh Daroui wrote:
>>>>>
>>>>>
>>>>>> I had posted a message about my problem and tried all the suggested
>>>>>> solutions, but the problem is still not solved. I have installed
>>>>>> Open-MPI on three machines (1 master + 2 slaves). When I submit a
>>>>>> job to the master, I can see that the "orted" daemon is launched on
>>>>>> all machines (by running "top" on each of them), but all "orted"
>>>>>> daemons terminate after a few seconds and nothing happens. At first
>>>>>> I thought the remote machines could not launch "orted", but now I
>>>>>> am sure that it runs on all machines without problems. Any
>>>>>> suggestions?
>>>>>>
>>>>>>
>>>>> The question is rather: is your MPI program running successfully, or
>>>>> is there simply no output from mpiexec/mpirun? And by "submit", do
>>>>> you mean that you use a queuing system?
>>>>>
>>>>> -- Reuti
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>>
>>>>>
>>>>
>>>> ------------------------------
>>>>
>>>> End of users Digest, Vol 863, Issue 1
>>>> *************************************
>>>>
>>>>
>>>>