Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] submitted job stops
From: Danesh Daroui (Danesh.D_at_[hidden])
Date: 2008-04-09 16:17:59


Mark Kosmowski wrote:
> Danesh:
>
> Have you tried "mpirun -np 4 --hostfile hosts hostname" to verify that
> ompi is working?
>

When I run "mpirun -np 4 --hostfile hosts hostname", the same thing
happens: it just hangs. Could that be a clue?

> Can you remote access from each node to each other node?
>
Yes, all nodes can reach each other via SSH and can log in without
being prompted for a password.
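
A passwordless-login check like the one described above can be scripted. The following is only a minimal sketch, assuming a hostfile with one host name per line (optionally followed by fields such as "slots=2") and an "ssh" client on the PATH. The function names parse_hostfile, ssh_check_command, and check_hosts are hypothetical helpers for illustration and are not part of Open MPI.

```python
import subprocess
from typing import Callable, Dict, List


def parse_hostfile(text: str) -> List[str]:
    """Return host names from hostfile text, skipping comments and blank lines."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line:
            hosts.append(line.split()[0])  # keep only the host name field
    return hosts


def ssh_check_command(host: str) -> List[str]:
    """Build an ssh command that fails instead of prompting for a password."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def check_hosts(hosts: List[str], runner: Callable = subprocess.run) -> Dict[str, bool]:
    """Map each host to True if a non-interactive ssh login succeeded."""
    results = {}
    for host in hosts:
        try:
            proc = runner(ssh_check_command(host), capture_output=True, timeout=15)
            results[host] = proc.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            # Timeout, or ssh could not be started at all
            results[host] = False
    return results
```

To use it, read the hostfile, pass its text to parse_hostfile, and hand the result to check_hosts. Run it on every node, not only on the master, because it tests logins only from the machine where it is executed. It verifies SSH login and nothing else.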

> If any node has more than 1 network device, are you using the ompi
> options to specify which device to use?
>

Each node has a single network interface, which works properly.

Regards,

Danesh

> Good luck,
>
> Mark
>
>
>> Message: 5
>> Date: Wed, 9 Apr 2008 14:15:34 +0200 (CEST)
>> From: "danesh.d_at_[hidden]" <danesh.d_at_[hidden]>
>> Subject: [OMPI users] Ang: Re: submitted job stops
>> To: <users_at_[hidden]>
>> Message-ID:
>> <24351656.56761207743334738.JavaMail.defaultUser_at_defaultHost>
>> Content-Type: text/plain;charset="ISO-8859-15"
>>
>>
>> Actually, my program is a very simple MPI "Hello World" program that
>> just prints the rank of each process and then terminates. When I run
>> it on a single-processor machine with, e.g., 4 processes
>> (oversubscribing), it shows:
>>
>> Hello world from processor with rank 0
>> Hello world from processor with rank 3
>> Hello world from processor with rank 1
>> Hello world from processor with rank 2
>>
>> but when I use my remote machines, everything just stops when
>> I run the program.
>>
>> No, I do not use any queuing system. I simply run it like this:
>>
>> mpirun -np 4 --hostfile hosts ./hw
>>
>> and then it just stops until I terminate it manually. As I said,
>> I monitored all machines (master + 2 slaves) and found that the
>> "orted" daemon starts on every machine when I run the program, but
>> the daemon terminates after a few seconds. What could be the reason?
>>
>> Thanks,
>>
>> Danesh
>>
>>
>>
>>
>>> ----Original message----
>>> From: reuti_at_[hidden]
>>> Date: 09-04-2008 13:26
>>> To: "Open MPI Users"<users_at_[hidden]>
>>> Subject: Re: [OMPI users] submitted job stops
>>>
>>> Hi,
>>>
>>> On 08.04.2008 at 21:58, Danesh Daroui wrote:
>>>
>>>> I had posted a message about my problem and tried all the suggested
>>>> solutions, but the problem is still not solved. I have installed
>>>> Open MPI on three machines (1 master + 2 slaves). When I submit a job
>>>> to the master, I can see (by running "top" on all machines) that the
>>>> "orted" daemon is launched on every machine, but all "orted" daemons
>>>> terminate after a few seconds and nothing happens. At first I thought
>>>> the remote machines could not launch "orted", but now I am sure that
>>>> it can be run on all machines without a problem. Any suggestions?
>>>>
>>> The question is rather: is your MPI program running successfully, or
>>> is there simply no output from mpiexec/mpirun? And by "submit", do you
>>> mean that you use a queuing system?
>>>
>>> -- Reuti
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>
>>
>>
>> ------------------------------
>>
>>
>> End of users Digest, Vol 863, Issue 1
>> *************************************
>>
>>