
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] qsub - mpirun problem
From: Reuti (reuti_at_[hidden])
Date: 2008-09-29 18:58:19


On 30.09.2008 at 00:30, Zhiliang Hu wrote:

> At 12:10 AM 9/30/2008 +0200, you wrote:
>
>>>>>>> Can you please try this jobscript instead:
>>>>>>>
>>>>>>> #!/bin/sh
>>>>>>> set | grep PBS
>>>>>>> /path/to/mpirun /path/to/my_program
>>>>>>>
>>>>>>> Everything should be handled by Open MPI automatically. With
>>>>>>> the "set" bash command you will get a list of all defined
>>>>>>> variables for further analysis, in which you can check for
>>>>>>> the variables set by Torque.
>>>>>>>
>>>>>>> -- Reuti
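
For reference, a job that actually starts under Torque exports a set of
PBS_* variables, so "set | grep PBS" should print lines roughly like the
following (the job ID, hostname, and paths here are hypothetical):

    PBS_ENVIRONMENT=PBS_BATCH
    PBS_JOBID=799.headnode.example.com
    PBS_NODEFILE=/var/spool/torque/aux/799.headnode.example.com
    PBS_O_WORKDIR=/home/zhu
    PBS_QUEUE=default

An empty result means the script never ran inside Torque's job
environment, in which case a Torque-aware Open MPI cannot find its node
list (PBS_NODEFILE) either.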
>>>>>>
>>>>>> "set | grep PBS" part had nothing in output.
>>>>>
>>>>> Strange - did you check the .o and .e files of the job? - Reuti
>>>>
>>>> There is nothing in the .o or .e output; I had to kill the job.
>>>> The Torque server log (/var/spool/torque/server_logs) shows:
>>>>
>>>> 09/29/2008 15:52:16;0100;PBS_Server;Job;799.xxx.xxx.xxx;enqueuing into default, state 1 hop 1
>>>> 09/29/2008 15:52:16;0008;PBS_Server;Job;799.xxx.xxx.xxx;Job Queued at request of zhu_at_xxx.xxx.xxx, owner = zhu_at_xxx.xxx.xxx, job name = mpiblastn.sh, queue = default
>>>> 09/29/2008 15:52:16;0040;PBS_Server;Svr;xxx.xxx.xxx;Scheduler sent command new
>>>> 09/29/2008 15:52:16;0008;PBS_Server;Job;799.xxx.xxx.xxx;Job Modified at request of Scheduler_at_xxx.xxx.xxx
>>>> 09/29/2008 15:52:27;0008;PBS_Server;Job;799.xxx.xxx.xxx;Job deleted at request of zhu_at_xxx.xxx.xxx
>>>> 09/29/2008 15:52:27;0100;PBS_Server;Job;799.xxx.xxx.xxx;dequeuing from default, state EXITING
>>>> 09/29/2008 15:52:27;0040;PBS_Server;Svr;xxx.xxx.xxx;Scheduler sent command term
>>>> 09/29/2008 15:52:47;0001;PBS_Server;Svr;PBS_Server;is_request, bad attempt to connect from 172.16.100.1:1021 (address not trusted - check entry in server_priv/nodes)
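
The last log entry is the telling one: pbs_server rejects a connection
from 172.16.100.1 because that address does not map to a host listed in
its server_priv/nodes file. As a sketch, with hypothetical node names
and CPU counts, that file (usually /var/spool/torque/server_priv/nodes
on the server) holds one line per compute node:

    node001 np=4
    node002 np=4

Each listed name must resolve, on the server, to the address the node
actually connects from; otherwise the "address not trusted" error above
is logged.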
>>
>> As you blanked out some addresses: do the nodes and the headnode
>> have one or two network cards installed? Is each name like node001
>> et al. known on each node by the correct address, i.e. 172.16.100.1
>> = node001?
>>
>> -- Reuti
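
One way to check this is the /etc/hosts file on every machine: each node
name should map to the cluster-internal interface, identically
everywhere (the addresses here are hypothetical):

    172.16.100.1   node001
    172.16.100.2   node002

With two network cards in the headnode, a common pitfall is node001
resolving to the public interface on some hosts and to the internal one
on others.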
>
> There should be no problem in this regard -- the setup was done by
> a commercial company.

Okay, then they should solve the problem, since you paid for it.

-- Reuti