Subject: Re: [OMPI users] qsub - mpirun problem
From: Zhiliang Hu (zhu_at_[hidden])
Date: 2008-09-29 17:12:00

At 10:45 PM 9/29/2008 +0200, you wrote:
>Am 29.09.2008 um 22:33 schrieb Zhiliang Hu:
>>At 07:37 PM 9/29/2008 +0200, Reuti wrote:
>>>>"-l nodes=6:ppn=2" is all I have to specify the node requests:
>>>this might help:
>>Essentially, the examples given on this web page are no different from
>>what I did.
>>The only new thing is "qsub -I", which I suppose is for interactive mode.
>>When I did this:
>> qsub -I -l nodes=7
>>It hangs on "qsub: waiting for job to
>>>>UNIX_PROMPT> qsub -l nodes=6:ppn=2 /path/to/mpi_program
>>>>where "mpi_program" is a file with one line:
>>>> /path/to/mpirun -np 12 /path/to/my_program
>>>Can you please try this jobscript instead:
>>>set | grep PBS
>>>/path/to/mpirun /path/to/my_program
>>>All should be handled by Open MPI automatically. With the "set" bash
>>>command you will get a list with all defined variables for further
>>>analysis; and where you can check for the variables set by Torque.
>>>-- Reuti
>>The "set | grep PBS" part produced no output.
>Strange - did you check the .o and .e files of the job? - Reuti

There is nothing in either the -o or the -e output. I had to kill the job.
I checked the Torque server log (in /var/spool/torque/server_logs), and it shows:

09/29/2008 15:52:16;0100;PBS_Server;Job;;enqueuing into default, state 1 hop 1
09/29/2008 15:52:16;0008;PBS_Server;Job;;Job Queued at request of, owner =, job name =, queue = default
09/29/2008 15:52:16;0040;PBS_Server;Svr;;Scheduler sent command new
09/29/2008 15:52:16;0008;PBS_Server;Job;;Job Modified at request of
09/29/2008 15:52:27;0008;PBS_Server;Job;;Job deleted at request of
09/29/2008 15:52:27;0100;PBS_Server;Job;;dequeuing from default, state EXITING
09/29/2008 15:52:27;0040;PBS_Server;Svr;;Scheduler sent command term
09/29/2008 15:52:47;0001;PBS_Server;Svr;PBS_Server;is_request, bad attempt to connect from (address not trusted - check entry in server_priv/nodes)
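
The entries above appear to be semicolon-separated (timestamp, event code, server name, object type, object id, message). As a rough sketch under that assumption, the server-level ("Svr") entries, which include the "address not trusted" line, could be pulled out like this. The function name svr_entries is made up for illustration and is not part of Torque:

```shell
#!/bin/sh
# Sketch only: print "timestamp | message" for server-level entries.
# Assumed field layout: timestamp;event code;server;object type;object id;message
# svr_entries is a hypothetical helper name, not a Torque command.
svr_entries() {
    awk -F';' '
        $4 == "Svr" {
            # Rejoin the message in case it contains semicolons itself.
            msg = $6
            for (i = 7; i <= NF; i++) msg = msg ";" $i
            print $1 " | " msg
        }
    ' "$@"
}
```

It reads the files given as arguments, or standard input if there are none, for example: svr_entries /var/spool/torque/server_logs/*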

where server_priv/nodes contains:
node001 np=4
node002 np=4
node003 np=4
node004 np=4
node005 np=4
node006 np=4
node007 np=4

which was set up by the vendor.
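
Taking the log hint ("check entry in server_priv/nodes") at face value, one possible cross-check is to see what each name in that file resolves to on the server host and compare it with the address reported in the log. This is only a sketch: node_names and check_nodes are hypothetical helper names, getent is assumed to be available (as on most Linux systems), and it is an assumption that a name/address mismatch is what triggers the message:

```shell
#!/bin/sh
# Sketch only. node_names (hypothetical helper) prints the first field of
# every non-blank, non-comment line of a Torque nodes file.
node_names() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# check_nodes (hypothetical helper) reports what each listed name resolves
# to on this host; assumes the getent command is installed.
check_nodes() {
    for n in $(node_names "$1"); do
        if addr=$(getent hosts "$n"); then
            echo "$n -> $addr"
        else
            echo "$n -> does not resolve"
        fi
    done
}
```

Usage, assuming the file lives under the same spool directory as the logs (reading it may require root): check_nodes /var/spool/torque/server_priv/nodes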

What does "address not trusted" mean?