Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] qsub - mpirun problem
From: Zhiliang Hu (zhu_at_[hidden])
Date: 2008-09-29 17:30:53

At 02:15 PM 9/29/2008 -0700, you wrote:
>It sounds like you may not have set up passwordless ssh between all
>your nodes.
>Doug Reeder

That's not the case. Passwordless ssh is set up and it works fine
-- that's how I can do "mpirun -np 6 -machinefiles ......" fine.


>On Sep 29, 2008, at 2:12 PM, Zhiliang Hu wrote:
>>At 10:45 PM 9/29/2008 +0200, you wrote:
>>>Am 29.09.2008 um 22:33 schrieb Zhiliang Hu:
>>>>At 07:37 PM 9/29/2008 +0200, Reuti wrote:
>>>>>>"-l nodes=6:ppn=2" is all I have to specify the node requests:
>>>>>this might help:
>>>>Essentially, the examples given on this web page are no different
>>>>from what I did.
>>>>The only new thing is, I suppose, that "qsub -I" is for interactive mode.
>>>>When I did this:
>>>> qsub -I -l nodes=7
>>>>It hangs on "qsub: waiting for job to
>>>>>>UNIX_PROMPT> qsub -l nodes=6:ppn=2 /path/to/mpi_program
>>>>>>where "mpi_program" is a file with one line:
>>>>>>/path/to/mpirun -np 12 /path/to/my_program
>>>>>Can you please try this jobscript instead:
>>>>>set | grep PBS
>>>>>/path/to/mpirun /path/to/my_program
>>>>>All should be handled by Open MPI automatically. With the "set"
>>>>>command you will get a list of all defined variables for further
>>>>>analysis, in which you can check for the variables set by Torque.
>>>>>-- Reuti
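
[Editor's sketch] The check Reuti suggests can be written a little more explicitly. This is a minimal sketch, not anything posted in the thread: the function name report_pbs_env is hypothetical, and it assumes Torque exports its PBS_* variables (including PBS_NODEFILE) into the job environment, so it uses "env" rather than "set".

```shell
# report_pbs_env (hypothetical helper): show what Torque handed to the job.
report_pbs_env() {
    # List the exported PBS_* variables, or say so if there are none.
    if env | grep '^PBS_'; then
        :
    else
        echo "no PBS_ variables set"
    fi
    # PBS_NODEFILE, when present, holds one line per allocated slot.
    if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
        echo "slots allocated: $(wc -l < "$PBS_NODEFILE" | tr -d ' ')"
    fi
}
```

In a job script, the function would be defined and called before the "/path/to/mpirun /path/to/my_program" line. If it reports no PBS_ variables, the script is most likely not running under Torque at all.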
>>>>The "set | grep PBS" part produced no output.
>>>Strange - you checked the .o and .e files of the job? - Reuti
>>There is nothing in either the -o or the -e output. I had to kill the job.
>>I checked torque log, it shows (/var/spool/torque/server_logs):
>>09/29/2008 15:52:16;0100;PBS_Server;Job;;enqueuing
>>into default, state 1 hop 1
>>09/29/2008 15:52:16;0008;PBS_Server;Job;;Job Queued
>>at request of, owner =, job name =
>>, queue = default
>>09/29/2008 15:52:16;0040;PBS_Server;Svr;;Scheduler sent
>>command new
>>09/29/2008 15:52:16;0008;PBS_Server;Job;;Job
>>Modified at request of
>>09/29/2008 15:52:27;0008;PBS_Server;Job;;Job deleted
>>at request of
>>09/29/2008 15:52:27;0100;PBS_Server;Job;;dequeuing
>>from default, state EXITING
>>09/29/2008 15:52:27;0040;PBS_Server;Svr;;Scheduler sent
>>command term
>>09/29/2008 15:52:47;0001;PBS_Server;Svr;PBS_Server;is_request, bad
>>attempt to connect from (address not trusted -
>>check entry in server_priv/nodes)
>>where the server_priv/nodes has:
>>node001 np=4
>>node002 np=4
>>node003 np=4
>>node004 np=4
>>node005 np=4
>>node006 np=4
>>node007 np=4
>>which was set up by the vendor.
>>What is "address not trusted"?
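
[Editor's sketch] The log message itself points at server_priv/nodes, so one thing worth checking is which address each name in that file resolves to on the server host. The sketch below is an assumption-laden illustration, not a diagnosis: the helper names node_names and check_nodes are hypothetical, "getent" is assumed to be available (glibc-based Linux), and the connecting address is not visible in the archived log, so the comparison has to be done against the real log line.

```shell
# node_names FILE (hypothetical helper): print the first field of every
# line that is neither blank nor a comment.
node_names() {
    awk '!/^[ \t]*(#|$)/ { print $1 }' "$1"
}

# check_nodes FILE (hypothetical helper): show the address each node
# name resolves to on this host, or UNRESOLVED if lookup fails.
check_nodes() {
    for name in $(node_names "$1"); do
        addr=$(getent hosts "$name" | awk '{ print $1; exit }')
        echo "$name -> ${addr:-UNRESOLVED}"
    done
}
```

Assuming the nodes file lives next to the server_logs directory mentioned above, it would be run on the server as "check_nodes /var/spool/torque/server_priv/nodes". An address in the log that matches none of the printed ones would be consistent with the "address not trusted" message.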