
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] qsub - mpirun problem
From: Zhiliang Hu (zhu_at_[hidden])
Date: 2008-09-29 18:30:18


At 12:10 AM 9/30/2008 +0200, you wrote:

>>>>>>Can you please try this jobscript instead:
>>>>>>
>>>>>>#!/bin/sh
>>>>>>set | grep PBS
>>>>>>/path/to/mpirun /path/to/my_program
>>>>>>
>>>>>>All should be handled by Open MPI automatically. With the "set"
>>>>>>bash
>>>>>>command you will get a list with all defined variables for further
>>>>>>analysis; and where you can check for the variables set by Torque.
>>>>>>
>>>>>>-- Reuti
>>>>>
>>>>>"set | grep PBS" part had nothing in output.
>>>>
>>>>Strange - you checked the .o and .e files of the job? - Reuti
>>>
>>>There is nothing in -o nor -e output. I had to kill the job.
>>>I checked torque log, it shows (/var/spool/torque/server_logs):
>>>
>>>09/29/2008 15:52:16;0100;PBS_Server;Job;799.xxx.xxx.xxx;enqueuing
>>>into default, state 1 hop 1
>>>09/29/2008 15:52:16;0008;PBS_Server;Job;799.xxx.xxx.xxx;Job Queued
>>>at request of zhu_at_xxx.xxx.xxx, owner = zhu_at_xxx.xxx.xxx, job name =
>>>mpiblastn.sh, queue = default
>>>09/29/2008 15:52:16;0040;PBS_Server;Svr;xxx.xxx.xxx;Scheduler sent
>>>command new
>>>09/29/2008 15:52:16;0008;PBS_Server;Job;799.xxx.xxx.xxx;Job
>>>Modified at request of Scheduler_at_xxx.xxx.xxx
>>>09/29/2008 15:52:27;0008;PBS_Server;Job;799.xxx.xxx.xxx;Job
>>>deleted at request of zhu_at_xxx.xxx.xxx
>>>09/29/2008 15:52:27;0100;PBS_Server;Job;799.xxx.xxx.xxx;dequeuing
>>>from default, state EXITING
>>>09/29/2008 15:52:27;0040;PBS_Server;Svr;xxx.xxx.xxx;Scheduler sent
>>>command term
>>>09/29/2008 15:52:47;0001;PBS_Server;Svr;PBS_Server;is_request, bad
>>>attempt to connect from 172.16.100.1:1021 (address not trusted -
>>>check entry in server_priv/nodes)
>
>As you blanked out some addresses: do the nodes and the headnode have
>one or two network cards installed? Are all the names like node001 et
>al. known on each node by the correct address, i.e. 172.16.100.1 = node001?
>
>-- Reuti

There should be no problem in this regard -- the setup was done by a
commercial company. I can ssh from any node to any node (passwordless).

Zhiliang
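
[The "address not trusted" message in the server log above usually means
the connecting host's address does not match any entry in the Torque
server's node list. A minimal sketch of the checks Reuti is asking about,
assuming a hypothetical node name `node001` and the address
`172.16.100.1` taken from the log:]

```shell
# On the head node: each compute node must appear in the nodes file
# under the name that resolves to the address it connects from
# (the entries below are hypothetical).
cat /var/spool/torque/server_priv/nodes
# node001 np=4
# node002 np=4

# Verify the name/address mapping in both directions; forward and
# reverse lookup should agree for the suspect address:
getent hosts node001        # forward: name -> address
getent hosts 172.16.100.1   # reverse: address -> name

# After correcting server_priv/nodes, restart the server so it
# re-reads the file:
qterm -t quick && pbs_server
```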