Open MPI User's Mailing List Archives

From: Reuti (reuti_at_[hidden])
Date: 2007-07-23 17:39:32


When running over conventional TCP/IP, everything is safe AFAICS: all
processes are killed on all involved nodes. The problem arises with
OFED, where we see the same behavior using MVAPICH.

Unfortunately we have only a limited number of nodes with InfiniBand,
so the time available to test and develop something is very limited,
as the users running applications there take priority.

On 23.07.2007 at 21:29, Pak Lui wrote:

> Hi Henk,
> SLIM H.A. wrote:
>> Dear Pak Lui
>> I can delete the (sge) job with qdel -f such that it disappears from
>> the job list, but the application processes keep running, including
>> the shepherds. I have to kill them with -15.
>> For some reason the kill -15 does not reach mpirun. (We use such a
>> parameter to mpirun on our myrinet mx nodes with mpich, that's why I
>> asked).
> I believe qdel would send a SIGKILL to mpirun

Correct, it's sent to the complete process group which qrsh-starter
spawns, i.e. "kill -9 -- -processgroup_id".
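
As a minimal sketch of what such a process-group kill looks like from
the shell: the sleep commands below are only stand-ins for mpirun and
its children, and using setsid (util-linux) to get a separate process
group is my assumption for the demo, not what SGE does internally. The
"kill -s KILL" spelling is the portable equivalent of "kill -9".

```shell
#!/bin/sh
# Demo only: start a small process tree in its own process group.
pidfile=$(mktemp)
setsid sh -c 'echo $$ > "$1"; sleep 300 & sleep 300 & wait' sh "$pidfile" &
sleep 1

# The inner sh is the group leader, so its PID is the process group id.
pgid=$(cat "$pidfile")

# Count live (non-zombie) members of a process group.
count_members() {
    ps -e -o pgid=,stat= | awk -v g="$1" '$1 == g && $2 !~ /^Z/' | wc -l
}

members_before=$(count_members "$pgid")

# A negative PID addresses the whole group; "--" ends option parsing.
kill -s KILL -- "-$pgid"
wait
sleep 1

members_after=$(count_members "$pgid")
echo "group $pgid: $members_before members before, $members_after after"
rm -f "$pidfile"
```

Note that this only reaches processes which are still in that group;
anything that moved itself to another process group or session is not
touched by the signal.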

> instead of a SIGTERM
> (-15), which is why you don't see the signal reach mpirun. Since there
> is no way to catch a SIGKILL, that may be why the orted and the
> processes keep running.

In a Tightly Integrated parallel environment, there shouldn't be any
need to catch such a signal. SGE will kill all started processes on
its own - no further action necessary.
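
To illustrate the difference between the two signals discussed above,
here is a small sketch. The worker function is a hypothetical stand-in
for a launcher like mpirun, and the file names are made up for the
demo. It cleans up its child when it gets SIGTERM, but it gets no
chance to do so on SIGKILL, so the child is left behind.

```shell
#!/bin/sh
log=$(mktemp)    # records whether the cleanup handler ran
pidf=$(mktemp)   # records the PID of the worker's child

# Hypothetical launcher: starts one child and kills it on SIGTERM.
worker() {
    trap 'echo "cleanup ran" >> "$log"
          kill "$child" 2>/dev/null; wait "$child" 2>/dev/null
          exit 0' TERM
    sleep 60 >/dev/null 2>&1 &
    child=$!
    echo "$child" > "$pidf"
    wait "$child"
}

# Case 1: SIGTERM can be trapped, so the handler runs.
worker &
w=$!
sleep 1
kill -s TERM "$w"
wait "$w"
term_cleanups=$(grep -c 'cleanup ran' "$log")

# Case 2: SIGKILL cannot be trapped, so the handler never runs.
: > "$log"
worker &
w=$!
sleep 1
kill -s KILL "$w"
wait "$w" 2>/dev/null
kill_cleanups=$(grep -c 'cleanup ran' "$log")

# The child of the killed worker is still there.
orphan=$(cat "$pidf")
if kill -0 "$orphan" 2>/dev/null; then orphan_alive=yes; else orphan_alive=no; fi

kill "$orphan" 2>/dev/null   # tidy up the leftover child
rm -f "$log" "$pidf"
echo "TERM cleanups: $term_cleanups, KILL cleanups: $kill_cleanups, orphan alive: $orphan_alive"
```

This is why killing the whole process group, as SGE does in a tight
integration, matters: the launcher itself cannot be relied on to take
its children down when it receives SIGKILL.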

> Hmm, this actually reminds me of a related problem: the qsub -notify
> option does not work as intended under ORTE. The -notify option is
> supposed to send a SIGUSR2 to mpirun and the processes as a warning
> N seconds before an impending SIGKILL actually happens. However, we
> don't catch the SIGUSR2 signal in ORTE specifically for SGE (or the
> gridengine modules), so the user would see mpirun and orted exit
> before the user apps can catch the SIGUSR2 signal. I should file a
> trac bug against this SGE feature, which we don't yet support, and
> fix it sometime in the future.
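
For reference, a rough sketch of how a job script itself can react to
the -notify warning. The on_notify function and the notified variable
are hypothetical names for the demo, and the kill line merely simulates
the SIGUSR2 that SGE would send, so that the script can be tried
outside of a job.

```shell
#!/bin/sh
#$ -notify
# With -notify, SGE sends SIGUSR2 some time before the final SIGKILL.

notified=no

# Hypothetical hook: a real job would save results or remove scratch
# files here before the SIGKILL arrives.
on_notify() {
    notified=yes
}
trap on_notify USR2

# Stand-in for SGE's warning signal (demo only).
kill -s USR2 $$

echo "notified=$notified"
```

The shell runs the trap once the current command has finished, so a
long-running foreground command delays the handler; starting the
command in the background and using wait avoids that.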

As SIGUSR2 is sent to the complete process group (and keep in mind:
also to the job script itself), it would just mean ignoring
SIGUSR1/2 in orted (and maybe in mpirun, otherwise it must also be
trapped there). So this could be included in the actions tied to the
--no-daemonize option given to orted when running under SGE. For now
you would also need this in the job script:

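# Ignore SIGUSR2 in the job script itself.
trap '' USR2
export PATH=/home/reuti/openmpi-1.2.3/bin:$PATH
# Ignore it in the subshell too, so mpirun inherits the ignore via exec.
(trap '' USR2; exec mpirun -np $NSLOTS /home/reuti/mpihello)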
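
The reason this works is that an ignored signal stays ignored across
exec. A small sketch to check this on your own machine, where the
inner "sh -c" is only a stand-in for mpirun and the variable names are
made up for the demo:

```shell
#!/bin/sh
out=$(mktemp)

# USR2 ignored before exec: the exec'd program inherits the ignore,
# survives the signal and reaches the echo.
( trap '' USR2; exec sh -c 'kill -s USR2 $$; echo survived' ) > "$out" 2>/dev/null
with_ignore=$(cat "$out")

# No ignore: the default action of USR2 terminates the program
# before it can print anything.
( exec sh -c 'kill -s USR2 $$; echo survived' ) > "$out" 2>/dev/null
without_ignore=$(cat "$out")

rm -f "$out"
echo "with ignore: '$with_ignore', without: '$without_ignore'"
```

A program that installs its own SIGUSR2 handler after the exec
overrides the inherited ignore, so this only helps for processes which
leave the signal alone.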

-- Reuti

> So back to your problem. Although this is unintended, maybe you can
> try running the job with qsub -notify in the meantime, until we make
> the change above. It will send a SIGUSR2 to mpirun, which should
> terminate mpirun, orted and the user processes more gracefully than
> qdel (or SIGKILL) does, because SIGKILL does not allow orted to kill
> off the user processes, as SIGTERM or SIGUSR1/2 would.
>> Just to confirm, there is no configure directive specific to
>> gridengine when building openmpi?
> Right, there aren't any configure directives currently.
>> Thanks
>> henk
>>> -----Original Message-----
>>> From: users-bounces_at_[hidden]
>>> [mailto:users-bounces_at_[hidden]] On Behalf Of Pak Lui
>>> Sent: 23 July 2007 15:16
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] sge qdel fails
>>> Hi Henk,
>>> The sge script should not require any extra parameter. The
>>> qdel command should send the kill signal to mpirun and also
>>> remove the SGE allocated tmp directory (in something like
>>> /tmp/174.1.all.q/) which contains the OMPI session dir for
>>> the running job, which in turn would cause orted and the user
>>> processes to exit.
>>> Maybe you could try qdel -f <jid> to force delete from the
>>> sge_qmaster, in case sge_execd does not respond to the
>>> delete request by the sge_qmaster?
>>> SLIM H.A. wrote:
>>>> I am using OpenMPI 1.2.3 with SGE 6.0u7 over InfiniBand (OFED 1.2),
>>>> following the recommendation in the OpenMPI FAQ.
>>>> The job runs but when the user wants to delete the job with the
>>>> qdel command, this fails. Does the mpirun command
>>>> mpirun -np $NSLOTS ./exe
>>>> in the sge script require extra parameters?
>>>> Thanks for any advice
>>>> Henk
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>> --
>>> - Pak Lui
>>> pak.lui_at_[hidden]
> --
> - Pak Lui
> pak.lui_at_[hidden]