Open MPI User's Mailing List Archives

From: Reuti (reuti_at_[hidden])
Date: 2007-03-13 07:50:52


On 12.03.2007 at 21:29, Ralph Castain wrote:

> On 3/12/07 2:18 PM, "Reuti" <reuti_at_[hidden]> wrote:
>
>> On 12.03.2007 at 20:36, Ralph Castain wrote:
>>
>>> ORTE propagates the signal to the application processes, but the
>>> ORTE
>>> daemons never actually look at the signal themselves (looks just
>>> like a
>>> message to them). So I'm a little puzzled by that error message
>>> about the
>>> "daemon received signal 12" - I suspect that's just a misleading
>>> message
>>> that was supposed to indicate that a daemon was given a signal to
>>> pass on.
>>>
>>> Just to clarify: the daemons are moved out of your initial process
>>> group to
>>
>> Is this still the case in SGE mode as well? It was the reason why I
>> never wrote a Howto for a Tight Integration under SGE. Instead I was
>> waiting for the final 1.2 with full SGE support.
>
> Well, that's a good question - the daemons explicitly do re-set their
> process group, so unless SGE prevents that somehow (or does
> something so
> that it doesn't separate the daemon from the console signals), then it
> should still be true.

I don't see this:

 9858 19137  9858  \_ sge_shepherd-45248 -bg
 9860  9858  9860      \_ /usr/sge/utilbin/lx24-x86/rshd -l
 9863  9860  9863          \_ /usr/sge/utilbin/lx24-x86/qrsh_starter /var/spool/sge/node44/active_jobs/45248.1/1.node44 noshell
 9864  9863  9864              \_ /home/reuti/local/openmpi-1.2rc3/bin/orted --no-daemonize --bootproxy 1 --name 0.0.1 --num_procs 5 --vpid_start 0
 9865  9864  9864                  \_ /home/reuti/mpihello

This looks just as I would expect from a Tight Integration. Nothing to
complain about.
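
(Just as a footnote: re-setting the process group boils down to a single
setpgid() call - a minimal sketch of the idea in C, not the actual orted
code:)

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Put this process into a new process group of its own, so signals
       sent to the parent's group (e.g. from the controlling terminal)
       no longer reach it directly. */
    if (setpgid(0, 0) == -1) {
        perror("setpgid");
        return 1;
    }

    printf("pid=%d pgid=%d\n", (int)getpid(), (int)getpgrp());

    /* Children forked from here on inherit the new process group. */
    return 0;
}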

>> And: this might be odd under SGE. I must admit that I haven't had
>> the time yet to play with Open MPI 1.2-beta for the Tight
>> Integration, but it sounds to me like (under Linux) the orte daemons
>> could survive although the job was already killed (by process group),
>> as the final stop/kill can't be caught and forwarded.
>>
>> I'll check this ASAP with 1.2-beta. I have only access to Linux
>> clusters.
>
> We haven't seen a problem, though that doesn't mean it can't exist.
> Mpirun
> traps stop/kill specifically for that reason, so I'm not sure why it
> wouldn't work. Let me know what you find out.

By default SIGSTOP and SIGKILL can't be trapped (unless you do some
nasty tricks, I've heard - then it might be possible again).
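
A minimal sketch in C to see this for yourself - the kernel simply
refuses to install handlers for these two signals:

#include <stdio.h>
#include <string.h>
#include <signal.h>

static void handler(int sig) { (void)sig; }

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);

    /* Installing a handler for SIGUSR2 works ... */
    if (sigaction(SIGUSR2, &sa, NULL) == -1)
        perror("sigaction(SIGUSR2)");

    /* ... but SIGKILL and SIGSTOP are rejected with EINVAL. */
    if (sigaction(SIGKILL, &sa, NULL) == -1)
        perror("sigaction(SIGKILL)");
    if (sigaction(SIGSTOP, &sa, NULL) == -1)
        perror("sigaction(SIGSTOP)");

    return 0;
}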

-- Reuti

>>
>> But now we are going beyond Mark's initial problem.
>>
>> -- Reuti
>>
>>
>>> avoid seeing any signals from your terminal. When you issue a
>>> signal, mpirun
>>> picks it up and forwards it to your application processes via the
>>> ORTE
>>> daemons - the ORTE daemons, however, do *not* look at that signal
>>> but just
>>> pass it along.
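
(For what it's worth, the relaying itself is nothing special - a minimal
sketch in C of a launcher that catches USR1/USR2 and merely passes them
on to a process it started; only an illustration, not the real mpirun
code:)

#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pid_t child = -1;

static void forward(int sig)
{
    if (child > 0)
        kill(child, sig);          /* just pass the signal on */
}

int main(void)
{
    child = fork();
    if (child == 0) {
        execlp("sleep", "sleep", "60", (char *)NULL);
        _exit(127);
    }

    /* don't act on USR1/USR2 ourselves, only relay them to the child */
    signal(SIGUSR1, forward);
    signal(SIGUSR2, forward);

    while (waitpid(child, NULL, 0) == -1 && errno == EINTR)
        ;                          /* restart the wait after a caught signal */
    return 0;
}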
>>>
>>> As for timing, all we do is pass STOP to the OpenMPI application
>>> process -
>>> it's up to the local system as to what happens when a "kill -
>>> STOP" is
>>> issued. It was always my impression that the system stopped process
>>> execution immediately under that signal, but with some allowance
>>> for the old
>>> kernel vs user space issue.
>>>
>>> Once all the processes have terminated, mpirun tells the daemons to
>>> go ahead
>>> and exit. That's the only way the daemons get terminated in this
>>> procedure.
>>>
>>> Can you tell us something about your system? Is this running under
>>> Linux,
>>> what kind of OS, how was OpenMPI configured, etc?
>>>
>>> Thanks
>>> Ralph
>>>
>>>
>>>
>>> On 3/12/07 1:26 PM, "Reuti" <reuti_at_[hidden]> wrote:
>>>
>>>> On 12.03.2007 at 19:55, Ralph Castain wrote:
>>>>
>>>>> I'll have to look into it - I suspect this is simply an erroneous
>>>>> message
>>>>> and that no daemon is actually being started.
>>>>>
>>>>> I'm not entirely sure I understand what's happening, though, in
>>>>> your code.
>>>>> Are you saying that mpirun starts some number of application
>>>>> processes which
>>>>> run merrily along, and then qsub sends out USR1/2 signals followed
>>>>> by STOP
>>>>> and then KILL in an effort to abort the job? So the application
>>>>> processes
>>>>> don't normally terminate, but instead are killed via these
>>>>> signals?
>>>>
>>>> If you specify -notify with qsub in SGE, then jobs are warned by
>>>> the sge_shepherd (the parent of the job) during execution, so that
>>>> they can perform some proper shutdown action before they are really
>>>> stopped/killed:
>>>>
>>>> for suspend: USR1 -wait-defined-time- STOP
>>>> for kill: USR2 -wait-defined-time- KILL
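
(A job that wants to use this warning time just needs a handler for the
notify signal - a minimal sketch in C, where the cleanup is only a
placeholder:)

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got_usr2 = 0;

static void on_usr2(int sig) { (void)sig; got_usr2 = 1; }

int main(void)
{
    struct sigaction sa;
    sa.sa_handler = on_usr2;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGUSR2, &sa, NULL);

    for (;;) {
        if (got_usr2) {
            /* the notify window before KILL arrives: flush/clean up here */
            fprintf(stderr, "USR2 received, cleaning up before KILL\n");
            break;
        }
        sleep(1);                  /* stands in for the real work */
    }
    return 0;
}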
>>>>
>>>> Worth noting: the signals are sent to the complete process group
>>>> of the job created by the job script and mpirun, but not to each
>>>> daemon which is created by the internal qrsh on any of the slave
>>>> nodes! That should be ORTE's duty.
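
("Sent to the complete process group" means nothing more than a kill()
with the negated process group ID, which by construction only reaches
processes on the local node - a minimal sketch in C:)

#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pgid = getpgrp();        /* the job's local process group */

    signal(SIGUSR2, SIG_IGN);      /* don't let the demo signal itself away */

    /* A negative pid delivers the signal to every member of that
       process group - but only on this node; daemons started via
       qrsh on other hosts are never reached this way. */
    if (kill(-pgid, SIGUSR2) == -1)
        perror("kill");

    return 0;
}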
>>>>
>>>> Another question: do Open MPI jobs survive a STOP for some time at
>>>> all, or will there be timing issues due to communication timeouts?
>>>>
>>>> HTH - Reuti
>>>>
>>>>
>>>>>
>>>>> Just want to ensure I understand the scenario here as that is
>>>>> something
>>>>> obviously unique to GE.
>>>>>
>>>>> Thanks
>>>>> Ralph
>>>>>
>>>>>
>>>>> On 3/12/07 9:42 AM, "Olesen, Mark" <Mark.Olesen_at_[hidden]>
>>>>> wrote:
>>>>>
>>>>>> I'm testing openmpi 1.2rc1 with GridEngine 6.0u9 and ran into
>>>>>> interesting
>>>>>> behaviour when using the qsub -notify option.
>>>>>> With -notify, USR1 and USR2 are sent X seconds before sending
>>>>>> STOP
>>>>>> and KILL
>>>>>> signals, respectively.
>>>>>>
>>>>>> When the USR2 signal is sent to the process group with the mpirun
>>>>>> process, I
>>>>>> receive an error message about not being able to start a daemon:
>>>>>>
>>>>>> mpirun: Forwarding signal 12 to job
>>>>>> [dealc12:18212] ERROR: A daemon on node dealc12 failed to start as expected.
>>>>>> [dealc12:18212] ERROR: There may be more information available from
>>>>>> [dealc12:18212] ERROR: the 'qstat -t' command on the Grid Engine tasks.
>>>>>> [dealc12:18212] ERROR: If the problem persists, please restart the
>>>>>> [dealc12:18212] ERROR: Grid Engine PE job
>>>>>> [dealc12:18212] The daemon received a signal 12.
>>>>>> [dealc12:18212] ERROR: A daemon on node dealc20 failed to start as expected.
>>>>>> [dealc12:18212] ERROR: There may be more information available from
>>>>>> [dealc12:18212] ERROR: the 'qstat -t' command on the Grid Engine tasks.
>>>>>> [dealc12:18212] ERROR: If the problem persists, please restart the
>>>>>> [dealc12:18212] ERROR: Grid Engine PE job
>>>>>> [dealc12:18212] The daemon received a signal 12.
>>>>>>
>>>>>> The job eventually stops, but the mpirun process itself continues
>>>>>> to live
>>>>>> (just the ppid changes).
>>>>>>
>>>>>> According to orte(1)/Signal Propagation, USR1 and USR2 should be
>>>>>> propagated
>>>>>> to all processes in the job (which seems to be happening), but
>>>>>> why
>>>>>> is a
>>>>>> daemon start being attempted and the mpirun not being stopped?
>>>>>>
>>>>>> /mark