Open MPI User's Mailing List Archives

From: Ralph Castain (rhc_at_[hidden])
Date: 2007-03-12 16:29:01


On 3/12/07 2:18 PM, "Reuti" <reuti_at_[hidden]> wrote:

> On 12.03.2007 at 20:36, Ralph Castain wrote:
>
>> ORTE propagates the signal to the application processes, but the ORTE
>> daemons never actually look at the signal themselves (looks just like a
>> message to them). So I'm a little puzzled by that error message about the
>> "daemon received signal 12" - I suspect that's just a misleading message
>> that was supposed to indicate that a daemon was given a signal to pass on.
>>
>> Just to clarify: the daemons are moved out of your initial process
>> group to
>
> Is this still the case also in SGE mode? It was the reason why I
> never wrote a Howto for a Tight Integration under SGE. Instead I
> looked forward to the final 1.2 with full SGE support.

Well, that's a good question - the daemons explicitly do re-set their
process group, so unless SGE prevents that somehow (or does something so
that it doesn't separate the daemon from the console signals), then it
should still be true.
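
As an illustration of what "re-setting the process group" means on a POSIX system, here is a minimal Python sketch. It is not ORTE code; the helper name spawn_in_own_group and the sleeping child are made up for the example. A child that becomes the leader of its own process group no longer receives signals that are sent to the parent's process group.

```python
import os
import subprocess
import sys


def spawn_in_own_group():
    """Start a sleeping child that leaves the parent's process group.

    Hypothetical helper for illustration only (POSIX systems).
    """
    # os.setpgrp() runs in the child just before exec and makes the child
    # the leader of a new process group. A signal sent to the parent's
    # group (for example by a batch system) then no longer reaches it.
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        preexec_fn=os.setpgrp,
    )
```

Comparing os.getpgrp() in the parent with os.getpgid(child.pid) shows the two group IDs differ. Whether SGE's tight integration changes this for the real daemons is exactly the open question here.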

>
> And: this might be odd under SGE. I must admit that I haven't yet
> had the time to play with OpenMPI 1.2-beta for the Tight
> Integration, but it sounds to me like (under Linux) the orte-daemons
> could survive although the job was already killed (by process group),
> as the final stop/kill can't be caught and forwarded.
>
> I'll check this ASAP with 1.2-beta. I have only access to Linux
> clusters.

We haven't seen a problem, though that doesn't mean it can't exist. Mpirun
traps stop/kill specifically for that reason, so I'm not sure why it
wouldn't work. Let me know what you find out.
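
On the question of which signals can be caught at all: on POSIX systems a process cannot install a handler for SIGKILL or SIGSTOP, while SIGUSR1, SIGUSR2 and the terminal stop signal SIGTSTP can be handled. A small Python sketch to check this on a given machine follows. The helper name can_install_handler is made up for illustration and has nothing to do with the Open MPI sources.

```python
import signal


def can_install_handler(signum):
    """Return True if this process may install a handler for signum.

    Must be called from the main thread (a Python restriction).
    """
    try:
        previous = signal.signal(signum, lambda s, f: None)
    except OSError:
        # The kernel refuses handlers for SIGKILL and SIGSTOP.
        return False
    if previous is not None:
        # Put the original disposition back so the probe has no side effect.
        signal.signal(signum, previous)
    return True
```

Calling it for SIGUSR1, SIGUSR2, SIGTSTP, SIGSTOP and SIGKILL shows which of the signals in this thread a launcher could intercept and forward.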

>
> But now we are going beyond Mark's initial problem.
>
> -- Reuti
>
>
>> avoid seeing any signals from your terminal. When you issue a signal,
>> mpirun picks it up and forwards it to your application processes via the
>> ORTE daemons - the ORTE daemons, however, do *not* look at that signal but
>> just pass it along.
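
The forwarding idea described above can be sketched in a few lines of Python. This is only a local, single-host analogy under stated assumptions: the names launch_children and install_forwarder and the CHILD_CODE stand-in application are invented for the example, and the real mpirun relays signals through the ORTE daemons rather than calling kill on child PIDs directly.

```python
import os
import signal
import subprocess
import sys

# Stand-in "application process": exits with status 42 when it gets SIGUSR2.
CHILD_CODE = (
    "import signal, sys\n"
    "signal.signal(signal.SIGUSR2, lambda s, f: sys.exit(42))\n"
    "print('ready', flush=True)\n"
    "signal.pause()\n"
)


def launch_children(count):
    """Start `count` stand-in application processes (hypothetical helper)."""
    children = []
    for _ in range(count):
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE], stdout=subprocess.PIPE
        )
        child.stdout.readline()  # wait until the child installed its handler
        children.append(child)
    return children


def install_forwarder(children, signums=(signal.SIGUSR1, signal.SIGUSR2)):
    """Make this process pass the given signals on to all live children."""
    def forward(signum, frame):
        for child in children:
            if child.poll() is None:  # skip children that already exited
                os.kill(child.pid, signum)

    for signum in signums:
        signal.signal(signum, forward)
```

The launcher itself does not act on the signal; it only relays it, which mirrors the "do not look at that signal but just pass it along" behaviour described for the daemons.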
>>
>> As for timing, all we do is pass STOP to the OpenMPI application process -
>> it's up to the local system as to what happens when a "kill -STOP" is
>> issued. It was always my impression that the system stopped process
>> execution immediately under that signal, but with some allowance for the
>> old kernel vs user space issue.
>>
>> Once all the processes have terminated, mpirun tells the daemons to go
>> ahead and exit. That's the only way the daemons get terminated in this
>> procedure.
>>
>> Can you tell us something about your system? Is this running under Linux,
>> what kind of OS, how was OpenMPI configured, etc?
>>
>> Thanks
>> Ralph
>>
>>
>>
>> On 3/12/07 1:26 PM, "Reuti" <reuti_at_[hidden]> wrote:
>>
>>> On 12.03.2007 at 19:55, Ralph Castain wrote:
>>>
>>>> I'll have to look into it - I suspect this is simply an erroneous message
>>>> and that no daemon is actually being started.
>>>>
>>>> I'm not entirely sure I understand what's happening, though, in your code.
>>>> Are you saying that mpirun starts some number of application processes
>>>> which run merrily along, and then qsub sends out USR1/2 signals followed
>>>> by STOP and then KILL in an effort to abort the job? So the application
>>>> processes don't normally terminate, but instead are killed via these
>>>> signals?
>>>
>>> If you specify -notify in SGE with the qsub, then jobs are warned by
>>> the sge_shepherd (parent of the job) during execution, so that they
>>> can perform some proper shutdown action before they are really
>>> stopped/killed:
>>>
>>> for suspend: USR1 -wait-defined-time- STOP
>>> for kill: USR2 -wait-defined-time- KILL
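
From the application's side, reacting to the two warning signals listed above might look like the following Python sketch. The names NotifyState and install_notify_handlers are hypothetical, and what a job does once a flag is set (checkpointing, closing files, and so on) is left open. The handlers only record that a warning arrived, so the main loop can act before the STOP or KILL that follows.

```python
import os
import signal


class NotifyState:
    """Flags set by the warning signals (hypothetical helper class)."""

    def __init__(self):
        self.suspend_pending = False  # USR1 seen: STOP will follow
        self.kill_pending = False     # USR2 seen: KILL will follow


def install_notify_handlers(state):
    """Record SIGUSR1/SIGUSR2 warnings in `state` instead of dying."""
    def on_usr1(signum, frame):
        state.suspend_pending = True

    def on_usr2(signum, frame):
        state.kill_pending = True

    # Without handlers the default action for both signals is to
    # terminate the process.
    signal.signal(signal.SIGUSR1, on_usr1)
    signal.signal(signal.SIGUSR2, on_usr2)
```

A job's main loop would check state.kill_pending between work units and shut down cleanly within the configured notify time.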
>>>
>>> Worth noting: the signals are sent to the complete process group
>>> of the job created by the jobscript and mpirun, but not to each
>>> daemon which is created by the internal qrsh on any of the slave
>>> nodes! This should be ORTE's duty.
>>>
>>> Another question is: do OpenMPI jobs survive a STOP for any length
>>> of time at all, or will there be timing issues due to communication
>>> timeouts?
>>>
>>> HTH - Reuti
>>>
>>>
>>>>
>>>> Just want to ensure I understand the scenario here as that is something
>>>> obviously unique to GE.
>>>>
>>>> Thanks
>>>> Ralph
>>>>
>>>>
>>>> On 3/12/07 9:42 AM, "Olesen, Mark" <Mark.Olesen_at_[hidden]> wrote:
>>>>
>>>>> I'm testing openmpi 1.2rc1 with GridEngine 6.0u9 and ran into interesting
>>>>> behaviour when using the qsub -notify option. With -notify, USR1 and USR2
>>>>> are sent X seconds before sending STOP and KILL signals, respectively.
>>>>>
>>>>> When the USR2 signal is sent to the process group with the mpirun process,
>>>>> I receive an error message about not being able to start a daemon:
>>>>>
>>>>> mpirun: Forwarding signal 12 to job[dealc12:18212] ERROR: A daemon on node dealc12 failed to start as expected.
>>>>> [dealc12:18212] ERROR: There may be more information available from
>>>>> [dealc12:18212] ERROR: the 'qstat -t' command on the Grid Engine tasks.
>>>>> [dealc12:18212] ERROR: If the problem persists, please restart the
>>>>> [dealc12:18212] ERROR: Grid Engine PE job
>>>>> [dealc12:18212] The daemon received a signal 12.
>>>>> [dealc12:18212] ERROR: A daemon on node dealc20 failed to start as expected.
>>>>> [dealc12:18212] ERROR: There may be more information available from
>>>>> [dealc12:18212] ERROR: the 'qstat -t' command on the Grid Engine tasks.
>>>>> [dealc12:18212] ERROR: If the problem persists, please restart the
>>>>> [dealc12:18212] ERROR: Grid Engine PE job
>>>>> [dealc12:18212] The daemon received a signal 12.
>>>>>
>>>>> The job eventually stops, but the mpirun process itself continues to live
>>>>> (just the ppid changes).
>>>>>
>>>>> According to orte(1)/Signal Propagation, USR1 and USR2 should be propagated
>>>>> to all processes in the job (which seems to be happening), but why is a
>>>>> daemon start being attempted and the mpirun not being stopped?
>>>>>
>>>>> /mark
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users