Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] SIGTERM propagation across MPI processes
From: Júlio Hoffimann (julio.hoffimann_at_[hidden])
Date: 2012-03-25 16:02:05


That is great news! I also would have voted to remove the bindings.
Boost.MPI is the only library I have ever used for MPI in C++. It is a much
better designed, object-oriented library, not just a set of bindings. ;-)

With Boost.MPI we can send our own types through MPI messages by
serializing the objects, which is amazing.

A great post about it:
http://daveabrahams.com/2010/09/03/whats-so-cool-about-boost-mpi/
The library: http://www.boost.org/doc/libs/1_49_0/doc/html/mpi.html

Regards,
Júlio.

2012/3/25 Ralph Castain <rhc_at_[hidden]>

> I doubt anything will be done about those warnings, given that the MPI
> Forum has voted to remove the C++ bindings altogether.
>
>
> On Mar 25, 2012, at 12:36 PM, Júlio Hoffimann wrote:
>
> I don't have much time right now to try a more recent version, but I'll keep
> that in mind. I also dislike the warnings my current version is giving me (
> http://www.open-mpi.org/community/lists/devel/2011/08/9606.php). I'll see
> how to contact the Ubuntu maintainers to update Open MPI and solve both
> problems in one shot. ;-)
>
> Regards,
> Júlio.
>
> 2012/3/25 Ralph Castain <rhc_at_[hidden]>
>
>>
>> On Mar 25, 2012, at 11:28 AM, Júlio Hoffimann wrote:
>>
>> I gave the version in a previous P.S.: Open MPI 1.4.3 from the Ubuntu
>> 11.10 repositories. :-)
>>
>>
>> Sorry - I see a lot of emails over the day, and forgot. :-/
>>
>> Have you tried this on something more recent, like 1.5.4 or even the
>> developer's trunk? IIRC, there were some issues in the older 1.4 releases,
>> but they have since been fixed.
>>
>>
>> Thanks for the clarifications!
>>
>> 2012/3/25 Ralph Castain <rhc_at_[hidden]>
>>
>>>
>>> On Mar 25, 2012, at 10:57 AM, Júlio Hoffimann wrote:
>>>
>>> I forgot to mention: I tried setting odls_base_sigkill_timeout as you
>>> suggested. Even 5 s was not sufficient for the root to execute its task and,
>>> more importantly, the kill was instantaneous; there was no 5 s hang. My
>>> erroneous conclusion: SIGKILL was being sent instead of SIGTERM.
>>>
>>>
>>> Which version are you using? Could be a bug in there - I can take a look.
>>>
>>>
>>> About the man page, at least for me, the word "kill" is not clear. The
>>> SIGTERM+SIGKILL keywords would be unambiguous.
>>>
>>>
>>> I'll clarify it - thanks!
>>>
>>>
>>> Regards,
>>> Júlio.
>>>
>>> 2012/3/25 Ralph Castain <rhc_at_[hidden]>
>>>
>>>>
>>>> On Mar 25, 2012, at 7:19 AM, Júlio Hoffimann wrote:
>>>>
>>>> Dear Ralph,
>>>>
>>>> Thank you for your prompt reply. I confirmed what you just said by
>>>> reading the mpirun man page, in the sections *Signal Propagation* and
>>>> *Process Termination / Signal Handling*.
>>>>
>>>> "During the run of an MPI application, if any rank dies
>>>> abnormally (either exiting before invoking MPI_FINALIZE, or dying as the
>>>> result of a signal), mpirun will print out an error message and kill the
>>>> rest of the MPI application."
>>>>
>>>> If I understood correctly, a SIGKILL signal is sent to every process
>>>> on a premature death.
>>>>
>>>>
>>>> Each process receives a SIGTERM, and then a SIGKILL if it doesn't exit
>>>> within a specified time frame. I told you how to adjust that time period in
>>>> the prior message.
>>>>
>>>> From my point of view, this is a bug. If Open MPI allows handling
>>>> signals such as SIGTERM, the other processes in the communicator should
>>>> also have the opportunity to die gracefully. Perhaps I'm missing something?
>>>>
>>>>
>>>> Yes, you are - you do get a SIGTERM first, but you are required to exit
>>>> in a timely fashion. You are not allowed to continue running. This is
>>>> required in order to ensure proper cleanup of the job, per the MPI standard.
>>>>
>>>>
>>>> Supposing the behaviour described in the last paragraph, I think it would
>>>> be great to explicitly mention SIGKILL in the man page or, even better,
>>>> to fix the implementation to send SIGTERM instead, making it possible for
>>>> the user to clean up all processes before exit.
>>>>
>>>>
>>>> We already do, as described above.
>>>>
>>>>
>>>> I solved my particular problem by adding another flag,
>>>> *unexpected_error_on_slave*:
>>>>
>>>> volatile sig_atomic_t unexpected_error_occurred = 0;
>>>> int unexpected_error_on_slave = 0;
>>>> enum tag { work_tag, die_tag };
>>>>
>>>> void my_handler( int sig ) {
>>>>     unexpected_error_occurred = 1;
>>>> }
>>>>
>>>> //
>>>> // somewhere in the code...
>>>> //
>>>> signal(SIGTERM, my_handler);
>>>>
>>>> if (root process) {
>>>>
>>>>     // do stuff
>>>>
>>>>     world.recv(mpi::any_source, die_tag, unexpected_error_on_slave);
>>>>     if ( unexpected_error_occurred || unexpected_error_on_slave ) {
>>>>
>>>>         // save something
>>>>
>>>>         world.abort(SIGABRT);
>>>>     }
>>>> }
>>>> else { // slave process
>>>>
>>>>     // do different stuff
>>>>
>>>>     if ( unexpected_error_occurred ) {
>>>>
>>>>         // just communicate the problem to the root
>>>>         world.send(root, die_tag, 1);
>>>>         signal(SIGTERM, SIG_DFL);
>>>>         while (true)
>>>>             ; // wait, master will take care of this
>>>>     }
>>>>     world.send(root, die_tag, 0); // everything is fine
>>>> }
>>>>
>>>> signal(SIGTERM, SIG_DFL); // reassign default handler
>>>> // continues the code...
>>>>
>>>>
>>>> Note that the slave must hang so that the save operation gets executed at
>>>> the root; otherwise we are back to the previous scenario. In theory it
>>>> should be unnecessary to send MPI messages to accomplish the desired
>>>> cleanup, and in more complex applications this can turn into a nightmare.
>>>> As we know, asynchronous events are insane to debug.
>>>>
>>>> Best regards,
>>>> Júlio.
>>>>
>>>> P.S.: Open MPI 1.4.3 from the Ubuntu 11.10 repositories.
>>>>
>>>> 2012/3/23 Ralph Castain <rhc_at_[hidden]>
>>>>
>>>>> Well, yes and no. When a process abnormally terminates, OMPI will kill
>>>>> the job - this is done by first hitting each process with a SIGTERM,
>>>>> followed shortly thereafter by a SIGKILL. So you do have a short time on
>>>>> each process to attempt to clean up.
>>>>>
>>>>> My guess is that your signal handler actually is getting called, but
>>>>> we then kill the process before you can detect that it was called.
>>>>>
>>>>> You might try adjusting the time between sigterm and sigkill using
>>>>> the odls_base_sigkill_timeout MCA param:
>>>>>
>>>>> mpirun -mca odls_base_sigkill_timeout N
>>>>>
>>>>> should cause it to wait for N seconds before issuing the sigkill. Not
>>>>> sure if that will help or not - it used to work for me, but I haven't tried
>>>>> it for a while. What version of OMPI are you using?
>>>>>
>>>>>
>>>>> On Mar 22, 2012, at 4:49 PM, Júlio Hoffimann wrote:
>>>>>
>>>>> Dear all,
>>>>>
>>>>> I'm trying to handle signals inside an MPI task farming model.
>>>>> The following pseudo-code shows what I'm trying to achieve:
>>>>>
>>>>> volatile sig_atomic_t unexpected_error_occurred = 0;
>>>>>
>>>>> void my_handler( int sig ) {
>>>>>     unexpected_error_occurred = 1;
>>>>> }
>>>>>
>>>>> //
>>>>> // somewhere in the code...
>>>>> //
>>>>> signal(SIGTERM, my_handler);
>>>>>
>>>>> if (root process) {
>>>>>
>>>>>     // do stuff
>>>>>
>>>>>     if ( unexpected_error_occurred ) {
>>>>>
>>>>>         // save something
>>>>>
>>>>>         // re-raise the SIGTERM, but now with the default handler
>>>>>         signal(SIGTERM, SIG_DFL);
>>>>>         raise(SIGTERM);
>>>>>     }
>>>>> }
>>>>> else { // slave process
>>>>>
>>>>>     // do different stuff
>>>>>
>>>>>     if ( unexpected_error_occurred ) {
>>>>>
>>>>>         // just propagate the signal to the root
>>>>>         signal(SIGTERM, SIG_DFL);
>>>>>         raise(SIGTERM);
>>>>>     }
>>>>> }
>>>>>
>>>>> signal(SIGTERM, SIG_DFL); // reassign default handler
>>>>> // continues the code...
>>>>>
>>>>>
>>>>> As can be seen, the signal handling is required for implementing a
>>>>> restart feature. The whole problem lies in my assumption that all
>>>>> processes in the communicator will receive a SIGTERM as a side effect. Is
>>>>> that a valid assumption? How does the actual MPI implementation deal with
>>>>> such scenarios?
>>>>>
>>>>> I also tried replacing all the raise() calls with MPI_Abort(), which
>>>>> according to the documentation (
>>>>> http://www.open-mpi.org/doc/v1.5/man3/MPI_Abort.3.php), sends a
>>>>> SIGTERM to all associated processes. The undesired behaviour persists: when
>>>>> killing a slave process, the save section in the root branch is not
>>>>> executed.
>>>>>
>>>>> Appreciate any help,
>>>>> Júlio.
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users