From: Ethan Mallove (ethan.mallove_at_[hidden])
Date: 2006-10-31 09:39:49


I've run with these changes and they seem to work. (I did
need to rename the INI param "module" to "specify_module"
to match the previous commit.) Just one question; see below.

On Sun, Oct/29/2006 08:36:04AM, jsquyres_at_[hidden] wrote:
> Author: jsquyres
> Date: 2006-10-29 08:35:58 EST (Sun, 29 Oct 2006)
> New Revision: 403
>
> Modified:
> trunk/CHANGES
> trunk/lib/MTT/DoCommand.pm
> trunk/lib/MTT/Globals.pm
> trunk/samples/ompi-core-template.ini
>
> Log:
> * Add textwrap to Global defaults
> * Add new global: drain_timeout
> * In DoCommand, after the timeout, we'll wait drain_timeout more
> seconds to get any final output and then unconditionally move on.
> * Add some Verbose statements to catch when kill() does not seem to
> be working. Have not nailed this down yet; want to see some output
> from when it occurs.
>
>
> Modified: trunk/CHANGES
> ==============================================================================
> --- trunk/CHANGES (original)
> +++ trunk/CHANGES 2006-10-29 08:35:58 EST (Sun, 29 Oct 2006)
> @@ -1,2 +1,5 @@
> To announce to OMPI core testers:
>
> +- added new fields to MTT section to ini file
> + - textwrap
> + - drain_timeout
>
> Modified: trunk/lib/MTT/DoCommand.pm
> ==============================================================================
> --- trunk/lib/MTT/DoCommand.pm (original)
> +++ trunk/lib/MTT/DoCommand.pm 2006-10-29 08:35:58 EST (Sun, 29 Oct 2006)
> @@ -32,6 +32,7 @@
> if ($kid != 0) {
> return $?;
> }
> + Verbose("** Kill TERM didn't work!\n");
>
> # Nope, that didn't work. Sleep a few seconds and try again.
> sleep(2);
> @@ -39,6 +40,7 @@
> if ($kid != 0) {
> return $?;
> }
> + Verbose("** Kill TERM (more waiting) didn't work!\n");
>
> # That didn't work either. Try SIGINT;
> kill("INT", $pid);
> @@ -46,6 +48,7 @@
> if ($kid != 0) {
> return $?;
> }
> + Verbose("** Kill INT didn't work!\n");
>
> # Nope, that didn't work. Sleep a few seconds and try again.
> sleep(2);
> @@ -53,6 +56,7 @@
> if ($kid != 0) {
> return $?;
> }
> + Verbose("** Kill INT (more waiting) didn't work!\n");
>
> # Ok, now we're mad. Be violent.
> while (1) {
> @@ -61,13 +65,7 @@
> if ($kid != 0) {
> return $?;
> }
> - sleep(1);
> -
> - kill("KILL", $pid);
> - $kid = waitpid($pid, WNOHANG);
> - if ($kid != 0) {
> - return $?;
> - }
> + Verbose("** Kill KILL didn't work!\n");
> sleep(1);
> }
> }
> @@ -278,7 +276,7 @@
> if (defined($end_time) && time() > $end_time) {
> my $over = time() - $end_time;
> if ($over > $last_over) {
> - Debug("*** Past timeout by $over seconds\n");
> + Verbose("*** Past timeout by $over seconds\n");
> my $st = _kill_proc($pid);
> if (!defined($killed_status)) {
> $killed_status = $st;
> @@ -286,6 +284,12 @@
> $ret->{timed_out} = 1;
> }
> $last_over = $over;
> +
> + # See if we're over the drain_timeout
> + if ($over > $MTT::Globals::Values->{drain_timeout}) {
> + Verbose("*** Past drain timeout; quitting\n");
> + $done = 0;
> + }

I would have thought that if we're "quitting" here, this
should be $done = 1.
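
To make the question concrete, here is a minimal sketch of the
behavior I'd expect. It is not the real DoCommand.pm loop: the
function name run_with_drain, the simulated clock, and the
max_ticks safety stop are all made up for illustration, and it
assumes $done is a boolean "stop looping" flag. If $done is
really a count of things still pending, then setting it to 0
would be the right way to quit and my question is moot.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a command's wait loop (NOT the MTT code).
# $timeout       - seconds before the command is considered timed out
# $drain_timeout - extra seconds allowed for draining output
# $max_ticks     - safety stop so this sketch can never spin forever
sub run_with_drain {
    my ($timeout, $drain_timeout, $max_ticks) = @_;
    my $ret  = { timed_out => 0, drained => 0, ticks => 0 };
    my $done = 0;    # ASSUMPTION: boolean flag, true means "stop"
    my $tick = 0;    # simulated clock, one tick per "second"

    while (!$done) {
        ++$tick;
        $ret->{ticks} = $tick;

        if ($tick > $timeout) {
            my $over = $tick - $timeout;
            $ret->{timed_out} = 1;

            # Past the drain timeout: quit unconditionally.
            if ($over > $drain_timeout) {
                $ret->{drained} = 1;
                $done = 1;
            }
        }

        # Safety stop for the sketch only.
        $done = 1 if $tick >= $max_ticks;
    }
    return $ret;
}

my $r = run_with_drain(3, 5, 100);
print "ticks=$r->{ticks} timed_out=$r->{timed_out} drained=$r->{drained}\n";
```

With a timeout of 3 and a drain_timeout of 5, this leaves the
loop on the first tick that is more than 5 seconds past the
timeout. With a boolean flag like this, setting $done = 0 at
that point would keep the loop running.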

-Ethan

> }
> }
> close OUTerr;
>
> Modified: trunk/lib/MTT/Globals.pm
> ==============================================================================
> --- trunk/lib/MTT/Globals.pm (original)
> +++ trunk/lib/MTT/Globals.pm 2006-10-29 08:35:58 EST (Sun, 29 Oct 2006)
> @@ -26,6 +26,8 @@
> hostfile => undef,
> hostlist => undef,
> max_np => undef,
> + textwrap => 76,
> + drain_timeout => 5,
> };
>
> # Reset $Globals per a specific ini file
> @@ -68,6 +70,13 @@
> if ($val) {
> $Values->{textwrap} = $val;
> }
> +
> + # Output display preference
> +
> + my $val = MTT::Values::Value($ini, "MTT", "drain_timeout");
> + if ($val) {
> + $Values->{drain_timeout} = $val;
> + }
> }
>
>
>
> Modified: trunk/samples/ompi-core-template.ini
> ==============================================================================
> --- trunk/samples/ompi-core-template.ini (original)
> +++ trunk/samples/ompi-core-template.ini 2006-10-29 08:35:58 EST (Sun, 29 Oct 2006)
> @@ -91,9 +91,15 @@
> # returned by &env_max_procs(), you can fill in an integer here.
> max_np =
>
> -# Output display preference
> +# OMPI Core: Output display preference; the default width at which MTT
> +# output will wrap.
> textwrap = 76
>
> +# OMPI Core: After the timeout for a command has passed, wait this
> +# many additional seconds to drain all output, and then kill it with
> +# extreme prejudice.
> +drain_timeout = 5
> +
> #======================================================================
> # MPI get phase
> #======================================================================