Sorry for the incomplete description. I will trace the problem more closely
later next week and provide details.
On Mon, Jun 23, 2014 at 10:13 PM, Jeff Squyres (jsquyres) <
> Ok, just got in to Chicago from my flight and am back online.
> Mike: you are still not providing very much information. :-\
> Your first mails make it seem like MTT is continuing to run, but leaving
> "launchers" (assumedly mpirun processes) still running, but they have no
> children. Which would be very weird for mpirun to do, if it has no
> children left. This could be both an MTT and an ORTE bug, in this case.
> But your last mail seems to imply that MTT is hanging indefinitely.
> Can you please provide a clear, precise description of what is happening?
> FWIW: Yes, we are killing the parent first now, to give mpirun a chance to
> clean up / tell remote orteds to die / kill child processes / etc.
> Killing the children first neither tests the common case of how people
> kill MPI processes (i.e., they kill mpirun) nor allows mpirun to tell
> remote processes to die.
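
To make the parent-first ordering concrete, here is a rough sketch in Python. It is not MTT's actual code: the name stop_launcher, the 5-second grace period, and the escalation to SIGKILL are all my own assumptions, and a sleeping Python child stands in for mpirun.

```python
import signal
import subprocess
import sys


def stop_launcher(proc, grace_seconds=5.0):
    """Signal the launcher itself first so it can clean up its own children.

    Returns the name of the signal that was needed to stop it.
    Hypothetical helper for illustration only.
    """
    proc.send_signal(signal.SIGTERM)      # polite request: lets the launcher clean up
    try:
        proc.wait(timeout=grace_seconds)  # wait() also reaps it, so no zombie is left
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()                       # assumed escalation: SIGKILL cannot be caught
        proc.wait()
        return "SIGKILL"


if __name__ == "__main__":
    # Stand-in for mpirun: a child process that just sleeps.
    launcher = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print(stop_launcher(launcher), launcher.returncode)
```

The point of the ordering is that the launcher gets the signal while its children are still alive, so it can still forward the shutdown to them.
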
> Do you run with --verbose output? MTT should output messages like "***
> Killing mpirun with SIGTERM", and the like. Do you see timeout messages at
> all? I.e., is MTT not entering the timeout code at all?
> On Jun 23, 2014, at 12:16 PM, Dave Goodell (dgoodell) <dgoodell_at_[hidden]>
> > On Jun 23, 2014, at 8:48 AM, Mike Dubman <miked_at_[hidden]>
> >> btw, I think that now, when the parent process is killed before the child,
> >> the OS marks the child as "<defunct>", which sticks around for good.
> > The grandparent should inherit the child. If the grandparent then does
> > not wait(2) on the child, then the child will remain a zombie / defunct.
> > So in our specific case, this behavior will depend on what the parent
> > process of mpirun is and whether it is waiting on child processes.
> > -Dave
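
A small sketch of the wait(2) point, assuming Linux and Python 3 (os.waitid is not available everywhere). The helper names exists and demo are mine. It only shows that an exited child keeps its process-table entry until whoever owns it calls wait; it does not model who adopts an orphan. As far as I know, on Linux that is normally init (PID 1) or a registered subreaper rather than the literal grandparent.

```python
import os


def exists(pid):
    """Return True if pid still has a process-table entry (zombies count)."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
        return True
    except ProcessLookupError:
        return False


def demo():
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately

    # Parent: block until the child has exited, but do NOT reap it (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = exists(pid)  # child is a zombie here: exited, not yet waited on

    os.waitpid(pid, 0)    # reap: this is the wait(2) step
    after = exists(pid)   # the entry should now be gone
    return before, after


if __name__ == "__main__":
    print(demo())  # expected on Linux: (True, False)
```

So a "<defunct>" entry that never goes away suggests that whichever process currently owns the child is not calling wait on it.
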
> > _______________________________________________
> > mtt-devel mailing list
> > mtt-devel_at_[hidden]
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel
> > Link to this post:
> Jeff Squyres