MTT Devel Mailing List Archives

Subject: Re: [MTT devel] fix zombie commit
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-02-25 11:24:28


On Feb 24, 2013, at 6:59 AM, Mike Dubman <miked_at_[hidden]> wrote:

> What protection do you mean? Check that /proc/pid/status exists? It is done in Grep()

Ah, excellent -- I hadn't noticed that.

> We observe that a process which was launched by MTT and hangs (MTT detects the timeout and starts the do_command procedure) later enters the "defunct" state.

Looking at the code, you're checking for zombie status before MTT kills the proc. Am I reading that right?
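To illustrate the state in question: on Linux, a child that has exited but has not yet been waited on shows up as "Z" (zombie) on the "State:" line of /proc/<pid>/status, and the entry disappears once the parent reaps it. The following is a minimal Python sketch of that behavior, not MTT's Perl code; proc_state() and zombie_then_reaped() are hypothetical names used only for illustration, and the sketch assumes a Linux /proc filesystem.

```python
import subprocess
import sys
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/status, or None if the
    entry does not exist (process gone or already reaped). Linux only."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("State:"):
                    return line.split()[1]
    except (FileNotFoundError, ProcessLookupError):
        return None
    return None


def zombie_then_reaped():
    """Start a child that exits immediately, observe its state before and
    after reaping, and return (state_before_wait, state_after_wait)."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])

    # Poll /proc without calling wait(): the child exits on its own,
    # but nobody has collected its exit status yet.
    deadline = time.time() + 5.0
    while proc_state(child.pid) != "Z" and time.time() < deadline:
        time.sleep(0.01)
    before = proc_state(child.pid)

    child.wait()                      # reap the child
    after = proc_state(child.pid)     # /proc entry should now be gone
    return before, after


if __name__ == "__main__":
    print(zombie_then_reaped())
```

A zombie seen before the parent has waited on the child is therefore not necessarily a sign of a stuck process; it may only mean the exit status has not been collected yet.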

If so, then it could well be that the process has exited but not yet been reaped (because _kill_proc() hasn't been invoked yet). If this is the case, is the real cause of the problem that the OUTread and ERRread aren't being closed when the child process exits, and therefore we keep looping looking for new output from them?
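As a sketch of the pattern I have in mind (in Python rather than MTT's Perl, and not taken from the MTT source): read both pipes with select(), treat a zero-length read as EOF, close that handle and stop watching it, and only reap the child once both handles are closed. run_and_drain() and its idle_timeout parameter are hypothetical names for illustration; the timeout here is per-select idle time, not a total run time limit.

```python
import os
import select
import subprocess
import sys


def run_and_drain(argv, idle_timeout=10.0):
    """Run argv, collect stdout/stderr until both reach EOF, then reap.
    Returns (stdout_bytes, stderr_bytes, exit_status)."""
    child = subprocess.Popen(argv, stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
    # Map each pipe's file descriptor to a label and its stream object.
    open_fds = {
        child.stdout.fileno(): ("out", child.stdout),
        child.stderr.fileno(): ("err", child.stderr),
    }
    chunks = {"out": [], "err": []}

    while open_fds:
        ready, _, _ = select.select(list(open_fds), [], [], idle_timeout)
        if not ready:
            # No output within idle_timeout: assume a hang and kill.
            child.kill()
            break
        for fd in ready:
            name, stream = open_fds[fd]
            data = os.read(fd, 4096)
            if data:
                chunks[name].append(data)
            else:
                # Zero-length read means EOF: the writer side is closed,
                # so close our end and stop selecting on it.
                stream.close()
                del open_fds[fd]

    # Close anything still open (only happens on the timeout path).
    for _, stream in open_fds.values():
        stream.close()

    status = child.wait()  # reap the child so it does not stay a zombie
    return b"".join(chunks["out"]), b"".join(chunks["err"]), status


if __name__ == "__main__":
    cmd = [sys.executable, "-c",
           "import sys; print('hi'); sys.stderr.write('oops')"]
    print(run_and_drain(cmd))
```

If the loop never drops a handle on EOF, select() keeps reporting it as readable and the loop spins on empty reads, which is the kind of behavior I am asking about.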

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/