Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Parent terminates when child crashes/terminates (without finalizing)
From: N.M. Maclaren (nmm1_at_[hidden])
Date: 2010-12-18 11:06:44


On Dec 18 2010, Ken Lloyd wrote:
>
>Yes, this is a hard problem. It is not endemic to OpenMPI, however.
>This hints at the distributed memory/process/thread issues either
>through the various OSs or alternately external to them in many solution
>spaces.

Absolutely. I hope that I never implied anything different. I found
it hard enough to write a reliable nohup that would spawn a process
that would outlast the termination of its caller! I had to add a new
hack for every new version of Unix that I used it under.
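
For readers who have not tried this, the basic step can be sketched as follows. This is only an illustration, not the nohup described above: it assumes a POSIX system and Python 3, the name spawn_detached is hypothetical, and it covers only the session and standard-stream part of the problem, none of the per-Unix hacks.

```python
import subprocess


def spawn_detached(argv):
    """Start argv in a new session, detached from the caller's terminal.

    start_new_session=True makes the child call setsid() before exec, so
    it leaves the caller's session and process group. A hangup delivered
    to the caller's terminal is then no longer sent to the child.
    """
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # do not keep the caller's terminal open
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )
```

A call such as spawn_detached(["sleep", "600"]) returns at once. Whether the child really survives depends on the environment: a batch scheduler or an MPI launcher may track and kill descendants by other means.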

Luckily, I didn't need it to work under the batch schedulers or MPI,
so I could ignore those issues. My problems THERE were the converse:
cleaning up all of the child processes without killing too many
extraneous processes.
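
One common way to limit the damage is to put the children in their own process group and signal only that group. The sketch below assumes POSIX and Python 3; run_in_group and kill_group are hypothetical names. It still misses any descendant that calls setsid() itself, and it does not wait for group members other than the leader.

```python
import os
import signal
import subprocess


def run_in_group(argv):
    """Start argv as the leader of a new session and process group."""
    return subprocess.Popen(argv, start_new_session=True)


def kill_group(proc, grace=2.0):
    """Send SIGTERM to the child's process group, then SIGKILL if needed.

    Only processes still in that group are signalled, so unrelated
    processes are left alone. Returns the leader's exit status.
    """
    pgid = proc.pid  # after setsid() the group id equals the leader's pid
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        return proc.wait()  # the group has already gone
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        os.killpg(pgid, signal.SIGKILL)  # leader ignored SIGTERM
        return proc.wait()
```

The grace period is a judgment call: too short and children cannot clean up, too long and a shutdown stalls on a hung process.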

>Jeff Squyres's statement that "flexible dynamic processing is not
>something many people would ask for" is troubling. Do pthreads provide
>such a great solution strategy to these problems?

Clearly a rhetorical question - but, for anyone who doesn't know: no way,
Jose :-( There isn't even a reliable kill that doesn't rely on a handshake
with the child.
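
The handshake can be illustrated with Python's threading module, used here as a stand-in for pthreads (Python threads have no kill operation at all). The names worker and stop_thread are hypothetical. The point is that stopping only works because the worker agrees to check the flag.

```python
import threading


def worker(stop, done):
    """Loop until asked to stop, then record that cleanup ran."""
    while not stop.is_set():
        stop.wait(0.01)  # stand-in for one unit of real work
    done.append("cleaned up")


def stop_thread(thread, stop, timeout=1.0):
    """Ask the worker to stop and report whether it actually did.

    If the worker never checks the flag (for example, it is stuck in a
    blocking call), this returns False and the thread keeps running.
    """
    stop.set()
    thread.join(timeout)
    return not thread.is_alive()
```

A worker blocked in a system call or a tight loop never sees the request, which is the unreliability described above.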

>In other words, if we were to offer a true "flexible dynamic
>processing" (which I personally would advocate), would they (the
>developers and users) come?

Jeff says not. That may be because there is little requirement, it may
be because the people who have tried have got their fingers burnt, or it
may be because they have consulted some cynical old sod like me who told
them it was just asking for trouble.

I would have major problems just SPECIFYING it. MPI 2.2 is fine as far
as it goes, but an implementation would have to decide what to do about
all of the inheritable state, from file descriptors through signal handling
through shared memory segments through 'capabilities' through .... No
matter what decision is taken, it will be wrong for some people.
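
File descriptors alone show how much is a matter of policy. The sketch below assumes POSIX and Python 3, whose subprocess module closes descriptors above 2 in the child unless they are listed in pass_fds. That is one possible decision, not the one an MPI implementation would necessarily take. The names CHECK and child_sees_fd are hypothetical.

```python
import subprocess
import sys

# Program run in the child: report whether descriptor argv[1] is open.
CHECK = """
import os, sys
try:
    os.fstat(int(sys.argv[1]))
    print("open")
except OSError:
    print("closed")
"""


def child_sees_fd(fd, pass_it):
    """Spawn a child and return "open" or "closed" for fd as it sees it."""
    result = subprocess.run(
        [sys.executable, "-c", CHECK, str(fd)],
        pass_fds=(fd,) if pass_it else (),  # explicit opt-in to inheritance
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

The same descriptor is visible or invisible to the child depending on a single argument. Signal dispositions, shared memory segments and the rest each need a comparable decision.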

Regards,
Nick Maclaren.