Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Parent terminates when child crashes/terminates (without finalizing)
From: N.M. Maclaren (nmm1_at_[hidden])
Date: 2010-12-18 07:15:14


On Dec 17 2010, Jeff Squyres wrote:
>
> It's not an unknown problem -- as George and Ralph were trying to say, it
> was a design decision on our part.
>
> Sadly, flexible dynamic processing is not something that many people ask
> for. We have invested time in it over the years to get it working and have
> a baseline functionality level. Beyond that, we unfortunately simply
> haven't had enough requests to justify spending time to do stuff like you
> suggest (e.g., allow abnormal termination of MPI-disconnected processes
> to not also take down previously-connected processes). :-(

And my responses (which were probably confusing) were meant as a hint as
to WHY it is a hard problem. I have a lot of experience at this level
across a very wide range of systems, and it's something that I would hate
to have to implement even for a single system - let alone for the range
of systems that Open MPI supports.

I could tell you some horror stories of processes owned by one user taking
down ones owned by OTHER users, because the controlling terminal had been
reused. And, upon investigation, it wasn't even possible to identify a
bug in any of the programs or operating system - it was merely a "gotcha"
that had sneaked through the cracks in the specifications and bitten me
in a painful place.

The following is what I teach about it in my course (in full):

    You can add groups of processes dynamically \break
    {\cyan MPI-2} is probably the best way to do this \break

    \bully My recommendation is don't even {\magenta think} of it \break

    This was a nightmare area in {\cyan PVM} \break
    The potential system problems are unbelievable \break

    And that is even if you are your own {\sky administrator} \break
    If you aren't, you may get strangled for using this \break

Regards,
Nick Maclaren.