I would tend to agree with Paul.
It's uncommon (e.g., no one has run into this before now), and I would say that this is a bad application. But then again, hanging is bad -- so it would be better to abort/terminate the whole job in this scenario.
I don't know how I would rate the priority of this, but it would be nice to have someday.
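To make the "abort the whole job" idea concrete, here is a rough sketch of the decision a launcher could make. This is not Open MPI code: the state names and the `should_abort_job` function are hypothetical, made up purely for illustration, and the policy is an assumption. The job is aborted only once a process has exited without ever calling MPI_Init while at least one peer is already inside MPI.

```python
from enum import Enum


class ProcState(Enum):
    """Hypothetical per-process states a launcher might track."""
    PRE_INIT = "running, has not called MPI_Init yet"
    IN_MPI = "called MPI_Init, has not called MPI_Finalize"
    FINALIZED = "called MPI_Finalize"
    EXITED_PRE_INIT = "exited cleanly without ever calling MPI_Init"
    EXITED_CLEAN = "exited after MPI_Finalize"


def should_abort_job(states):
    """Return True if the job should be terminated instead of left to hang.

    Assumed policy: a process that exits before MPI_Init can never join
    its peers, so any peer already inside MPI would wait forever.
    If no peer has entered MPI, nothing is stuck and the job is left alone.
    """
    exited_early = any(s is ProcState.EXITED_PRE_INIT for s in states)
    peer_in_mpi = any(s is ProcState.IN_MPI for s in states)
    return exited_early and peer_in_mpi
```

With this policy, one early exit plus peers sitting in MPI_Init triggers an abort, while a job in which no process has entered MPI is not touched.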
On Dec 15, 2009, at 11:17 PM, Ralph Castain wrote:
> Understandable - and we can count on your patch in the near future, then? :-)
>
> On Dec 15, 2009, at 9:12 PM, Paul H. Hargrove wrote:
>
> > My 0.02USD says that for pragmatic reasons one should attempt to terminate the job in this case, regardless of one's opinion of this unusual application behavior.
> >
> > -Paul
> >
> > Ralph Castain wrote:
> >> Hi folks
> >>
> >> In case you didn't follow this on the user list, we had a question come up about proper OMPI behavior. Basically, the user has an application where one process decides it should cleanly terminate prior to calling MPI_Init, but all the others go ahead and enter MPI_Init. The application hangs since we don't detect the one proc's exit as an abnormal termination (no segfault, and it didn't call MPI_Init so it isn't required to call MPI_Finalize prior to termination).
> >>
> >> I can probably come up with a way to detect this scenario and abort it. But before I spend the effort chasing this down, my question to you MPI folks is:
> >>
> >> What -should- OMPI do in this situation? We have never previously detected such behavior - was this an oversight, or is this simply a "bad" application?
> >>
> >> Thanks
> >> Ralph
> >>
> >>
> >> _______________________________________________
> >> devel mailing list
> >> devel_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>
> >
> >
> > --
> > Paul H. Hargrove PHHargrove_at_[hidden]
> > Future Technologies Group Tel: +1-510-495-2352
> > HPC Research Department Fax: +1-510-486-6900
> > Lawrence Berkeley National Laboratory
>
>
>
--
Jeff Squyres
jsquyres_at_[hidden]