Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] proper use of MPI_Abort
From: Andrus, Brian Contractor (bdandrus_at_[hidden])
Date: 2013-11-07 14:13:17


Jeff,

Good to know. Thanks!


It seems MPI_ABORT should really only be used in error traps after MPI has been initialized.
Code-wise, the sample I got was not the best. Usage should be checked before MPI_Init, I think :)

It seems the expectation is that MPI_ABORT is called only when the user needs to be notified that something went haywire.
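
A minimal sketch of that idea, under some assumptions: the command line is already usable before MPI_Init (true for Open MPI in practice), the helper name find_input_file and the WITH_MPI build guard are made up for illustration, and the fopen check merely stands in for whatever real runtime error the program might hit. The usage check exits normally before MPI_Init; MPI_Abort is kept for failures after initialization.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original sample): return the filename
 * that follows "-input", or NULL if the option or its value is missing. */
const char *find_input_file(int argc, char **argv)
{
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-input") == 0)
            return argv[i + 1];
    }
    return NULL;
}

#ifdef WITH_MPI   /* assumed build line: mpicc -DWITH_MPI abort.c -o abort */
#include <mpi.h>

int main(int argc, char **argv)
{
    /* Usage problem: plain exit before MPI_Init, so MPI_Abort is not involved.
     * No rank is known yet, so every process prints the usage line. */
    const char *input = find_input_file(argc, argv);
    if (input == NULL) {
        fprintf(stderr, "\n*# Usage: mpicpy -input <filename>\n\n");
        return 1;
    }

    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Runtime failure after MPI_Init: this is where MPI_Abort fits. */
    if (rank == 0) {
        FILE *fp = fopen(input, "r");
        if (fp == NULL) {
            perror(input);
            MPI_Abort(MPI_COMM_WORLD, 1);
        }
        fclose(fp);
    }

    MPI_Finalize();
    return 0;
}
#endif
```

The trade-off is that, run under mpirun -np 4, the usage line appears once per process rather than once on rank 0, and mpirun may still report the non-zero exit status.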


Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238




> -----Original Message-----
> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff
> Squyres (jsquyres)
> Sent: Wednesday, November 06, 2013 11:30 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] proper use of MPI_Abort
>
> I just checked the v1.7 series -- it looks like we have cleaned up this
> message a bit. With your code snippet:
>
> -----
> ❯❯❯ mpicc abort.c -o abort && mpirun -np 4 abort
>
> *# Usage: mpicpy -input <filename>
>
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode 1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on exactly
> when Open MPI kills them.
> --------------------------------------------------------------------------
> ----
>
> Notice the lack of the 2nd message.
>
> So I think the answer here is: it's fixed in the 1.7.x series. It is unlikely to be
> fixed in the 1.6.x series.
>
>
>
> On Nov 5, 2013, at 3:16 PM, "Andrus, Brian Contractor" <bdandrus_at_[hidden]>
> wrote:
>
> > Jeff,
> >
> > We are using the latest version: 1.6.5
> >
> >
> > Brian Andrus
> > ITACS/Research Computing
> > Naval Postgraduate School
> > Monterey, California
> > voice: 831-656-6238
> >
> >
> >
> >> -----Original Message-----
> >> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff
> >> Squyres (jsquyres)
> >> Sent: Tuesday, November 05, 2013 5:11 AM
> >> To: Open MPI Users
> >> Subject: Re: [OMPI users] proper use of MPI_Abort
> >>
> >> You're correct -- you don't need to call MPI_Finalize after MPI_Abort.
> >>
> >> Can you cite what version of Open MPI you are using?
> >>
> >>
> >> On Nov 4, 2013, at 9:01 AM, "Andrus, Brian Contractor"
> >> <bdandrus_at_[hidden]>
> >> wrote:
> >>
> >>> All,
> >>>
> >>> I have some sample code that prints a usage message and then calls
> >>> MPI_Abort if the program is run without the required parameters.
> >>> ------snip---------------
> >>> if (!rank) {
> >>>     i = 1;
> >>>     while ((i < argc) && strcmp("-input", *argv)) {
> >>>         i++;
> >>>         argv++;
> >>>     }
> >>>     if (i >= argc) {
> >>>         fprintf(stderr, "\n*# Usage: mpicpy -input <filename> \n\n");
> >>>         MPI_Abort(MPI_COMM_WORLD, 1);
> >>>     }
> >>> ----------snip---------------
> >>>
> >>> This is all well and good and it does provide the usage line, but it
> >>> also throws quite a message in addition:
> >>>
> >>> --------------------------------------------------------------------------
> >>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> >>> with errorcode 1.
> >>>
> >>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> >>> You may or may not see output from other processes, depending on
> >>> exactly when Open MPI kills them.
> >>> --------------------------------------------------------------------------
> >>> --------------------------------------------------------------------------
> >>> mpirun has exited due to process rank 0 with PID 40209 on node
> >>> compute-3-3 exiting improperly. There are two reasons this could occur:
> >>>
> >>> 1. this process did not call "init" before exiting, but others in
> >>> the job did. This can cause a job to hang indefinitely while it
> >>> waits for all processes to call "init". By rule, if one process
> >>> calls "init", then ALL processes must call "init" prior to termination.
> >>>
> >>> 2. this process called "init", but exited without calling "finalize".
> >>> By rule, all processes that call "init" MUST call "finalize" prior
> >>> to exiting or it will be considered an "abnormal termination"
> >>>
> >>> This may have caused other processes in the application to be
> >>> terminated by signals sent by mpirun (as reported here).
> >>> --------------------------------------------------------------------------
> >>>
> >>> Is there a proper way to use MPI_Abort such that it will not trigger
> >>> such a message?
> >>> It almost seems that MPI_Abort should be calling MPI_Finalize as a
> >>> rule, or Open MPI should recognize MPI_Abort is the exception to
> >>> requiring MPI_Finalize.
> >>>
> >>>
> >>>
> >>> Brian Andrus
> >>> ITACS/Research Computing
> >>> Naval Postgraduate School
> >>> Monterey, California
> >>> voice: 831-656-6238
> >>>
> >>>
> >>> _______________________________________________
> >>> users mailing list
> >>> users_at_[hidden]
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>
> >>
> >> --
> >> Jeff Squyres
> >> jsquyres_at_[hidden]
> >> For corporate legal information go to:
> >> http://www.cisco.com/web/about/doing_business/legal/cri/
> >>
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>