
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] proper use of MPI_Abort
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-11-06 14:30:20


I just checked the v1.7 series -- it looks like we have cleaned up this message a bit. With your code snippet:

-----
❯❯❯ mpicc abort.c -o abort && mpirun -np 4 abort

*# Usage: mpicpy -input <filename>

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
----

Notice that the second message (the mpirun "exiting improperly" notice) is gone.

So I think the answer here is: it's fixed in the 1.7.x series. It is unlikely to be fixed in the 1.6.x series.
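
For anyone who wants to reproduce this, here is a minimal sketch of the argument check in C. The helper name find_input_file is hypothetical (it is not in the original snippet), and it is written without any MPI dependency so it can be compiled on its own. The comment at the bottom shows where the usage message and MPI_Abort from the original snippet would go; as noted earlier in this thread, no MPI_Finalize is needed after MPI_Abort.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of the original snippet): scan the command
 * line for "-input" followed by one more argument, and return that argument.
 * Returns NULL if "-input" is missing or is the last argument.
 */
static const char *find_input_file(int argc, char **argv)
{
    /* Start at 1 to skip the program name; stop early enough that
       argv[i + 1] is always a real argument. */
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-input") == 0)
            return argv[i + 1];
    }
    return NULL;
}

/*
 * Assumed call site on rank 0 (needs <stdio.h> and <mpi.h>, built with mpicc):
 *
 *   if (rank == 0 && find_input_file(argc, argv) == NULL) {
 *       fprintf(stderr, "\n*# Usage: mpicpy -input <filename> \n\n");
 *       MPI_Abort(MPI_COMM_WORLD, 1);
 *   }
 */
```

Unlike the loop in the original snippet, this sketch leaves argv untouched and hands back the filename directly, which is a design choice of the sketch rather than anything Open MPI requires.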



On Nov 5, 2013, at 3:16 PM, "Andrus, Brian Contractor" <bdandrus_at_[hidden]> wrote:

> Jeff,
>
> We are using the latest version: 1.6.5
>
>
> Brian Andrus
> ITACS/Research Computing
> Naval Postgraduate School
> Monterey, California
> voice: 831-656-6238
>
>
>
>> -----Original Message-----
>> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff
>> Squyres (jsquyres)
>> Sent: Tuesday, November 05, 2013 5:11 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] proper use of MPI_Abort
>>
>> You're correct -- you don't need to call MPI_Finalize after MPI_Abort.
>>
>> Can you cite what version of Open MPI you are using?
>>
>>
>> On Nov 4, 2013, at 9:01 AM, "Andrus, Brian Contractor" <bdandrus_at_[hidden]>
>> wrote:
>>
>>> All,
>>>
>>> I have some sample code that prints a usage message and then calls
>>> MPI_Abort if the program is run without the required parameters.
>>> ------snip---------------
>>> if (!rank) {
>>>     i = 1;
>>>     while ((i < argc) && strcmp("-input", *argv)) {
>>>         i++;
>>>         argv++;
>>>     }
>>>     if (i >= argc) {
>>>         fprintf(stderr, "\n*# Usage: mpicpy -input <filename> \n\n");
>>>         MPI_Abort(MPI_COMM_WORLD, 1);
>>>     }
>>> ----------snip---------------
>>>
>>> This is all well and good and it does provide the usage line, but it also
>>> throws quite a message in addition:
>>>
>>> --------------------------------------------------------------------------
>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>>> with errorcode 1.
>>>
>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>>> You may or may not see output from other processes, depending on
>>> exactly when Open MPI kills them.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> mpirun has exited due to process rank 0 with PID 40209 on node
>>> compute-3-3 exiting improperly. There are two reasons this could occur:
>>>
>>> 1. this process did not call "init" before exiting, but others in the
>>> job did. This can cause a job to hang indefinitely while it waits for
>>> all processes to call "init". By rule, if one process calls "init",
>>> then ALL processes must call "init" prior to termination.
>>>
>>> 2. this process called "init", but exited without calling "finalize".
>>> By rule, all processes that call "init" MUST call "finalize" prior to
>>> exiting or it will be considered an "abnormal termination"
>>>
>>> This may have caused other processes in the application to be
>>> terminated by signals sent by mpirun (as reported here).
>>> --------------------------------------------------------------------------
>>>
>>> Is there a proper way to use MPI_Abort such that it will not trigger such a
>>> message?
>>> It almost seems that MPI_Abort should be calling MPI_Finalize as a rule, or
>>> openmpi should recognize MPI_Abort is the exception to requiring
>>> MPI_Finalize.
>>>
>>>
>>>
>>> Brian Andrus
>>> ITACS/Research Computing
>>> Naval Postgraduate School
>>> Monterey, California
>>> voice: 831-656-6238
>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>


--
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/