On Apr 14, 2011, at 4:02 AM, N.M. Maclaren wrote:
> ... It's hopeless, and whatever you do will be wrong for many
> people. ...
I think that sums it up pretty well. :-)
It does seem a little strange that the scenario you describe somewhat implies that one process is calling MPI_Finalize loooong before the others do. Specifically, the user is concerned about tying up resources after one process has called Finalize -- which implies that the others may continue on for a while. It's not invalid, of course, but it is a little unusual.
I see two possibilities here:
1. have the user delay calling MPI_Finalize in the application until it can do the test that indicates that the rest of the job should be aborted (i.e., so that it can still call MPI_Abort if it wants to). Don't forget that an implementation is allowed to block in MPI_Finalize until all processes call MPI_Finalize, anyway.
2. add an MCA param and/or orterun CLI option to abort a job if an MPI process terminates after MPI_Finalize with a nonzero exit status.
Just my $0.02. :-)