David Zhang wrote:
> When my MPI code fails (seg fault), it usually causes the rest of the MPI
> processes to abort as well. Rather than calling abort(), perhaps
> you could do a divide-by-zero operation to halt the program?
> David Zhang
> University of California, San Diego
> On Thu, Aug 12, 2010 at 6:49 PM, David Ronis <David.Ronis_at_[hidden]> wrote:
> I've got an MPI program that is supposed to generate a core file if
> problems arise on any of the nodes. I tried to do this by adding a
> call to abort() to my exit routines but this doesn't work; I get no core
> file, and worse, mpirun doesn't detect that one of my nodes has
> aborted(?) and doesn't kill off the entire job, except in the trivial
> case where the number of processors I'm running on is 1. I've replaced
> abort with MPI_Abort, which kills everything off, but leaves no core
> file. Any suggestions how I can get one and still have mpi exit?
> Thanks in advance.
Also, make sure the core file size limit on your computers
(coredumpsize in csh/tcsh, "core file size" in bash) is not zero,
which is sometimes the case.
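
From a shell you can check the limit with "ulimit -c" (bash) or
"limit coredumpsize" (csh/tcsh). The program can also check and raise the
limit itself. Below is a minimal sketch, assuming a POSIX system. The helper
name enable_core_dumps() is made up for illustration and is not part of MPI
or of the original poster's code. It raises the soft limit as far as the
hard limit allows.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper (not from the thread): raise the soft core-file
 * size limit to the hard limit so that abort() or a fatal signal can
 * leave a core file. Returns 0 on success, -1 on error. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("getrlimit(RLIMIT_CORE)");
        return -1;
    }

    /* A soft limit of zero means no core file is written at all. */
    if (lim.rlim_cur == 0)
        fprintf(stderr, "soft core limit is 0, trying to raise it\n");

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;
    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("setrlimit(RLIMIT_CORE)");
        return -1;
    }
    return 0;
}
```

One option is to call this early in every process, for example right after
MPI_Init(), since the limit on the remote nodes may differ from the one in
your login shell. If the hard limit itself is zero, the call still succeeds
but no core file will be written, and the limit has to be changed by the
administrator or in the shell startup files instead.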