I am usually the one who responds to checkpoint/restart questions,
but unfortunately I do not have time to look into this issue at the
moment (and probably won't for at least a few more months). There are
a few other developers who work on the checkpoint/restart
functionality and might be able to help you sooner.
Hopefully they will chime in.
About a year ago, I was able to
checkpoint/restart the NAS benchmarks (and other applications) without
issue. From the error message that you posted earlier, it seems that
something has broken in the 1.6 branch. Unfortunately, I do not have
any advice on an alternative branch to try. The C/R functionality in
the Open MPI trunk is known to be broken. There is a patch for the
trunk making its way through testing at the moment. Once that is
committed, you should be able to use the Open MPI trunk until
someone fixes the 1.6 branch.
Sorry I cannot be of much help. Hopefully others can assist.
On Tue, Jun 19, 2012 at 1:22 AM, Ifeanyi <ifeanyeg2012_at_[hidden]> wrote:
> Please help.
> I configured the open mpi and it can checkpoint HPL.
> However, whenever I want to checkpoint NAS parallel benchmark it kills the
> application without informative message.
> Please how do I configure the openmpi 1.6 to checkpoint NPB? I really need a
> help, I have been on this issue for the past few days without solution
Postdoctoral Research Associate
Oak Ridge National Laboratory