Dear all.
Open MPI's checkpoint does not work for my GASNet applications that
use the MPI conduit.
I wrote some code with the GASNet API (Global-Address Space Networking:
http://gasnet.cs.berkeley.edu/)
and used the MPI conduit for my GASNet application, so my program ran well
under Open MPI's mpirun. I therefore thought that I could also use the
transparent checkpoint/restart function that Open MPI supports through BLCR.
Contrary to my expectation, it does not work and shows the following error
message.
--------------------------------------------------------------------------
Error: The process with PID 13896 is not checkpointable.
       This could be due to one of the following:
        - An application with this PID doesn't currently exist
        - The application with this PID isn't checkpointable
        - The application with this PID isn't an OPAL application.
       We were looking for the named files:
         /tmp/opal_cr_prog_write.13896
         /tmp/opal_cr_prog_read.13896
--------------------------------------------------------------------------
1 more process has sent help message help-opal-checkpoint.txt
Set MCA parameter "orte_base_help_aggregate" to 0 to see all help
0] 13896) Step 53
0] 15100) Step 53
0] 13896) Step 54
0] 15100) Step 54
0] 13896) Step 55
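
The error message says the checkpoint tool looked for two named files per process. To narrow down which of the listed causes applies, one could check whether those files exist for a given PID while the application is running. The following is only a minimal sketch in Python. It assumes the file naming pattern is exactly as printed in the message above (`/tmp/opal_cr_prog_write.<PID>` and `/tmp/opal_cr_prog_read.<PID>`). The function names are hypothetical helpers and not part of Open MPI.

```python
import os


def opal_cr_file_paths(pid, tmp_dir="/tmp"):
    """Return the two paths the error message says were looked for.

    Assumption: the pattern is taken verbatim from the error text,
    <tmp_dir>/opal_cr_prog_write.<pid> and <tmp_dir>/opal_cr_prog_read.<pid>.
    """
    return [
        os.path.join(tmp_dir, "opal_cr_prog_%s.%d" % (kind, pid))
        for kind in ("write", "read")
    ]


def missing_opal_cr_files(pid, tmp_dir="/tmp"):
    """Return the subset of the expected paths that do not exist."""
    return [p for p in opal_cr_file_paths(pid, tmp_dir) if not os.path.exists(p)]
```

For example, calling `missing_opal_cr_files(13896)` while the job is still running returns the expected paths that are absent. If both are missing, the process presumably never created them, which would match the "isn't checkpointable" or "isn't an OPAL application" cases in the message.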