I tried to use checkpoint/restart with Open MPI,
but I cannot get correct checkpoint data.
I prepared the execution environment as follows. The strings in () are the
names of the output files attached to my next e-mail ( for mail size
1. installed BLCR and checked that BLCR works correctly with "make check"
2. executed ./configure with some parameters in the Open MPI source dir
(config.output / config.log)
3. executed make and make install (make.output.2 / install.output.2)
4. confirmed that mca_crs_blcr.[la|so], mca_crs_self.[la|so] on
5. created ~/.openmpi/mca-params.conf (mca-params.conf)
6. compiled NPB and executed it with -am ft-enable-cr
7. invoked ompi-checkpoint <MPIRUN_PID>
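
Steps 6 and 7 above can be sketched as the two command lines below. This is
only an illustration: the helper names, the benchmark binary "./bt.A.4", the
process count, and the PID are hypothetical placeholders, not the values from
the actual run. Only "-am ft-enable-cr" and "ompi-checkpoint <MPIRUN_PID>"
come from the steps listed above.

```python
import shlex


def launch_command(app, nprocs):
    """Build the mpirun command line that enables checkpoint/restart (step 6)."""
    return ["mpirun", "-np", str(nprocs), "-am", "ft-enable-cr", app]


def checkpoint_command(mpirun_pid):
    """Build the ompi-checkpoint command line for a running mpirun (step 7)."""
    return ["ompi-checkpoint", str(mpirun_pid)]


if __name__ == "__main__":
    # Placeholder values for illustration only (not from the original run).
    print(shlex.join(launch_command("./bt.A.4", 4)))
    print(shlex.join(checkpoint_command(12345)))
```

The PID passed to ompi-checkpoint is that of the mpirun process, not of an
individual MPI rank.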
As a result, I got the message "Checkpoint failed: no processes checkpointed."
In addition, when I checked the ompi_info output as in your demo movie, I got
"MCA crs: none (MCA v2.0, API v2.0, Component v1.4.1)" (open_info.output)
What should I do to get checkpointing working?
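
The check on the "MCA crs" line quoted above can be sketched as a small
script. This is a minimal sketch, assuming the tool prints one "MCA crs: <name>
(...)" line per component. The function names are hypothetical, and the "blcr"
line in the usage comment is an assumed example of a working build, not
output I actually obtained.

```python
import re


def crs_components(info_text):
    """Return the component names found on 'MCA crs:' lines of the given text."""
    names = []
    for line in info_text.splitlines():
        match = re.match(r"\s*MCA crs:\s*(\S+)", line)
        if match:
            names.append(match.group(1))
    return names


def has_usable_crs(info_text):
    """True if at least one crs component other than 'none' is listed."""
    return any(name != "none" for name in crs_components(info_text))


if __name__ == "__main__":
    # The line reported in this message.
    reported = "MCA crs: none (MCA v2.0, API v2.0, Component v1.4.1)"
    print(crs_components(reported), has_usable_crs(reported))
    # Assumed (hypothetical) line for a build where the BLCR component is found.
    assumed = "MCA crs: blcr (MCA v2.0, API v2.0, Component v1.4.1)"
    print(crs_components(assumed), has_usable_crs(assumed))
```

With the line I reported, only "none" is found, which seems consistent with
the "no processes checkpointed" failure.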
Any guidance in this regard would be highly appreciated.
Hideyuki Jitsumoto (jitumoto_at_[hidden])
Tokyo Institute of Technology
Global Scientific Information and Computing center (Matsuoka Lab.)