Open MPI User's Mailing List Archives

Subject: [OMPI users] error with checkpoint in openmpi
From: Tran Hai Quan (tranhaiquan.khtn_at_[hidden])
Date: 2011-05-11 13:17:30


Hi, I am working with MPI.
I have installed Open MPI 1.4.3 with BLCR support included.
I ran a simple MPI application using this hostfile:

pc1 slots=2 max-slots=2
pc2 slots=2 max-slots=2

Then I launched it with checkpoint support enabled:
#mpirun --hostfile myhost -np 4 --am ft-enable-cr ./mpi_app

When I checkpointed the job, I got this error:

[pc1:04836] Error: expected_component: PID information unavailable!
--------------------------------------------------------------------------
Error: The local checkpoint contains invalid or incomplete metadata for Process 3411083265.2.
       This usually indicates that the local checkpoint is invalid.
       Check the metadata file (snapshot_meta.data) in the following directory:
         /root/ompi_global_snapshot_4836.ckpt/0/opal_snapshot_2.ckpt
--------------------------------------------------------------------------
[pc1:04836] [[52049,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 1054
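
The message asks for the metadata file to be checked, so a first step is to see whether it exists and has any content on the machine that holds that directory. A minimal sketch, assuming a POSIX shell; the helper name show_snapshot_meta is hypothetical, and the directory is the one printed in the error above:

```shell
# Hypothetical helper: print snapshot_meta.data from a local snapshot
# directory, or report that the file is missing or empty.
show_snapshot_meta() {
    dir="$1"
    meta="$dir/snapshot_meta.data"
    if [ ! -e "$meta" ]; then
        echo "missing: $meta"
        return 1
    fi
    if [ ! -s "$meta" ]; then
        echo "empty: $meta"
        return 1
    fi
    cat "$meta"
}

# Directory copied from the error message.
show_snapshot_meta /root/ompi_global_snapshot_4836.ckpt/0/opal_snapshot_2.ckpt \
    || echo "metadata could not be read on this machine"
```

A missing or empty file would be consistent with the "PID information unavailable" line, but it does not by itself say why the local checkpoint was left incomplete.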

I would be grateful if anyone could help me.