Subject: [OMPI docs] help me!
From: Yen Phi (ntpyen712_at_[hidden])
Date: 2008-06-21 14:16:33


Hi all,
I ran my job with Open MPI and then checkpointed it; the checkpoint was taken as my job was ending. When I try to restart from that checkpoint, I get the error below. I don't know why. Please help me.
[root@localhost ~]# mpirun -np 4 -am ft-enable-cr hello
[root@localhost ~]# ompi-checkpoint 19632
Snapshot Ref.: 0 ompi_global_snapshot_19632.ckpt
[root@localhost ~]# ompi-restart ompi_global_snapshot_19632.ckpt
[localhost:19649] *** Process received signal ***
[localhost:19649] Signal: Segmentation fault (11)
[localhost:19649] Signal code: Address not mapped (1)
[localhost:19649] Failing at address: 0x1
[localhost:19649] [ 0] [0x110440]
[localhost:19649] [ 1] /usr/local/lib/libopen-rte.so.0(orte_rmaps_base_claim_slot+0x17b) [0x15db1f]
[localhost:19649] [ 2] /usr/local/lib/openmpi/mca_rmaps_round_robin.so [0x23cb84]
[localhost:19649] [ 3] /usr/local/lib/openmpi/mca_rmaps_round_robin.so [0x23d3ae]
[localhost:19649] [ 4] /usr/local/lib/libopen-rte.so.0(orte_rmaps_base_map_job+0x105) [0x15c61d]
[localhost:19649] [ 5] /usr/local/lib/libopen-rte.so.0(orte_plm_base_setup_job+0xd3) [0x156077]
[localhost:19649] [ 6] /usr/local/lib/openmpi/mca_plm_rsh.so [0x1fecc3]
[localhost:19649] [ 7] mpirun [0x804a79d]
[localhost:19649] [ 8] mpirun [0x8049e76]
[localhost:19649] [ 9] /lib/libc.so.6(__libc_start_main+0xe0) [0x9a0390]
[localhost:19649] [10] mpirun [0x8049da1]
[localhost:19649] *** End of error message ***
Segmentation fault
Thanks
Yen