I'm afraid we have lost the developer who supported checkpoint/restart, so we probably won't be able to address this unless he happens to look in at some point. The only suggestion I can make is to not enable the thread options, as thread support is weak at best.
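
As a rough sketch of what that would look like: this is the configure line from your message below with the two thread-related flags dropped. I am assuming --enable-ft-thread and --enable-opal-multi-threads are the thread options in play; everything else is copied unchanged from your command. I can't promise this avoids the crash, only that it takes the weakly supported thread code out of the picture.

```shell
# Same options as the original configure line, minus the two thread flags
# (--enable-ft-thread and --enable-opal-multi-threads), which are assumed
# to be the "thread options" referred to above.
./configure --enable-heterogeneous --enable-cxx-exceptions --enable-shared \
    --enable-orterun-prefix-by-default --enable-mpi-f90 --with-mpi-f90-size=small \
    --with-ft=cr --with-blcr=/opt/blcr --with-blcr-libdir=/opt/blcr/lib
```

You would need to rebuild and reinstall after reconfiguring, then retry the checkpoint.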

On Jan 4, 2013, at 4:34 PM, William Au <au_wai_chung@hotmail.com> wrote:

Hi, 

I encountered a core dump when using ompi-checkpoint --term pid.

Here is the trace:

[genova:01808] *** Process received signal ***
[genova:01808] Signal: Segmentation fault (11)
[genova:01808] Signal code: Address not mapped (1)
[genova:01808] Failing at address: 0x90
[genova:01808] [ 0] /lib64/libpthread.so.0 [0x3a78a0ebe0]
[genova:01808] [ 1] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/openmpi/mca_crcp_bkmrk.so [0x2aaaaefe110b]
[genova:01808] [ 2] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/openmpi/mca_crcp_bkmrk.so [0x2aaaaefe4952]
[genova:01808] [ 3] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/openmpi/mca_crcp_bkmrk.so(ompi_crcp_bkmrk_pml_ft_event+0x74e) [0x2aaaaefe5b9e]
[genova:01808] [ 4] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/openmpi/mca_pml_crcpw.so(mca_pml_crcpw_ft_event+0x59) [0x2aaaacc1eea9]
[genova:01808] [ 5] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/libmpi.so.1(ompi_cr_coord+0xe0) [0x2b95b29a5690]
[genova:01808] [ 6] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/libmpi.so.1(opal_cr_inc_core_prep+0xc) [0x2b95b2a6017c]
[genova:01808] [ 7] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/openmpi/mca_snapc_full.so [0x2aaaab7d9d15]
[genova:01808] [ 8] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/libmpi.so.1(opal_cr_test_if_checkpoint_ready+0x52) [0x2b95b2a60282]
[genova:01808] [ 9] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/libmpi.so.1 [0x2b95b2a60ec1]
[genova:01808] [10] /lib64/libpthread.so.0 [0x3a78a0677d]
[genova:01808] [11] /lib64/libc.so.6(clone+0x6d) [0x3a77ad3c1d]
[genova:01808] *** End of error message ***
[genova:01807] local) Error: Unable to read state from named pipe (/tmp/opal_cr_prog_write.1808). 0
[genova:01807] [[8178,0],0] ORTE_ERROR_LOG: Error in file snapc_full_local.c at line 1602
[genova:01807] local) Error: Unable to read state from named pipe (/tmp/opal_cr_prog_write.1810). 0
[genova:01807] [[8178,0],0] ORTE_ERROR_LOG: Error in file snapc_full_local.c at line 1602
[genova:01807] local) Error: Unable to read state from named pipe (/tmp/opal_cr_prog_write.1809). 0
[genova:01807] [[8178,0],0] ORTE_ERROR_LOG: Error in file snapc_full_local.c at line 1602

I configured with the following options:

./configure --enable-heterogeneous --enable-cxx-exceptions --enable-shared --enable-orterun-prefix-by-default --enable-mpi-f90 --with-mpi-f90-size=small --with-ft=cr --with-blcr=/opt/blcr --with-blcr-libdir=/opt/blcr/lib --enable-ft-thread --enable-opal-multi-threads

I am using Open MPI 1.6.

Any idea where I should look?

Thanks.

Regards,

William