Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Signal: Segmentation fault (11) Signal code: Address not mapped (1)
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2009-09-14 09:31:01


The configuration looks fine, but from the stack it seems that the
segv is coming from an invalid free in BLCR (which seems odd to me).

Are you able to get a gdb backtrace from a core file generated from
this run? That would provide a bit more detail on where things are
going wrong.
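
One way to collect that backtrace is sketched below. It is only a sketch under assumptions: "mpitest" is the binary name from your report, the core file is assumed to land in the current directory with a name starting with "core" (this depends on the system's core_pattern setting), and gdb must be installed on the node.

```shell
# Sketch only: "mpitest" is the binary from the original report; the core
# file name pattern is an assumption and depends on kernel.core_pattern.
app=./mpitest

# Allow core files in this shell (and in processes launched from it).
ulimit -c unlimited 2>/dev/null || echo "could not raise core size limit"

# ... rerun the mpirun command and trigger the checkpoint here ...

# Pick the newest core file in the current directory, if any.
core=$(ls -t core* 2>/dev/null | head -n 1)

if [ -n "$core" ] && command -v gdb >/dev/null 2>&1; then
    # Batch mode: print a backtrace for every thread, then exit.
    gdb -batch -ex "thread apply all bt" "$app" "$core"
else
    echo "no core file (or no gdb) found; nothing to inspect"
fi
```

Since the stack shows the segv on a separate thread, the backtrace for all threads is likely more useful than that of the current thread alone.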

What version of BLCR are you running? Does BLCR work with sequential
applications?
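
A quick sequential check could look roughly like the following. Treat it as a sketch: "./seqtest" is a hypothetical long-running non-MPI program, and the context file name assumes cr_checkpoint's default of writing context.<pid> into the current directory.

```shell
# Sketch only: "./seqtest" is a hypothetical long-running non-MPI program.
seq_app=./seqtest

if command -v cr_run >/dev/null 2>&1 && [ -x "$seq_app" ]; then
    cr_run "$seq_app" &            # start with BLCR's library preloaded
    pid=$!
    sleep 2                        # give the program time to start
    cr_checkpoint --term "$pid"    # write context.<pid>, then terminate it
    wait "$pid" 2>/dev/null || true
    cr_restart "context.$pid"      # resume from the saved context
else
    echo "BLCR tools or $seq_app not available; skipping"
fi
```

If this fails too, the problem is probably in the BLCR installation rather than in Open MPI.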

Additionally, have you tried Open MPI 1.3.3 or the trunk to see if the
problem happens there as well?

-- Josh

On Sep 9, 2009, at 1:49 PM, Jean Potsam wrote:

> Dear All,
> I have installed openmpi 1.3.2 in my home directory
> ( /home/jean/openmpisof/ ) and BLCR in /usr/local/blcr. I have added
> the following in the .bashrc file
>
> export PATH=/home/jean/openmpisof/bin/:$PATH
> export LD_LIBRARY_PATH=/home/jean/openmpisof/lib/:$LD_LIBRARY_PATH
>
> export PATH=/usr/local/blcr/bin/:$PATH
> export LD_LIBRARY_PATH=/usr/local/blcr/lib:$LD_LIBRARY_PATH
>
> I am running my application as follows:
>
> mpirun -am ft-enable-cr -mca btl ^openib -mca snapc_base_global_snapshot_dir /tmp mpitest
>
> But I get the following error when I try to checkpoint the
> application.
>
> ######################################
> [sun06:20513] *** Process received signal ***
> [sun06:20513] Signal: Segmentation fault (11)
> [sun06:20513] Signal code: Address not mapped (1)
> [sun06:20513] Failing at address: 0x4
> [sun06:20513] [ 0] [0xb7fab40c]
> [sun06:20513] [ 1] /lib/libc.so.6(cfree+0x3b) [0xb79e468b]
> [sun06:20513] [ 2] /usr/local/blcr/lib/libcr.so.0(cri_info_free+0x2a) [0xb7b1725a]
> [sun06:20513] [ 3] /usr/local/blcr/lib/libcr.so.0 [0xb7b18c72]
> [sun06:20513] [ 4] /lib/libc.so.6(__libc_fork+0x186) [0xb7a0d266]
> [sun06:20513] [ 5] /lib/libpthread.so.0(fork+0x14) [0xb7ac4b24]
> [sun06:20513] [ 6] /home/jean/openmpisof/lib/libopen-pal.so.0 [0xb7bc2a01]
> [sun06:20513] [ 7] /home/jean/openmpisof/lib/libopen-pal.so.0(opal_crs_blcr_checkpoint+0x187) [0xb7bc231b]
> [sun06:20513] [ 8] /home/jean/openmpisof/lib/libopen-pal.so.0(opal_cr_inc_core+0xc3) [0xb7b8eb1d]
> [sun06:20513] [ 9] /home/jean/openmpisof/lib/libopen-rte.so.0 [0xb7cab40f]
> [sun06:20513] [10] /home/jean/openmpisof/lib/libopen-pal.so.0(opal_cr_test_if_checkpoint_ready+0x129) [0xb7b8ea2a]
> [sun06:20513] [11] /home/jean/openmpisof/lib/libopen-pal.so.0 [0xb7b8f0f8]
> [sun06:20513] [12] /lib/libpthread.so.0 [0xb7abbf3b]
> [sun06:20513] [13] /lib/libc.so.6(clone+0x5e) [0xb7a42bee]
> [sun06:20513] *** End of error message ***
> #######################################
>
> Any help would be greatly appreciated.
>
> Regards,
>
> Jean