Not really - the person who wrote that code for his PhD thesis has since become a professor and rarely has time to respond on the mailing list or to maintain the code. So I'm afraid we don't have anyone who knows much about it anymore.

I plan to rework the checkpoint support in the coming months, but can't say when that will happen.

On Sep 21, 2013, at 7:51 AM, basma a.azeem <basmaabdelazeem@hotmail.com> wrote:

Any suggestions?



From: basmaabdelazeem@hotmail.com
To: users@open-mpi.org
Subject: FT problem
Date: Wed, 18 Sep 2013 16:42:29 +0200

I am using openmpi-1.6.1.
I need to try checkpoint/restart (self, blcr).
After I installed Open MPI, I had the following in my installation folder:

bin\ompi-checkpoint
bin\ompi-restart

lib\openmpi\mca_crs_self.la
lib\openmpi\mca_crs_self.so
lib\openmpi\mca_crs_blcr.la
lib\openmpi\mca_crs_blcr.so

However, ompi_info shows:

ompi_info | grep FT 
FT Checkpoint support: yes (checkpoint thread: yes)

ompi_info | grep crs
MCA crs: none (MCA v2.0, API v2.0, Component v1.6.1)
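
The check above can be scripted so that only the CRS component names are printed. This is a minimal sketch, assuming the `MCA crs: <name> (...)` line format shown above; the function name `crs_components` is hypothetical and not part of Open MPI:

```shell
# Hypothetical helper (not an Open MPI tool): read ompi_info output on stdin
# and print the name of every CRS component it lists, one per line.
crs_components() {
    # Assumed line format: "MCA crs: none (MCA v2.0, API v2.0, Component v1.6.1)"
    awk -F': *' '/MCA crs:/ { split($2, parts, " "); print parts[1] }'
}

# Usage (assumes the ompi_info from the same installation is on PATH):
#   ompi_info | crs_components
```

If the only name printed is `none`, then neither `blcr` nor `self` appeared in the `ompi_info` output, even though their `.so` files exist in the installation folder.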

When I try to checkpoint, it fails:

basma@basma-Satellite-A500:~$  /OpenMP/openmpi-1.6.1/builddir/bin/mpirun -np 3  -am ft-enable-cr  /home/basma/NPB3.3/NPB3.3/NPB3.3-OMP/bin/lu.A


 NAS Parallel Benchmarks (NPB3.3-OMP) - LU Benchmark

 Size:   64x  64x  64
 Iterations:                    250
 Number of available threads:     4

 NAS Parallel Benchmarks (NPB3.3-OMP) - LU Benchmark

 Size:   64x  64x  64
 Iterations:                    250
 Number of available threads:     4

 NAS Parallel Benchmarks (NPB3.3-OMP) - LU Benchmark

 Size:   64x  64x  64
 Iterations:                    250
 Number of available threads:     4

 Time step    1
 Time step    1
 Time step    1
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 2917 on node basma-Satellite-A500 exited on signal 10 (User defined signal 1).
--------------------------------------------------------------------------
basma@basma-Satellite-A500:~$ 

This happened when I ran this command from a second shell:
basma@basma-Satellite-A500:~$ /OpenMP/openmpi-1.6.1/builddir/bin/ompi-checkpoint 2916

What did I do wrong?

Thank you
_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users