Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] dumping checkpoint at customized locations
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2008-09-14 08:12:40


The document attached to the Open MPI Wiki discusses all of the MCA
parameters for checkpoint/restart.
    http://svn.open-mpi.org/trac/ompi/wiki/ProcessFT_CR

There are two ways to save checkpoint file data. I would suggest that
you set these parameters in your $HOME/.openmpi/mca-params.conf file
so you don't have to pass them to mpirun every time (assuming $HOME is
shared on all machines).

1) If you save to a globally shared directory (e.g., an NFS directory)
then you can set the following MCA parameter for mpirun to point to
this location. This overrides the default directory, which is $HOME.
   snapc_base_global_snapshot_dir=$HOME/my/ckpt/dir

2) You can save to the local disk and have Open MPI transfer the
files from local disk to stable storage in a two-step process. There
are three MCA parameters you will need to set for this.
Set the directory on the local disk where each checkpoint is first
saved:
   crs_base_snapshot_dir=/tmp
Set the global directory where all of the local checkpoints should be
saved:
   snapc_base_global_snapshot_dir=$HOME/my/ckpt/dir
Activate the two-step process:
   snapc_base_store_in_place=0
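
As a rough sketch, the three settings for the two-step approach could be
appended to the parameter file from a shell like this. CKPT_DIR and CONF
are illustrative variable names, and the path is just the example
directory used above. The script lets the shell expand $HOME so that the
file ends up with an absolute path, on the assumption that this is the
safest form to store.

```shell
#!/bin/sh
# Sketch: persist the two-step checkpoint settings from option 2.
# CKPT_DIR is an illustrative path; point it at your shared (e.g., NFS) directory.
CKPT_DIR="$HOME/my/ckpt/dir"
CONF="$HOME/.openmpi/mca-params.conf"

# Make sure both the checkpoint directory and the config directory exist.
mkdir -p "$CKPT_DIR" "$HOME/.openmpi"

# Unquoted EOF: the shell expands $CKPT_DIR, so the file holds an absolute path.
# ">>" appends; running this twice leaves duplicate entries in the file.
cat >> "$CONF" <<EOF
crs_base_snapshot_dir=/tmp
snapc_base_global_snapshot_dir=$CKPT_DIR
snapc_base_store_in_place=0
EOF

# Per-run alternative (not executed here): pass each parameter to mpirun
# with "-mca <name> <value>", e.g.
#   mpirun -mca snapc_base_global_snapshot_dir "$CKPT_DIR" ...
```

For option 1, only the snapc_base_global_snapshot_dir line is needed.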

The C/R User Document on the wiki covers many of these and other
parameters in more detail. I would encourage you to look through
there as well.

Best,
Josh

On Sep 13, 2008, at 7:49 PM, arun dhakne wrote:

> Hi,
>
> I have BLCR installed and I am able to dump checkpoints in $HOME
> using ompi-checkpoint. I was wondering whether there is some option
> that would let me dump the checkpoints at a customized location,
> say in /tmp?
>
> --
> Thanks and Regards,
> Arun