Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Bug in ompi-restart
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2009-05-21 09:45:45

(Moving to ompi-users so that others might benefit from the answer)

Currently you cannot pass the --prefix option to ompi-restart. In
fact, most of the mpirun options are not exposed by the ompi-restart
tool (exposing them all would be difficult to maintain). To assist
users who want to set additional options, I added the '--apponly'
option, which dumps an appfile that can be directly used by mpirun to
restart the application.

This appfile allows a user to modify the options to mpirun and to each
individual application process as desired. Once the appfile is set up,
the user can restart the application the same way ompi-restart does,
by calling:
   mpirun -am ft-enable-cr --app your_appfile
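
As a rough illustration of editing the appfile for the --prefix case, here is a minimal sketch. It assumes the appfile holds one application context per line, that lines starting with '#' are comments, and that mpirun honors a --prefix given on each appfile line; none of this is confirmed in this thread. The helper name add_prefix_to_appfile and its arguments are hypothetical.

```shell
# Hypothetical helper: print a copy of an appfile with "--prefix DIR"
# prepended to every application line.
# Assumption: one app context per line; '#' lines are comments.
add_prefix_to_appfile() {
    prefix="$1"     # Open MPI install directory on the remote nodes
    appfile="$2"    # appfile dumped by ompi-restart --apponly
    awk -v p="$prefix" '
        /^[ \t]*(#|$)/ { print; next }      # keep comments and blank lines as-is
        { print "--prefix " p " " $0 }      # prepend the option to app lines
    ' "$appfile"
}
```

The output goes to stdout, so it can be redirected to a new file and that file passed to mpirun with --app as in the command above. The original appfile is left untouched, which makes it easy to compare the two.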

Hope that helps,

On May 21, 2009, at 9:36 AM, Bouguerra mohamed slim wrote:

> Dear Josh
> Another problem with ompi-restart: how can I give ompi-restart the
> prefix option, as in mpirun -prefix ~/xxx/xxx/ompi-r21254/lam ?
> Note that on each node I have made sure that the LD_LIBRARY_PATH
> variable contains the right path to the ompi library.
> ompi-restart -hostfile hostfile_21_05 ompi_global_snapshot_4664.ckpt/
> bash: orted: command not found
> --------------------------------------------------------------------------
> A daemon (pid 4754) died unexpectedly with status 127 while attempting
> to launch so we are aborting.
> There may be more information reported by the environment (see above).
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> --
> Cordialement,
> Mohamed-Slim BOUGUERRA PhD student INRIA-Grenoble / Projet MOAIS
> ENSIMAG - antenne de Montbonnot
> ZIRST 51, avenue Jean Kuntzmann
> Tel :+33 (0)4 76 61 20 79
> Fax :+33 (0)4 76 61 20 99