Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How to build OMPI with Checkpoint/restart.
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2009-09-14 09:18:54


The config.log looked fine, so I think you have fixed the configure
problem that you previously posted about.

Although the config.log shows that the BLCR component is scheduled to
be compiled, ompi_info does not list it as available. I suspect the
error below occurs because the CRS framework could not find any CRS
components to select (though an error message saying so should have
been displayed).

I would check that the Open MPI installation you are running is the
one that you just configured and built. Specifically, I would make sure
that the following files exist in the installation location:
   $install_dir/lib/openmpi/mca_crs_blcr.so
   $install_dir/lib/openmpi/mca_crs_blcr.la

If that checks out, then I would remove the old installation directory
and try reinstalling fresh.
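
The file check described above can be sketched as a small shell function. This is only an illustration: the function name check_crs_blcr is hypothetical, and it assumes the install prefix is passed as the first argument. The two file names are the ones listed above.

```shell
# Hypothetical helper (not part of Open MPI): report whether the BLCR CRS
# component files are present under an installation prefix.
# Usage: check_crs_blcr /path/to/install_dir
check_crs_blcr() {
    install_dir=$1
    missing=0
    for f in mca_crs_blcr.so mca_crs_blcr.la; do
        if [ -f "$install_dir/lib/openmpi/$f" ]; then
            echo "found:   $install_dir/lib/openmpi/$f"
        else
            echo "missing: $install_dir/lib/openmpi/$f"
            missing=1
        fi
    done
    # Non-zero exit status if either file is absent.
    return "$missing"
}
```

If both files are reported as found, running ompi_info | grep crs with the same installation first in your PATH should then show whether the component is actually being picked up.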

Let me know how it goes.

-- Josh

On Sep 13, 2009, at 5:49 AM, Marcin Stolarek wrote:

> I've tried again. Here is what I get when trying to run
> using-1.4a1r21964 :
>
> (terminus:~) mstol% mpirun --am ft-enable-cr ./a.out
> --------------------------------------------------------------------------
> It looks like opal_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> opal_cr_init() failed failed
> --> Returned value -1 instead of OPAL_SUCCESS
> --------------------------------------------------------------------------
> [terminus:06120] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> ompi_mpi_init: orte_init failed
> --> Returned "Error" (-1) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [terminus:6120] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
>
> I've included config.log and ompi_info --all output in the attachment.
> LD_LIBRARY_PATH is set correctly.
> Any idea?
>
> marcin
>
> 2009/9/12 Marcin Stolarek <mstol_at_[hidden]>
> Hi,
> I'm trying to compile Open MPI with checkpoint/restart via BLCR.
> I'm not sure which path I should set as the value of the --with-blcr
> option. I'm using the 1.3.3 release; which version of BLCR should I use?
>
> I've compiled the newest version of BLCR with --prefix=$BLCR, and
> I've passed --with-blcr=$BLCR to the Open MPI configure, but
> I received:
>
>
> configure:76646: checking if MCA component crs:blcr can compile
> configure:76648: result: no
>
> marcin