Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How to build OMPI with Checkpoint/restart.
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2009-09-14 09:18:54


The config.log looked fine, so I think you have fixed the configure
problem that you previously posted about.

Though the config.log indicates that the BLCR component is scheduled
to be compiled, ompi_info does not indicate that it is available. I
suspect that the error below occurs because the CRS framework could not
find any CRS components to select (though an error message saying so
should have been displayed).

I would check your Open MPI installation to make sure that it is the
one that you configured. Specifically, I would check that the
installation location contains the following files:
   $install_dir/lib/openmpi/mca_crs_blcr.so
   $install_dir/lib/openmpi/mca_crs_blcr.la

If that checks out, then I would remove the old installation directory
and try reinstalling fresh.
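
The file check described above can be sketched as a small shell
function. This is only an illustration, assuming a POSIX shell: the
function name check_blcr_component is made up for this example, and
the directory argument stands for whatever prefix the installation
actually used.

```shell
# Hypothetical helper (not part of Open MPI): report whether the two
# BLCR CRS component files exist under <install_dir>/lib/openmpi.
# Returns 0 if both are present, 1 if either is missing.
check_blcr_component() {
    dir="$1/lib/openmpi"
    missing=0
    for f in mca_crs_blcr.so mca_crs_blcr.la; do
        if [ -f "$dir/$f" ]; then
            echo "found:   $dir/$f"
        else
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_blcr_component "$install_dir" prints one line per file.
If both are found, running "which mpirun ompi_info" is a quick way to
confirm that the binaries on your PATH come from that same directory,
and "ompi_info | grep crs" should then list the available CRS
components.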

Let me know how it goes.

-- Josh

On Sep 13, 2009, at 5:49 AM, Marcin Stolarek wrote:

> I've tried again. Here is what I get when trying to run
> using 1.4a1r21964:
>
> (terminus:~) mstol% mpirun --am ft-enable-cr ./a.out
> --------------------------------------------------------------------------
> It looks like opal_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> opal_cr_init() failed failed
> --> Returned value -1 instead of OPAL_SUCCESS
> --------------------------------------------------------------------------
> [terminus:06120] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> ompi_mpi_init: orte_init failed
> --> Returned "Error" (-1) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [terminus:6120] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
>
> I've included config.log and the ompi_info --all output as an attachment.
> LD_LIBRARY_PATH is set correctly.
> Any ideas?
>
> marcin
>
>
>
>
>
> 2009/9/12 Marcin Stolarek <mstol_at_[hidden]>
> Hi,
> I'm trying to compile Open MPI with checkpoint/restart via BLCR.
> I'm not sure which path I should set as the value of the --with-blcr option.
> I'm using the 1.3.3 release; which version of BLCR should I use?
>
> I've compiled the newest version of BLCR with --prefix=$BLCR, and
> I've passed --with-blcr=$BLCR as an option to the Open MPI configure, but
> I received:
>
>
> configure:76646: checking if MCA component crs:blcr can compile
> configure:76648: result: no
>
> marcin
>
>
>
>
>
> <info.tar.gz>