Open MPI User's Mailing List Archives

Subject: [OMPI users] OpenMPI checkpoint/restart
From: Sai Sudheesh (saisudheesh_at_[hidden])
Date: 2010-11-04 05:39:58


Hi All,

I am experimenting with the Open MPI checkpoint/restart mechanism.
I have installed blcr-0.8.2 under /usr on a Red Hat Linux system (RHEL 5,
kernel 2.6.18-164.el5).
I unpacked openmpi-1.4.2 in /usr and installed it.

While configuring Open MPI I used the following commands:
#./configure --with-blcr=cr --enable-ft-thread --enable-mpi-threads --with-blcr=/usr/local --with-blcr-libdir=/usr/local/lib
#make
#make install
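
One way to check whether the resulting build actually reports checkpoint/restart support is to look at the output of ompi_info. The following is a minimal sketch, assuming that the ompi_info found in PATH belongs to the install built above; the OMPI_INFO variable and the check_cr_support function are names made up for this example and are not part of Open MPI.

```shell
#!/bin/sh
# Sketch: report whether an Open MPI install lists checkpoint/restart support.
# OMPI_INFO is a hypothetical override for this example; it defaults to the
# ompi_info found in PATH, which is assumed to belong to the build above.
OMPI_INFO=${OMPI_INFO:-ompi_info}

check_cr_support() {
    if ! command -v "$OMPI_INFO" >/dev/null 2>&1; then
        echo "$OMPI_INFO not found in PATH"
        return 2
    fi
    # Print lines mentioning checkpointing or the crs (checkpoint/restart
    # service) framework; grep returns non-zero when nothing matches.
    if "$OMPI_INFO" | grep -i -E 'checkpoint|crs'; then
        return 0
    fi
    echo "no checkpoint/restart support reported"
    return 1
}

check_cr_support || echo "check failed with status $?"
```

If nothing checkpoint-related is listed, that would suggest the build was configured without checkpoint/restart support, or that a different Open MPI install is being picked up from PATH.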

After this I tried to run my application using:
#mpirun -np 4 -am ft-enable-cr a.out

Then I got the following error:
---------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can fail
during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here is some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
[localhost.localdomain:28655] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file
runtime/orte_init.c at line 77

*** The MPI_Init() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[localhost.localdomain] Abort before MPI_INIT completed successfully; not
able to guarantee that all other processes were killed!

-------------------------------------------------------------------------

The first section of the error message was repeated many times.

What might have gone wrong here?
How can I resolve this?

Regards,
Sai Sudheesh