Dear All,
I have installed and configured Open MPI with BLCR on my laptop:
1) Configure and install BLCR
./configure --prefix=/usr/local/ --enable-debug=yes --enable-libcr-tracing=yes --enable-kernel-tracing=yes --enable-testsuite=yes --enable-all-static=yes --enable-static=yes
make
make install
2) Configure and install Open MPI
./configure --prefix=/usr/local/ --enable-picky --enable-debug --enable-mpi-profile --enable-mpi-cxx --enable-pretty-print-stacktrace --enable-binaries --enable-trace --enable-static=yes --enable-debug --with-devel-headers=1 --with-mpi-param-check=always --with-ft=cr --enable-ft-thread --with-blcr=/usr/local/ --with-blcr-libdir=/usr/local/lib --enable-mpi-threads=yes
make all install
3) Add the environment variables.
I opened $HOME/.bashrc and added the following:
PATH="/usr/local/include:$PATH"
LD_LIBRARY_PATH="/usr/local/lib:$LD_LIBRARY_PATH"
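As a sanity check on step 3, a minimal shell sketch like the one below could confirm that both directories actually end up in the variables. The helper name path_contains is hypothetical and not part of any tool, and the export keyword is an addition to the two lines above, on the assumption that the variables need to reach child processes such as mpirun.

```shell
# Hypothetical helper: succeeds if directory $2 is an entry of the
# colon-separated list $1 (for example, the value of PATH).
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Same values as in the .bashrc lines above; 'export' (an assumption,
# not in the original lines) passes them on to child processes.
export PATH="/usr/local/include:$PATH"
export LD_LIBRARY_PATH="/usr/local/lib:$LD_LIBRARY_PATH"

# Report which directories are present in each variable.
path_contains "$PATH" /usr/local/include && echo "/usr/local/include is on PATH"
path_contains "$LD_LIBRARY_PATH" /usr/local/lib && echo "/usr/local/lib is on LD_LIBRARY_PATH"
```

Wrapping the list in colons before matching avoids false positives on partial names, so /usr/local/lib does not match /usr/local/lib64.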
Now the problem:
I am trying to checkpoint the following MPI application:
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int node;  /* rank of this process in MPI_COMM_WORLD */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);
    printf("Hello World from Node %d\n", node);
    MPI_Finalize();
    return 0;
}
I am running mpirun as follows:
raj-laptop> mpirun -am ft-enable-cr helloworld
The errors are as follows:
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
opal_cr_init() failed failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[raj-laptop:9439] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[raj-laptop:09439] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 77
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_mpi_init: orte_init failed
--> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
Is this related to running on a single node (i.e. my laptop), or is it a problem with the configuration or the libraries?
Any help would be much appreciated.
Regards,
Raj