I finally figured out the answer. I just added the parameter "-machinefile
host" to the "ompi-restart" command and it restarted correctly. So is OpenMPI
unable to restart a multi-threaded application on 1 node?
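
For anyone hitting the same problem, here is a minimal sketch of the restart command that worked. The PID 12345 and the snapshot handle name "ompi_global_snapshot_12345.ckpt" are placeholders assumed for illustration; use whatever handle ompi-checkpoint reported on your system. The script only prints the command so it can be checked before running it by hand.

```shell
#!/bin/sh
# Sketch only: prints the restart command instead of executing it,
# so it can be inspected without Open MPI being installed.

HOSTFILE=host                                  # same machinefile that was given to mpirun
MPIRUN_PID=12345                               # hypothetical PID, for illustration only
SNAPSHOT="ompi_global_snapshot_${MPIRUN_PID}.ckpt"  # assumed snapshot handle name

restart_cmd() {
    # $1 = machinefile, $2 = global snapshot handle
    printf 'ompi-restart -machinefile %s %s\n' "$1" "$2"
}

restart_cmd "$HOSTFILE" "$SNAPSHOT"
```

Passing the same machinefile to "ompi-restart" that was passed to "mpirun" is the only change from the failing attempt.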
On Tue, Jun 8, 2010 at 12:07 AM, Nguyen Toan <nguyentoan1508_at_[hidden]> wrote:
> Sorry, I just want to add 2 more things:
> + I tried configuring with and without --enable-ft-thread, but nothing changed
> + I also applied this patch for OpenMPI here and reinstalled, but I got the
> same error
> Can somebody help? Thank you very much.
> Nguyen Toan
> On Mon, Jun 7, 2010 at 11:51 PM, Nguyen Toan <nguyentoan1508_at_[hidden]> wrote:
>> Hello everyone,
>> I'm using OpenMPI 1.4.2 with BLCR 0.8.2 to test checkpointing on 2 nodes
>> but it failed to restart (Segmentation fault).
>> Here are the details concerning my problem:
>> + OS: CentOS 5.4
>> + OpenMPI configure:
>> ./configure --with-ft=cr --enable-ft-thread --enable-mpi-threads \
>> --with-blcr-libdir=/home/nguyen/opt/blcr/lib \
>> --prefix=/home/nguyen/opt/openmpi
>> + mpirun -am ft-enable-cr -machinefile host ./test
>> I checkpointed the test program using "ompi-checkpoint -v -s PID" and the
>> checkpoint file was created successfully. However, it failed to restart using
>> "ompi-restart", with this error:
>> "mpirun noticed that process rank 0 with PID 21242 on node rc014.local
>> exited on signal 11 (Segmentation fault)"
>> Did I miss something in the installation of OpenMPI?
>> Nguyen Toan