Hi
Please help.
I have installed openmpi-1.6 and tested the installation with
different MPI applications, and my application executed successfully.
Whenever I run NPB-3.3 LU without checkpointing, it completes
successfully.
However, whenever I checkpoint the application, it aborts without
checkpointing, with the following error:
"mpirun noticed that process rank 1 with PID 1048 on node node1 exited on
signal 10 (User defined signal 1).
--------------------------------------------------------------------------
2 total processes killed (some possibly by mpirun during cleanup)"
By contrast, when I ran HPL and checkpointed it, both the checkpoint
and the application completed successfully.
I have tried to resolve this without success.
Please, I need assistance; I am a new user of Open MPI.
Regards,
Ifeanyi