I ran into errors when using Open MPI's checkpoint restart functionality.
After debugging the application (ompi-restart), I found that
a few variables overflow when an MPI application runs with more than 128
processes. I identified the places that cause the
overflow and changed the definitions of the variables concerned.
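For illustration only: a limit at 128 processes would be consistent with a count being stored in a signed 8-bit type, whose maximum value is 127. That is an assumption made for this sketch, not a statement about the actual Open MPI code; the helper names below are hypothetical, and the real variables and types are described in the attached report and patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A signed 8-bit field can hold at most 127, so 128 is the first value that no longer fits. */
static_assert(INT8_MAX == 127, "int8_t tops out at 127");

/* Hypothetical helper: true if a process count fits in a signed 8-bit field. */
static bool fits_in_int8(int32_t count)
{
    return count >= INT8_MIN && count <= INT8_MAX;
}

/*
 * Hypothetical helper: the value a signed 8-bit field would hold after a
 * two's-complement wraparound. The low 8 bits are kept and sign-extended by
 * hand, so the result does not rely on an implementation-defined conversion.
 */
static int32_t wrapped_int8(int32_t count)
{
    int32_t low = count & 0xFF;
    return low >= 128 ? low - 256 : low;
}
```

Under this assumption, fits_in_int8(127) is true and fits_in_int8(128) is false, and wrapped_int8(128) yields -128, which is the kind of value that would break any loop or index driven by the count. Widening the field to a larger integer type removes the limit.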
I have attached a detailed bug report to this mail describing the error
scenario and the changes that I feel should be made.
The patch files for the 2 files that need to be changed are also attached.
I request the community to review the changes and incorporate them in the
code in an appropriate way.
Please let me know if more information is needed about this.