I don't know what else I should try, because it worked on 1.3.3 following exactly the same steps. I tried installing it both with an active eth interface and with an inactive one. I am running on a virtual machine with CentOS as the OS.
I tried the new version that was uploaded today. I still get that error, except that it is now at line 405 instead of 399.
Maybe it helps if I give more details:
- I first had OpenMPI version 1.3.3 with BLCR installed: mpirun, ompi-checkpoint and ompi-restart worked with that version.
- I wanted to update to version 1.4.1, so I uninstalled the previous version with make uninstall and then manually deleted all the leftover files. The installation directory was /usr/local.
- I installed 1.4.1 in the same directory, /usr/local. The paths are set correctly to /usr/local/bin and /usr/local/lib.
- mpirun works, ompi-checkpoint gives the following error:
[[35906,0],0] ORTE_ERROR_LOG: Not found in file orte-checkpoint.c at line 405
HNP with PID 7899 Not found!
I would appreciate any help,
Andreea

On Fri, Jan 15, 2010 at 1:15 PM, Andreea Costea <email@example.com> wrote:
Still not working. Although I uninstalled OpenMPI with make uninstall and manually deleted all the other files, I still get the same error when checkpointing.
Andreea

On Thu, Jan 14, 2010 at 10:38 PM, Joshua Hursey <firstname.lastname@example.org> wrote:
On Jan 14, 2010, at 8:20 AM, Andreea Costea wrote:
This looks like an error coming from the 1.3.3 install. In 1.4.1 there is no error at line 399; in 1.3.3 there is. Check your installation of Open MPI. I bet you are mixing 1.4.1 and 1.3.3, which can cause unexpected problems.
> I wanted to try the C/R feature in OpenMPI version 1.4.1, which I downloaded today. When I try to checkpoint, I get the following error message:
> [[65192,0],0] ORTE_ERROR_LOG: Not found in file orte-checkpoint.c at line 399
> HNP with PID 2337 Not found!
Try a clean installation of 1.4.1 and double check that 1.3.3 is not in your path/lib_path any longer.
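One way to do that double check is to list every copy of the Open MPI tools and libraries that the environment can see. The following is only a minimal sketch under some assumptions: the helper names (find_all_on_path, find_libs_on_path) are made up for illustration, the library file names are assumed to start with "libmpi", and the library search path is assumed to come from LD_LIBRARY_PATH. More than one hit for the same tool would suggest that two installs are being mixed.

```python
import os


def find_all_on_path(name, search_path):
    """Return every file called `name` found in the directories of search_path."""
    hits = []
    seen = set()
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        real = os.path.realpath(directory)
        if real in seen:  # skip directories listed more than once
            continue
        seen.add(real)
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


def find_libs_on_path(prefix, search_path):
    """Return every file whose name starts with `prefix` in search_path."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isdir(directory):
            for entry in sorted(os.listdir(directory)):
                if entry.startswith(prefix):
                    hits.append(os.path.join(directory, entry))
    return hits


if __name__ == "__main__":
    path = os.environ.get("PATH", "")
    lib_path = os.environ.get("LD_LIBRARY_PATH", "")
    # Tool names taken from the thread; add others as needed.
    for tool in ("mpirun", "ompi-checkpoint", "ompi-restart"):
        copies = find_all_on_path(tool, path)
        print(tool, "->", copies if copies else "not found")
    # "libmpi" is an assumed prefix for the Open MPI shared library.
    print("libraries ->", find_libs_on_path("libmpi", lib_path))
```

If a tool shows up in more than one directory, the first entry is the one the shell will run, so that is the one whose version should be confirmed.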
> I tried the same thing with version 1.3.3 and it works perfectly.
> Any idea why?
> users mailing list