I don't know what else to try, because it worked on 1.3.3 with exactly the same steps. I tried installing it with both an active and an inactive eth interface. I am running in a virtual machine with CentOS as the OS.

Any suggestions?

Thanks,
Andreea

On Fri, Jan 15, 2010 at 9:07 PM, Andreea Costea <andre.costea@gmail.com> wrote:
I tried the new version that was uploaded today. I still get that error, except that it is now at line 405 instead of 399.

Maybe more details will help:
- I first had OpenMPI version 1.3.3 with BLCR installed: mpirun, ompi-checkpoint and ompi-restart worked with that version.
- I wanted to update to version 1.4.1, so I uninstalled the previous version with make uninstall and then manually deleted all the leftover files. The install directory was /usr/local.
- I installed 1.4.1 in the same directory, /usr/local. The paths are set correctly to /usr/local/bin and /usr/local/lib.
- mpirun works, but ompi-checkpoint gives the following error:
[[35906,0],0] ORTE_ERROR_LOG: Not found in file orte-checkpoint.c at line 405
HNP with PID 7899 Not found!
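
One way to check whether anything from the old install survived make uninstall is to walk the install prefix and list files whose names look like Open MPI pieces. The following is only a minimal sketch: the helper find_leftovers and the PATTERNS list are hypothetical, and the patterns are illustrative guesses at typical names, not a complete list of what Open MPI installs.

```python
import fnmatch
import os

# Illustrative name patterns only (an assumption, not an official manifest).
PATTERNS = ["libmpi*", "libopen-rte*", "libopen-pal*", "mca_*",
            "openmpi", "ompi*", "orte*", "mpirun", "mpiexec"]


def find_leftovers(prefix, patterns=PATTERNS):
    """Return a sorted list of paths under prefix whose names match a pattern."""
    matches = []
    for root, dirs, files in os.walk(prefix):
        for name in dirs + files:
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    # /usr/local is the prefix mentioned in this thread.
    for path in find_leftovers("/usr/local"):
        print(path)
```

Run right after make uninstall and before installing the new version, an empty listing would suggest the prefix is clean; anything printed is worth inspecting by hand before deleting.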

I would appreciate any help,
Andreea



On Fri, Jan 15, 2010 at 1:15 PM, Andreea Costea <andre.costea@gmail.com> wrote:
Hi...
still not working. Although I uninstalled OpenMPI with make uninstall and manually deleted all other files, I still get the same error when checkpointing.

Any idea?

Thanks,
Andreea



On Thu, Jan 14, 2010 at 10:38 PM, Joshua Hursey <jjhursey@open-mpi.org> wrote:
On Jan 14, 2010, at 8:20 AM, Andreea Costea wrote:

> Hi,
>
> I wanted to try the C/R feature in OpenMPI version 1.4.1, which I downloaded today. When I try to checkpoint, I get the following error message:
> [[65192,0],0] ORTE_ERROR_LOG: Not found in file orte-checkpoint.c at line 399
> HNP with PID 2337 Not found!

This looks like an error coming from the 1.3.3 install: in 1.3.3 there is an error at line 399, but in 1.4.1 there is not. Check your installation of Open MPI; I bet you are mixing 1.4.1 and 1.3.3, which can cause unexpected problems.

Try a clean installation of 1.4.1 and double check that 1.3.3 is not in your path/lib_path any longer.
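
A quick way to do that double check is to list every copy of the Open MPI tools that the shell could pick up. This is a rough sketch, not an official tool: the helper name find_on_path is made up for illustration, and it only inspects the directories in PATH (the same function can be pointed at a library path by passing it explicitly).

```python
import os


def find_on_path(names, path=None):
    """Return {name: [every matching executable, in search order]}."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = {name: [] for name in names}
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                hits[name].append(candidate)
    return hits


if __name__ == "__main__":
    tools = ["mpirun", "ompi-checkpoint", "ompi-restart"]
    for tool, copies in find_on_path(tools).items():
        print(tool, "->", copies if copies else "not found")
        if len(copies) > 1:
            print("  more than one copy on PATH; the first one listed is used")
```

If a tool shows up in more than one directory, or in a directory other than the one 1.4.1 was installed into, that points to a mixed installation.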

-- Josh

>
> I tried the same thing with version 1.3.3 and it works perfectly.
>
> Any idea why?
>
> thanks,
> Andreea
> _______________________________________________
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

