Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Checkpoint problem
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-08-20 08:39:48


There was a bug that caused ompi-checkpoint not to find the correct
place in the session directory for mpirun's contact file. This was
fixed in r19265, so you should no longer have a problem.

On Aug 20, 2008, at 2:11 AM, Matthias Hovestadt wrote:

> Hi Gabriele!
>
>> In this case, mpirun works well, but the checkpoint procedure fails:
>> ompi-checkpoint 20109
>> [node0316:20134] Error: Unable to get the current working directory
>> [node0316:20134] [[42404,0],0] ORTE_ERROR_LOG: Not found in file
>> orte-checkpoint.c at line 395
>> [node0316:20134] HNP with PID 20109 Not found!
>
> I had exactly the same problem on my machine. Neither modifying
> the configure parameters nor changing the way I invoked the
> ompi-checkpoint command helped. Since I am using the source from a
> Subversion checkout, I also updated the source several times,
> following the day-to-day progress. However, the problem remained.
>
> Luckily, updating the source to SVN revision 19265 finally solved
> this checkpointing issue. Maybe the problem shows up again in later
> versions...
>
>
> Best,
> Matthias