Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Checkpointing automatically at regular intervals
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2009-07-02 08:53:36


I created a feature ticket for this if you wanted to track it:
   https://svn.open-mpi.org/trac/ompi/ticket/1961

I do not know when I will have time to look at implementing this (of
course, patches from the community are always welcome), but hopefully
in the next couple of months.

Cheers,
Josh

On Jun 30, 2009, at 11:37 AM, Kritiraj Sajadah wrote:

>
> Dear Josh,
> I am sure this would be useful: someone using Open MPI to
> checkpoint an application will not want to sit and checkpoint it
> manually, which can be a real pain for a long-running
> application.
>
> I would imagine an automatic restart from the last checkpoint in
> case of failure would also be interesting.
>
> Many thanks.
>
> Regards,
>
> Kritiraj
>
> --- On Tue, 6/30/09, Josh Hursey <jjhursey_at_[hidden]> wrote:
>
>> From: Josh Hursey <jjhursey_at_[hidden]>
>> Subject: Re: [OMPI users] Checkpointing automatically at regular
>> intervals
>> To: "Open MPI Users" <users_at_[hidden]>
>> Date: Tuesday, June 30, 2009, 3:00 PM
>> Currently, there is no mechanism to
>> checkpoint every X minutes in Open MPI.
>>
>> As mentioned below you can use a script to initiate the
>> checkpoint every X minutes. Alternatively it should not be
>> too difficult to add such a feature to Open MPI. If enough
>> people would be interested I can file a feature bug to add
>> such a feature in a future release.
>>
>> Josh
>>
>> On Jun 30, 2009, at 9:34 AM, Mohamed Slim bouguerra wrote:
>>
>>> Hi,
>>> I think that you can write a simple script such as:
>>>
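>>> # Checkpoint for as long as an mpirun process exists.
>>> while pgrep mpirun > /dev/null; do
>>>     ompi-checkpoint "$(pidof -s mpirun)"
>>>     sleep 5    # seconds; use 300 for a 5-minute interval
>>> done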
>>>
>>> On 30 June 09 at 14:29, Kritiraj Sajadah wrote:
>>>
>>>>
>>>> Dear All,
>>>> I can manually checkpoint an MPI application using Open MPI and BLCR.
>>>> However, I now want to checkpoint my application automatically
>>>> every 5 minutes. Is there a way in Open MPI to ensure automatic
>>>> checkpointing without user intervention while the application
>>>> is running?
>>>>
>>>> Thank you
>>>>
>>>> Regards,
>>>> Kritiraj
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
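
The script-based workaround discussed in this thread could be made a little more defensive along the following lines. This is only a sketch under some assumptions: the function name periodic_checkpoint and the CHECKPOINT_CMD variable are made up for illustration and are not part of Open MPI, and it assumes that ompi-checkpoint is on the PATH and accepts the PID of mpirun, as in the loop quoted above. It tracks one specific mpirun PID rather than whatever pgrep happens to match, and stops once that process is gone.

```shell
#!/bin/sh
# Sketch: checkpoint a running mpirun at a fixed interval until it exits.
# periodic_checkpoint and CHECKPOINT_CMD are illustrative names only;
# they are not Open MPI settings.

# Usage: periodic_checkpoint PID [INTERVAL_SECONDS]
# Runs "$CHECKPOINT_CMD PID" every INTERVAL_SECONDS while PID is alive.
periodic_checkpoint() {
    pid=$1
    interval=${2:-300}                       # default: 300 s = 5 minutes
    cmd=${CHECKPOINT_CMD:-ompi-checkpoint}   # assumed to take mpirun's PID

    # kill -0 delivers no signal; it only tests that the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        sleep "$interval"
        # Re-check after sleeping so a finished job is not checkpointed.
        kill -0 "$pid" 2>/dev/null || break
        $cmd "$pid" || echo "checkpoint of $pid failed" >&2
    done
}
```

After sourcing the file, something like `periodic_checkpoint "$(pgrep -o -x mpirun)" 300` would checkpoint the oldest process named exactly mpirun every 5 minutes. Sleeping before the first checkpoint avoids snapshotting a job that has only just started.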