Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Hibernating/Wakeup MPI processes
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2010-04-13 10:55:43


What you are looking for is checkpoint/restart support. Some details
are at the link below:
   http://osl.iu.edu/research/ft/ompi-cr/

Additionally, we relatively recently added the ability to checkpoint
and 'stop' the application. This generates a usable checkpoint of the
application and then sends SIGSTOP to its processes. The processes can
be resumed with SIGCONT, but they could also be killed (or otherwise
removed from the system) and later restarted from the checkpoint. Some
details on this feature are at the link below:
   http://osl.iu.edu/research/ft/ompi-cr/examples.php#uc-ckpt-stop
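
To make the stop/continue half of that concrete, here is a minimal
sketch of what SIGSTOP and SIGCONT do to a process. It is not Open MPI
code and takes no checkpoint: it assumes a Linux system with Python 3,
and the function name stop_and_continue and the sleeping child used as
a stand-in workload are hypothetical.

```python
import os
import signal
import subprocess
import sys
import time


def stop_and_continue(pause_seconds: float = 0.2) -> list[str]:
    """Suspend a child process with SIGSTOP, then resume it with SIGCONT.

    Returns the state changes observed via waitpid, in order.
    """
    # Hypothetical stand-in for an application process: a child that sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    events = []
    try:
        # SIGSTOP cannot be caught or ignored; the kernel suspends the child.
        os.kill(child.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid report a stopped (not exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        if os.WIFSTOPPED(status):
            events.append("stopped")

        # While stopped, the process keeps its memory but uses no CPU time.
        time.sleep(pause_seconds)

        # SIGCONT resumes the child exactly where it was suspended.
        os.kill(child.pid, signal.SIGCONT)
        # WCONTINUED makes waitpid report the resume event.
        _, status = os.waitpid(child.pid, os.WCONTINUED)
        if os.WIFCONTINUED(status):
            events.append("continued")
    finally:
        # Clean up the stand-in workload.
        child.kill()
        child.wait()
    return events


if __name__ == "__main__":
    print(stop_and_continue())
```

A stopped process still holds its memory, which is why the checkpoint
matters: only with a checkpoint on disk can the processes be killed and
restarted later instead of merely resumed.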

-- Josh

On Apr 13, 2010, at 10:28 AM, Ralph Castain wrote:

> I believe that is called "checkpoint/restart" - see the FAQ page on
> that subject.
>
> On Apr 13, 2010, at 7:30 AM, Hoelzlwimmer Andreas - S0810595005 wrote:
>
>> Hi,
>>
>> I found in the FAQ that it is possible to suspend/resume MPI jobs.
>> Would it also be possible to hibernate the jobs (free the memory,
>> serialize it to the hard drive) and continue/wake them up later,
>> possibly at different locations?
>>
>> cheers,
>> Andreas
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users