There's also kernel-level vs. user-level checkpointing:
if you can checkpoint an MPI task and restart it on a new node, then
that is also a form of "process migration".
Of course, doing a checkpoint & restart can be slower than pure
in-kernel process migration, but the advantage is that you don't need
any kernel support, and can in fact do all of it in user-space.
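
To make the user-space idea concrete, here is a minimal sketch of application-level checkpoint & restart in Python. It is only an illustration under stated assumptions: the function name `run`, the `stop_after` parameter, and the pickle file layout are all made up for this example, and this is not how Open MPI or any particular checkpointing library implements it. A real MPI job would also have to deal with in-flight messages and transport reconnection.

```python
import os
import pickle


def run(total_steps, ckpt_path, stop_after=None):
    """Sum 0..total_steps-1, checkpointing after every step.

    stop_after is a hypothetical knob that simulates the process
    being stopped on the old node before it finishes.
    """
    # Restart path: if a checkpoint exists, resume from it.
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)
    else:
        state = {"step": 0, "acc": 0}

    while state["step"] < total_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return None  # simulated stop / migration point

        # One unit of "work".
        state["acc"] += state["step"]
        state["step"] += 1

        # Write to a temporary file and rename it, so a crash in the
        # middle of the write never leaves a half-written checkpoint.
        tmp_path = ckpt_path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp_path, ckpt_path)

    return state["acc"]
```

If the checkpoint file lives on a filesystem that both nodes can see, calling `run` again with the same `ckpt_path` on the new node picks up where the old one stopped, with no kernel support involved.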
On Thu, Aug 25, 2011 at 10:26 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> It also depends on what part of migration interests you - are you wanting to look at the MPI part of the problem (reconnecting MPI transports, ensuring messages are not lost, etc.) or the RTE part of the problem (where to restart processes, detecting failures, etc.)?
> On Aug 24, 2011, at 7:04 AM, Jeff Squyres wrote:
>> Be aware that process migration is a pretty complex issue.
>> Josh is probably the best one to answer your question directly, but he's out today.
>> On Aug 24, 2011, at 5:45 AM, srinivas kundaram wrote:
>>> I am a final-year grad student looking for a final-year project in OpenMPI. We are a group of 4 students.
>>> I wanted to know about "process migration" of MPI processes in OpenMPI.
>>> Can anyone suggest any ideas for a project related to process migration in OpenMPI, or other topics in systems?
>>> Srinivas Kundaram
>>> users mailing list
>> Jeff Squyres
Open Grid Scheduler - The Official Open Source Grid Engine