Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Related to project ideas in OpenMPI
From: Rayson Ho (raysonlogin_at_[hidden])
Date: 2011-08-25 12:06:11


I don't know which SSI project you are referring to. I only know the
OpenSSI project, and I was one of the first to subscribe to its
mailing list (I have been on it since 2001).

http://openssi.org/cgi-bin/view?page=openssi.html

I don't think OpenSSI clusters are designed for tens of thousands of
nodes, and I'm not sure they scale well to even a thousand nodes, so
IMO they have limited use for HPC clusters.

Rayson

On Thu, Aug 25, 2011 at 11:45 AM, Durga Choudhury <dpchoudh_at_[hidden]> wrote:
> Also, in 2005 there was an attempt to add SSI (Single System
> Image) functionality to the then-current 2.6.10 kernel. The proposal
> was very detailed and covered most of the bases (task creation, PID
> allocation, etc.) across a loosely coupled cluster (without using fancy
> hardware such as an RDMA fabric). Does anybody know if it was ever
> implemented? Any pointers in this direction?
>
> Thanks and regards
> Durga
>
>
> On Thu, Aug 25, 2011 at 11:08 AM, Rayson Ho <raysonlogin_at_[hidden]> wrote:
>> Srinivas,
>>
>> There's also Kernel-Level Checkpointing vs. User-Level Checkpointing -
>> if you can checkpoint an MPI task and restart it on a new node, then
>> this is also "process migration".
>>
>> Of course, doing a checkpoint & restart can be slower than pure
>> in-kernel process migration, but the advantage is that you don't need
>> any kernel support, and can in fact do all of it in user-space.
>>
>> Rayson
>>
>>
>> On Thu, Aug 25, 2011 at 10:26 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>>> It also depends on what part of migration interests you - are you wanting to look at the MPI part of the problem (reconnecting MPI transports, ensuring messages are not lost, etc.) or the RTE part of the problem (where to restart processes, detecting failures, etc.)?
>>>
>>>
>>> On Aug 24, 2011, at 7:04 AM, Jeff Squyres wrote:
>>>
>>>> Be aware that process migration is a pretty complex issue.
>>>>
>>>> Josh is probably the best one to answer your question directly, but he's out today.
>>>>
>>>>
>>>> On Aug 24, 2011, at 5:45 AM, srinivas kundaram wrote:
>>>>
>>>>> I am a final-year grad student looking for a final-year project in OpenMPI. We are a group of 4 students.
>>>>> I wanted to know about "Process Migration" of MPI processes in OpenMPI.
>>>>> Can anyone suggest any ideas for a project related to process migration in OpenMPI, or other topics in systems?
>>>>>
>>>>>
>>>>>
>>>>> regards,
>>>>> Srinivas Kundaram
>>>>> srinu1034_at_[hidden]
>>>>> +91-8149399160
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>>
>>>> --
>>>> Jeff Squyres
>>>> jsquyres_at_[hidden]
>>>> For corporate legal information go to:
>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> Rayson
>>
>> ==================================================
>> Open Grid Scheduler - The Official Open Source Grid Engine
>> http://gridscheduler.sourceforge.net/
>>
>

-- 
Rayson
==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/