OpenRCM Devel Mailing List Archives

Subject: [orcm-devel] query
From: sanjay kumar jaiswal (hitler0007_at_[hidden])
Date: 2010-02-10 05:39:49


Hi,

I am currently working with Open MPI (OMPI) on fault tolerance for
parallel computing in distributed environments where the
infrastructure is not fixed. After a task has been assigned to n
nodes, if a link goes down or a process fails, the whole job either
hangs or is rolled back. I want to handle this situation in the
following ways:
1- Detect whether a process has failed or a link has gone down.
2- After detecting a fault, reassign the task to another available
node.

As far as I know, handling this type of error is not currently
possible in OMPI. I would like to know which problems the ORCM project
is going to solve, and approximately when the first version will be
released.

Thanks,
Sanjay Jaiswal