Thanks, Durga, for your reply.
Jeff, you once wrote a Mandelbrot set program to demonstrate fault tolerance
in LAM/MPI, i.e. killing any slave process does not
affect the others. That is exactly the behaviour I am looking for in Open MPI.
I attempted it, but had no luck. Could you please tell me how to write such
programs in Open MPI?
Thanks in advance.
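
To make the behaviour being asked about concrete, here is a minimal sketch of
the pattern: a master hands out rows of work, each slave sits behind its own
private channel, and the death of one slave only causes its work to be handed
to the survivors. This is NOT MPI code and is not the LAM/MPI example; it is a
plain Python stand-in that uses operating-system pipes, and the names
WORKER_SRC, start_worker and run_master are hypothetical. In MPI terms the
rough analogue would be one communicator per slave plus the MPI_ERRORS_RETURN
error handler, but whether a given Open MPI release survives a killed process
that way is not something this sketch establishes.

```python
import subprocess
import sys

# Hypothetical worker script: reads one row index per line and replies with
# "row count", where count is a toy Mandelbrot escape-time value.
WORKER_SRC = r"""
import sys
for line in sys.stdin:
    row = int(line)
    c = complex(-0.5, (row - 4) / 4.0)
    z, n = 0j, 0
    while abs(z) <= 2 and n < 50:
        z, n = z * z + c, n + 1
    print(row, n, flush=True)
"""


def start_worker():
    """Start one worker with its own private pair of pipes."""
    return subprocess.Popen(
        [sys.executable, "-c", WORKER_SRC],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )


def run_master(rows, n_workers=3, kill_first=True):
    """Compute every row; return (results, number_of_failed_workers)."""
    workers = [start_worker() for _ in range(n_workers)]
    if kill_first:
        # Simulate a fault: kill one worker before any work is handed out.
        workers[0].kill()
        workers[0].wait()
    pending, results, failed = list(rows), {}, 0
    while pending:
        if not workers:
            raise RuntimeError("all workers failed")
        assigned = []
        for w in list(workers):
            if not pending:
                break
            if w.poll() is not None:      # worker has already exited
                workers.remove(w)
                failed += 1
                continue
            row = pending.pop()
            try:
                w.stdin.write(f"{row}\n")
                w.stdin.flush()
                assigned.append((w, row))
            except OSError:               # pipe broke while sending
                workers.remove(w)
                failed += 1
                pending.append(row)
        for w, row in assigned:
            reply = w.stdout.readline()
            if not reply:                 # EOF: worker died mid-task
                workers.remove(w)
                failed += 1
                pending.append(row)       # give the row to a survivor
            else:
                r, n = reply.split()
                results[int(r)] = int(n)
    for w in workers:                     # shut down the survivors
        w.stdin.close()
        w.stdout.close()
        w.wait()
    return results, failed


if __name__ == "__main__":
    results, failed = run_master(range(8))
    print(f"{len(results)} rows computed, {failed} worker(s) lost")
```

The point of the design is that a failure is only ever seen on the one channel
that belongs to the dead slave, so the master can drop that slave and requeue
its row without disturbing the others.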
On Thu, Jul 9, 2009 at 8:30 PM, Durga Choudhury <dpchoudh_at_[hidden]> wrote:
> Although I have perhaps the least experience on the topic in this
> list, I will take a shot; more experienced people, please correct me:
> The MPI standard specifies communication mechanisms, not fault tolerance
> at any level. You may achieve network fault tolerance at the IP level by
> implementing 'equal cost multipath' routes (that is, two equally
> capable NICs connecting to the same destination, with the kernel
> routing table modified to use both; the kernel will then load balance
> dynamically). At the MAC level, you can achieve the same effect by
> trunking multiple network cards.
> You can achieve process-level fault tolerance with a checkpointing
> scheme such as BLCR, which has been tested to work with Open MPI (and
> other processes as well).
> On Thu, Jul 9, 2009 at 4:57 AM, vipin kumar<vipinkumar41_at_[hidden]> wrote:
> > Hi all,
> > I want to know whether Open MPI supports network and process fault
> > tolerance. If there is an example demonstrating these features, that
> > would be best.
> > Regards,
> > --
> > Vipin K.
> > Research Engineer,
> > C-DOTB, India
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users