Currently, Hadoop tasks (in a job) are independent of each other. If
Hadoop is going to use MPI for inter-task communication, then make sure
they understand that the MPI standard currently does not address fault
tolerance.
Note that it is not uncommon to run map reduce jobs on Amazon EC2's
spot instances, which can be taken back by Amazon at any time if the
spot price rises above the user's bid price. If Hadoop is going
to use MPI without a fault-tolerant MPI implementation, then the
whole job needs to be rerun when an instance is lost.
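
To make the cost difference concrete, here is a toy model of the argument above. It is only a sketch, not Hadoop or MPI code: the function name `tasks_to_rerun`, the `coupled` flag, and the task counts are made up for illustration, and it assumes that a coupled job has no checkpointing or other recovery.

```python
def tasks_to_rerun(num_tasks, failed_tasks, coupled):
    """Return how many tasks must be re-executed after some tasks fail.

    coupled=False models independent tasks (as in Hadoop today):
    only the failed tasks are retried.
    coupled=True models tasks that communicate through a messaging
    layer with no fault tolerance: any failure restarts the whole job.
    """
    failed = set(failed_tasks)  # ignore duplicate reports of the same task
    if not failed:
        return 0
    return num_tasks if coupled else len(failed)


if __name__ == "__main__":
    # One of 100 tasks loses its spot instance.
    print(tasks_to_rerun(100, [17], coupled=False))  # 1
    print(tasks_to_rerun(100, [17], coupled=True))   # 100
```

In this model the penalty for a single revoked instance grows with the size of the job once tasks are coupled, which is why the lack of fault tolerance matters more on spot instances.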
Open Grid Scheduler / Grid Engine
Scalable Grid Engine Support Program
On Wed, Feb 1, 2012 at 3:20 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> FROM: LANL, HLRS, Cisco, Oracle, and IBM
> WHAT: Adds Java bindings
> WHY: The Hadoop community would like to use MPI in their efforts, and most of their code is in Java
> WHERE: ompi/mpi/java plus one new config file in ompi/config
> TIMEOUT: Feb 10, 2012
> Hadoop is a Java-based environment for processing extremely large data sets. Modeled on the Google enterprise system, it has evolved into its own open-source community. Currently, they use their own IPC for messaging, but acknowledge that it is nowhere near as efficient or well-developed as found in MPI.
> While 3rd party Java bindings are available, the Hadoop business world is leery of depending on something that "bolts on" - they would be more willing to adopt the technology if it were included in a "standard" distribution. Hence, they have requested that Open MPI provide that capability, and in exchange will help champion broader adoption of Java support within the MPI community.
> We have based the OMPI bindings on the mpiJava code originally developed at IU, and currently maintained by HLRS. Adding the bindings to OMPI is completely transparent to all other OMPI users and has zero performance impact on the rest of the code/bindings. We have set up configure so that the Java bindings will build if/when they can or are explicitly requested, just as with other language support.
> As the Hadoop community represents a rapidly-growing new set of customers and needs, we feel that adding these bindings is appropriate. The bindings will be maintained by those organizations that have an interest in this use-case.