On Feb 23, 2014, at 10:42 AM, Saliya Ekanayake <esaliya_at_[hidden]> wrote:
> This is to get some info on the subject and not directly a question on OpenMPI.
> I've read Jeff's blog post on integrating OpenMPI with Hadoop (http://blogs.cisco.com/performance/resurrecting-mpi-and-java/) and wanted to check whether this is related to the Jira at https://issues.apache.org/jira/browse/MAPREDUCE-2911
Somewhat. A little history might help. I was asked a couple of years ago to work on integrating MPI support with Hadoop. At that time, the thought of those asking for my help was that we would enable YARN to support MPI, which was captured in 2911. However, after working on it for a few months, it became apparent to me that this was a mistake. YARN's architecture makes support of MPI very difficult (but achievable - I did it with OMPI, and someone else has now done it with MPICH), and the result exhibits horrible scaling and relatively poor performance by HPC standards. So if you want to run a very small MPI job under YARN, you can do it with a custom application manager and JNI wrappers around every MPI call - just don't expect great performance.
What I did instead was to pivot direction and focus on porting Hadoop to the HPC environment. The thought here was that, if we could get the Hadoop classes working with a regular HPC environment, then all the HPC world's tools and programming models would become available. This is what we have done, and it comes in four parts:
1. Java MPI bindings that are very close to C-level performance. These are being released in the 1.7 series of OMPI and are unique to OMPI at this time. Jose Roman and Oscar Vega continue to close the performance gap.
2. Integration with HPC resource managers such as Slurm and Moab. Intel has taken the lead there and announced that support at SC13; it is in beta test now.
3. Integration with HPC file systems such as Lustre. Intel again took the lead here and has a Lustre adaptor in beta test.
4. Equivalent of an application manager to stage map-reduce executions. I updated OMPI's "mpirun" to handle that - available in the current 1.7 release series. It fully understands "staged" execution and also notifies the associated processes when MPI is feasible (i.e., all the procs in comm_world are running).
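To make the idea in item 4 concrete, here is a toy sketch in plain Java. It is not how mpirun is implemented and it uses no MPI or Hadoop calls. Threads stand in for processes, and the class name, method name, and event strings are all made up for illustration. The launcher starts workers in batches ("stages"), each worker can do independent work as soon as it starts, and only once every member of the job is running does the launcher signal that MPI-style communication is feasible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model only: threads play the role of the procs in one job.
// All names here are hypothetical and do not correspond to any OMPI API.
class StagedLaunchSketch {

    // Starts worldSize workers in batches of stageSize and returns the
    // sequence of events in the order they were recorded.
    static List<String> run(int worldSize, int stageSize) {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        // Counts workers that have started running.
        CountDownLatch allStarted = new CountDownLatch(worldSize);
        // Opened by the launcher once every worker is running.
        CountDownLatch mpiFeasible = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(worldSize);

        for (int first = 0; first < worldSize; first += stageSize) {
            int last = Math.min(first + stageSize, worldSize);
            for (int rank = first; rank < last; rank++) {
                final int r = rank;
                pool.submit(() -> {
                    // Work that needs no peers could happen here.
                    events.add("started " + r);
                    allStarted.countDown();
                    try {
                        // Wait for the launcher's notification.
                        mpiFeasible.await();
                        events.add("mpi-ready " + r);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }

        try {
            // The launcher waits until the whole "world" is running ...
            allStarted.await();
            events.add("launcher: all running");
            // ... and only then tells the workers that communication is feasible.
            mpiFeasible.countDown();
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while launching", e);
        }
        return events;
    }

    public static void main(String[] args) {
        // Four workers launched two at a time; the order within each
        // phase may vary from run to run.
        for (String event : run(4, 2)) {
            System.out.println(event);
        }
    }
}
```

The point of the two latches is the ordering guarantee: every "started" event is recorded before the launcher's notification, and every "mpi-ready" event after it, regardless of how the batches are scheduled.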
We continue to improve the Hadoop support - Cisco and I are collaborating on a new "dynamic MPI" capability that will allow the procs to interact without imposing the barrier at MPI_Init, for example. So I expect that this summer will demonstrate a pretty robust capability in that area.
After all, there is no reason you shouldn't be able to run Hadoop on an HPC cluster :-)
> Also, is there a place I can get more info on this effort?
> Thank you,
> Saliya Ekanayake esaliya_at_[hidden]
> Cell 812-391-4914 Home 812-961-6383