Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI + Hadoop
From: Saliya Ekanayake (esaliya_at_[hidden])
Date: 2014-03-21 16:11:02


Hi Ralph,

This is regarding the MapReduce support in OpenMPI, about which you gave a
good amount of info previously. I have several MR applications whose
performance I'd like to test on an HPC cluster with OpenMPI. I found your
presentation
(http://www.open-mpi.org/video/mrplus/Greenplum_RalphCastain-1up.pdf), but
I wonder if there are detailed steps for getting a simple MR program
running with OpenMPI.

Thank you,
Saliya

On Mon, Feb 24, 2014 at 1:22 PM, Saliya Ekanayake <esaliya_at_[hidden]> wrote:

> Thank you Ralph. I'll get back to you if I run into issues.
>
>
> On Mon, Feb 24, 2014 at 12:23 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>
>>
>> On Feb 24, 2014, at 7:55 AM, Saliya Ekanayake <esaliya_at_[hidden]> wrote:
>>
>> This is very interesting. I've been working on getting one of our
>> clustering programs (
>> http://grids.ucs.indiana.edu/ptliupages/publications/DAVS_IEEE.pdf) to
>> work with the OpenMPI Java bindings, and we obtained very good speedup and
>> scalability when running on HPC clusters with Infiniband. We are working on
>> a report with performance results and will make it available here soon.
>>
>>
>> Great! Will look forward to seeing it.
>>
>>
>> This is again interesting, as we have developed a series of MapReduce
>> applications for analyzing gene sequences (
>> http://grids.ucs.indiana.edu/ptliupages/publications/DACIDR_camera_ready_v0.3.pdf),
>> which could benefit from having MPI support. Also, as you have mentioned,
>> we run all these MapReduce jobs on HPC clusters.
>>
>>
>> The folks at TACC are doing the Intel beta on a mouse genome, and will
>> also be publishing their results comparing Hadoop performance under
>> YARN/HDFS vs Slurm/Lustre.
>>
>>
>> I am very eager to try 4.) and wonder if you could kindly provide some
>> pointers on how to get it working.
>>
>>
>> The current release contains the initial "staged" execution support, but
>> not the dynamic extension I described. To use staged execution, all you
>> have to do is:
>>
>> (a) express your mapper and reducer stages as separate app_contexts on
>> the command line; and
>>
>> (b) add --staged to the command line to request staged execution.
>>
>> So it looks something like this:
>>
>> mpirun --staged -n 10 ./mapper : -n 4 ./reducer
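
The two steps above can be sketched as a small Python helper that assembles such a command line. This is only an illustration: the function name staged_mpirun_argv is hypothetical, and it assumes the app_contexts are separated by a colon, which is the separator mpirun uses between multiple app contexts. It only builds the argument list and does not launch anything.

```python
import shlex


def staged_mpirun_argv(stages, mpirun="mpirun"):
    """Build an mpirun argument list for staged execution.

    stages is a list of (process_count, command) pairs, one per
    app_context, e.g. [(10, "./mapper"), (4, "./reducer")].
    """
    if not stages:
        raise ValueError("at least one stage is required")
    argv = [mpirun, "--staged"]  # step (b): request staged execution
    for index, (nprocs, command) in enumerate(stages):
        if nprocs < 1:
            raise ValueError("process count must be positive")
        if index > 0:
            argv.append(":")  # assumed app_context separator
        # step (a): each stage becomes its own app_context
        argv += ["-n", str(nprocs)] + shlex.split(command)
    return argv


if __name__ == "__main__":
    argv = staged_mpirun_argv([(10, "./mapper"), (4, "./reducer")])
    # Print a shell-safe rendering of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The resulting list could be handed to subprocess.run on a machine where mpirun is installed; whether --staged is accepted depends on the Open MPI release in use.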
>>
>> Depending on the allocation, mpirun will stage execution of the mappers
>> and reducers, connecting the stdout of the first to the stdin of the
>> second. There is also support for localized file systems (see the
>> orte/mca/dfs framework) that allows you to transparently access/move data
>> across the network, and of course mpirun supports pre-positioning of files
>> via the --preload-files option.
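
For a sense of what the mapper and reducer executables might look like, here is a minimal word-count sketch in Python. It assumes, based only on the description above, that each stage reads text records on stdin and writes them to stdout. The tab-separated record format, the script name wordcount.py and the map/reduce argument are illustrative choices, not part of Open MPI.

```python
import sys
from collections import Counter


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per word of input text."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(records):
    """Reducer: sum the counts per key; input does not need to be sorted."""
    totals = Counter()
    for record in records:
        record = record.rstrip("\n")
        if not record:
            continue  # skip blank lines
        key, _, value = record.partition("\t")
        totals[key] += int(value)
    for key in sorted(totals):
        yield f"{key}\t{totals[key]}"


def main(argv):
    role = argv[1] if len(argv) > 1 else ""
    if role == "map":
        output = map_lines(sys.stdin)
    elif role == "reduce":
        output = reduce_lines(sys.stdin)
    else:
        sys.stderr.write("usage: wordcount.py map|reduce\n")
        return 2
    for record in output:
        sys.stdout.write(record + "\n")
    return 0


# Run only when a role argument is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Assuming a Python interpreter is available on the compute nodes, the two stages could then be given to mpirun as "python wordcount.py map" and "python wordcount.py reduce". How records are distributed among several reducer processes is not covered in this thread, so the reducer simply totals whatever it receives.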
>>
>> HTH - feel free to ask questions and we'll be happy to help. Also, if you
>> want to collaborate on the dynamic extension, we'd welcome the assist. Both
>> Jeff and I have been somewhat swamped with other priorities and so progress
>> on that last step is lagging.
>>
>> Ralph
>>
>>
>> Thank you,
>> Saliya
>>
>>
>>
>> On Mon, Feb 24, 2014 at 10:30 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>>>
>>> On Feb 23, 2014, at 10:42 AM, Saliya Ekanayake <esaliya_at_[hidden]>
>>> wrote:
>>>
>>> Hi,
>>>
>>> This is to get some info on the subject and is not directly a question
>>> about OpenMPI.
>>>
>>> I've read Jeff's blog post on integrating OpenMPI with Hadoop (
>>> http://blogs.cisco.com/performance/resurrecting-mpi-and-java/) and
>>> wanted to check whether it is related to the Jira at
>>> https://issues.apache.org/jira/browse/MAPREDUCE-2911
>>>
>>>
>>> Somewhat. A little history might help. I was asked a couple of years ago
>>> to work on integrating MPI support with Hadoop. At that time, the thought
>>> of those asking for my help was that we would enable YARN to support MPI,
>>> which was captured in 2911. However, after working on it for a few months,
>>> it became apparent to me that this was a mistake. YARN's architecture makes
>>> support of MPI very difficult (but achievable - I did it with OMPI, and
>>> someone else has now done it with MPICH), and the result exhibits horrible
>>> scaling and relatively poor performance by HPC standards. So if you want to
>>> run a very small MPI job under YARN, you can do it with a custom
>>> application manager and JNI wrappers around every MPI call - just don't
>>> expect great performance.
>>>
>>> What I did instead was to pivot direction and focus on porting Hadoop to
>>> the HPC environment. The thought here was that, if we could get the Hadoop
>>> classes working with a regular HPC environment, then all the HPC world's
>>> tools and programming models become available. This is what we have done,
>>> and it comes in four parts:
>>>
>>> 1. Java MPI bindings that are very close to C-level performance. These
>>> are being released in the 1.7 series of OMPI and are unique to OMPI at this
>>> time. Jose Roman and Oscar Vega continue to close the performance gap.
>>>
>>> 2. Integration with HPC resource managers such as Slurm and Moab. Intel
>>> has taken the lead there and announced that support at SC13; it is in
>>> beta test now.
>>>
>>> 3. Integration with HPC file systems such as Lustre. Intel again took the
>>> lead here and has a Lustre adaptor in beta test.
>>>
>>> 4. Equivalent of an application manager to stage map-reduce executions.
>>> I updated OMPI's "mpirun" to handle that - available in the current 1.7
>>> release series. It fully understands "staged" execution and also notifies
>>> the associated processes when MPI is feasible (i.e., all the procs in
>>> comm_world are running).
>>>
>>> We continue to improve the Hadoop support - Cisco and I are
>>> collaborating on a new "dynamic MPI" capability that will allow the procs
>>> to interact without imposing the barrier at MPI_Init, for example. So I
>>> expect that this summer will demonstrate a pretty robust capability in that
>>> area.
>>>
>>> After all, there is no reason you shouldn't be able to run Hadoop on an
>>> HPC cluster :-)
>>>
>>> HTH
>>> Ralph
>>>
>>>
>>> Also, is there a place I can get more info on this effort?
>>>
>>> Thank you,
>>> Saliya
>>>
>>> --
>>> Saliya Ekanayake esaliya_at_[hidden]
>>> Cell 812-391-4914 Home 812-961-6383
>>> http://saliya.org
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>

-- 
Saliya Ekanayake esaliya_at_[hidden]
Cell 812-391-4914 Home 812-961-6383
http://saliya.org