
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] RFC: add asynchronous copies for large GPU buffers
From: Rolf vandeVaart (rvandevaart_at_[hidden])
Date: 2012-06-27 18:26:41


Whoops. Fixed.

Rolf

>-----Original Message-----
>From: devel-bounces_at_[hidden] [mailto:devel-bounces_at_[hidden]]
>On Behalf Of Nathan Hjelm
>Sent: Wednesday, June 27, 2012 6:20 PM
>To: Open MPI Developers
>Subject: Re: [OMPI devel] RFC: add asynchronous copies for large GPU
>buffers
>
>Can you make your repository public or add me to the access list?
>
>-Nathan
>
>On Wed, Jun 27, 2012 at 03:12:34PM -0700, Rolf vandeVaart wrote:
>> WHAT: Add support for doing asynchronous copies of GPU memory with
>> larger messages.
>> WHY: Improve performance for sending/receiving of larger GPU messages
>> over InfiniBand (IB).
>> WHERE: ob1, openib, and convertor code. All changes are guarded by
>> compiler directives, so non-CUDA builds are unaffected.
>> REFERENCE BRANCH: https://bitbucket.org/rolfv/ompi-trunk-cuda-async
>>
>> DETAILS:
>> When sending/receiving GPU memory through IB, all data first passes into
>> host memory.
>> The copy of GPU memory into and out of the host memory can be done
>> asynchronously to improve performance. This RFC adds that feature for the
>> fragments of larger messages.
>>
>> On the sending side, the completion function is essentially split in
>> two. The first function is called when the copy completes, and it then
>> initiates the send. When the send completes, the second function is called.
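
To illustrate the split completion on the sending side, here is a minimal sketch in plain C. It is not taken from the reference branch: frag_t, post_event, progress_one, start_send, copy_complete and send_complete are all hypothetical names, and memcpy stands in for both the asynchronous GPU-to-host copy and the IB send so the example runs without CUDA or IB.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fragment descriptor; not an Open MPI type. */
typedef struct frag {
    const char *gpu_src;   /* stands in for GPU memory */
    char host_buf[64];     /* staging buffer in host memory */
    char wire[64];         /* stands in for data put on the wire */
    size_t len;
    int copy_done;
    int send_done;
} frag_t;

typedef void (*completion_fn)(frag_t *);
typedef struct event { completion_fn cb; frag_t *frag; } event_t;

#define MAX_EVENTS 8
static event_t pending[MAX_EVENTS];   /* FIFO of outstanding completions */
static int n_pending = 0;

static void post_event(completion_fn cb, frag_t *f) {
    assert(n_pending < MAX_EVENTS);
    pending[n_pending].cb = cb;
    pending[n_pending].frag = f;
    n_pending++;
}

/* Run the oldest pending completion, if any. Returns 1 if one ran. */
static int progress_one(void) {
    if (n_pending == 0) return 0;
    event_t ev = pending[0];
    n_pending--;
    memmove(&pending[0], &pending[1], (size_t)n_pending * sizeof(event_t));
    ev.cb(ev.frag);
    return 1;
}

/* Second half: runs when the (simulated) send completes. */
static void send_complete(frag_t *f) {
    f->send_done = 1;
}

/* First half: runs when the (simulated) copy completes, then starts the send. */
static void copy_complete(frag_t *f) {
    f->copy_done = 1;
    memcpy(f->wire, f->host_buf, f->len);   /* simulated send */
    post_event(send_complete, f);
}

/* Start the copy into host memory and return without waiting for it. */
static void start_send(frag_t *f) {
    assert(f->len <= sizeof f->host_buf);
    memcpy(f->host_buf, f->gpu_src, f->len); /* simulated async copy */
    post_event(copy_complete, f);
}
```

The caller starts the copy with start_send() and returns right away. The first progress_one() call fires copy_complete(), which starts the send, and the next call fires send_complete().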
>>
>> Likewise, on the receiving side, a callback runs when the fragment
>> arrives and initiates the copy of the data out of the buffer. When
>> the copy completes, a second function is called, which also calls
>> back into the BTL so it can free the resources that were in use.
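
The receiving side can be sketched the same way. Again this is only an illustration under assumptions: recv_frag_t, btl_release, frag_arrived, copy_out_complete and progress_recv are hypothetical names, and the host-to-GPU copy is simulated with memcpy inside the polling function rather than a real asynchronous copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct recv_frag recv_frag_t;
typedef void (*btl_release_fn)(recv_frag_t *);

/* Hypothetical receive fragment; not an Open MPI type. */
struct recv_frag {
    char host_buf[64];       /* BTL-owned receive buffer in host memory */
    char *gpu_dst;           /* stands in for GPU memory */
    size_t len;
    int copy_pending;        /* a copy out of host_buf is in flight */
    int buffer_in_use;       /* BTL resources still held */
    btl_release_fn release;  /* callback into the (simulated) BTL */
};

/* Simulated BTL hook that frees the receive buffer. */
static void btl_release(recv_frag_t *f) {
    f->buffer_in_use = 0;
}

/* Callback 1: the fragment arrived, so start the copy out of the buffer. */
static void frag_arrived(recv_frag_t *f) {
    assert(f->len <= sizeof f->host_buf);
    f->buffer_in_use = 1;
    f->copy_pending = 1;     /* real code would enqueue an async copy here */
}

/* Callback 2: the copy finished, so let the BTL free its resources. */
static void copy_out_complete(recv_frag_t *f) {
    f->release(f);
}

/* Poll for copy completion. Returns 1 if a copy completed on this call. */
static int progress_recv(recv_frag_t *f) {
    if (!f->copy_pending) return 0;
    memcpy(f->gpu_dst, f->host_buf, f->len); /* simulated copy finishing */
    f->copy_pending = 0;
    copy_out_complete(f);
    return 1;
}
```

The point of the second callback is that the receive buffer stays owned until the copy out of it has finished; only then is it handed back to the BTL.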
>>
>> M opal/datatype/opal_datatype_copy.c
>> M opal/datatype/opal_convertor.c
>> M opal/datatype/opal_convertor.h
>> M opal/datatype/opal_datatype_cuda.c
>> M opal/datatype/opal_datatype_cuda.h
>> M opal/datatype/opal_datatype_unpack.c
>> M opal/datatype/opal_datatype_pack.h
>> M opal/datatype/opal_datatype_unpack.h
>> M ompi/mca/btl/btl.h
>> M ompi/mca/btl/openib/btl_openib_component.c
>> M ompi/mca/btl/openib/btl_openib.c
>> M ompi/mca/btl/openib/btl_openib.h
>> M ompi/mca/btl/openib/btl_openib_mca.c
>> M ompi/mca/pml/ob1/pml_ob1_recvfrag.c
>> M ompi/mca/pml/ob1/pml_ob1_sendreq.c
>> M ompi/mca/pml/ob1/pml_ob1_progress.c
>> M ompi/mca/pml/ob1/pml_ob1_recvreq.c
>> M ompi/mca/pml/ob1/pml_ob1_cuda.c
>> M ompi/mca/pml/ob1/pml_ob1_recvreq.h
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel