WHAT: Add a new sm-based BTL (smcuda), along with supporting mpools, that can also support CUDA RDMA.

 

WHY: CUDA 4.1 provides GPU IPC support that we can use to move data efficiently between GPUs within a node.

 

WHERE: New components: ompi/mca/btl/smcuda, ompi/mca/mpool/cuda, ompi/mca/mpool/rcuda, along with a few minor changes in ob1.  The new components are built only if explicitly requested at configure time.  Otherwise they are not built, and there are no changes within the normal code paths.

(Jeff's rule: Do no harm)

 

WHEN: Two weeks from now, December 23, 2011

 

DETAILS: The transfer of GPU memory between GPUs within a node can be improved by using the IPC support that will soon be available with CUDA 4.1.  These changes use that support to implement an RDMA GET protocol for GPU memory.
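
For anyone unfamiliar with the GET flow, here is a toy model of it in plain C.  It is only a sketch and is not the code in the changeset: every toy_* name is hypothetical, an in-process table stands in for the CUDA IPC handle machinery, and memcpy stands in for the GPU-to-GPU copy.  In a real implementation the export and open steps would presumably map to the CUDA 4.1 driver calls cuIpcGetMemHandle and cuIpcOpenMemHandle.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a CUDA IPC memory handle: an opaque token
 * that the owner of a buffer can hand to a peer on the same node. */
typedef struct {
    int slot;                    /* index into the toy export table */
} toy_ipc_handle_t;

/* Toy export table standing in for the driver's handle bookkeeping. */
#define TOY_MAX_EXPORTS 8
static void *toy_exports[TOY_MAX_EXPORTS];
static int toy_num_exports = 0;

/* Sender side: export a buffer (assumed analogue: cuIpcGetMemHandle). */
static int toy_get_mem_handle(toy_ipc_handle_t *h, void *buf)
{
    if (toy_num_exports == TOY_MAX_EXPORTS) return -1;
    toy_exports[toy_num_exports] = buf;
    h->slot = toy_num_exports++;
    return 0;
}

/* Receiver side: map the peer's buffer (assumed analogue: cuIpcOpenMemHandle). */
static int toy_open_mem_handle(void **mapped, toy_ipc_handle_t h)
{
    if (h.slot < 0 || h.slot >= toy_num_exports) return -1;
    *mapped = toy_exports[h.slot];
    return 0;
}

/* Control message the sender would pass through the shared-memory FIFO. */
typedef struct {
    toy_ipc_handle_t handle;
    size_t length;
} toy_rget_msg_t;

/* Sender: publish the source buffer; no payload is pushed. */
static int toy_sender_start(toy_rget_msg_t *msg, void *src, size_t len)
{
    msg->length = len;
    return toy_get_mem_handle(&msg->handle, src);
}

/* Receiver: open the handle and pull (GET) the data itself.
 * Returns the number of bytes copied, or -1 on error. */
static long toy_receiver_get(const toy_rget_msg_t *msg, void *dst, size_t dst_len)
{
    void *remote = NULL;
    if (msg->length > dst_len) return -1;
    if (toy_open_mem_handle(&remote, msg->handle) != 0) return -1;
    memcpy(dst, remote, msg->length);   /* stands in for the GPU-to-GPU copy */
    return (long)msg->length;
}

/* Runs one GET exchange; returns 0 if the payload arrived intact. */
int toy_rget_demo(void)
{
    char src[] = "gpu payload";
    char dst[sizeof(src)] = {0};
    toy_rget_msg_t msg;

    if (toy_sender_start(&msg, src, sizeof(src)) != 0) return 1;
    if (toy_receiver_get(&msg, dst, sizeof(dst)) != (long)sizeof(src)) return 2;
    return strcmp(src, dst) == 0 ? 0 : 3;
}
```

Calling toy_rget_demo() runs one exchange and returns 0 on success.  The point of the sketch is that only the small control message travels from sender to receiver; the receiver then pulls the payload directly from the sender's buffer.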

 

To avoid polluting the existing sm BTL, a new BTL has been created that adds the RDMA GET support.  Two new memory pools are also needed and are being added.  One of them is very simple, whereas the second is patterned after the rdma memory pool.

 

Changes can be viewed at:

https://bitbucket.org/rolfv/ompi-trunk-cuda-rdma-3/changeset/29f3255cd2b8

 

M       ompi/mca/btl/btl.h
A       ompi/mca/btl/smcuda
A       ompi/mca/btl/smcuda/btl_smcuda_component.c
A       ompi/mca/btl/smcuda/configure.m4
A       ompi/mca/btl/smcuda/btl_smcuda_frag.h
A       ompi/mca/btl/smcuda/help-mpi-btl-smcuda.txt
A       ompi/mca/btl/smcuda/btl_smcuda_endpoint.h
A       ompi/mca/btl/smcuda/btl_smcuda.h
A       ompi/mca/btl/smcuda/btl_smcuda_fifo.h
A       ompi/mca/btl/smcuda/Makefile.am
A       ompi/mca/btl/smcuda/btl_smcuda_frag.c
A       ompi/mca/btl/smcuda/btl_smcuda.c
A       ompi/mca/mpool/cuda
A       ompi/mca/mpool/cuda/configure.m4
A       ompi/mca/mpool/cuda/mpool_cuda_component.c
A       ompi/mca/mpool/cuda/mpool_cuda_module.c
A       ompi/mca/mpool/cuda/mpool_cuda.h
A       ompi/mca/mpool/cuda/Makefile.am
A       ompi/mca/mpool/rcuda
A       ompi/mca/mpool/rcuda/configure.m4
A       ompi/mca/mpool/rcuda/mpool_rcuda_component.c
A       ompi/mca/mpool/rcuda/Makefile.am
A       ompi/mca/mpool/rcuda/mpool_rcuda_module.c
A       ompi/mca/mpool/rcuda/mpool_rcuda.h
M       ompi/mca/common/cuda/configure.m4
M       ompi/mca/common/cuda/common_cuda.c
M       ompi/mca/common/cuda/help-mpi-common-cuda.txt
M       ompi/mca/common/cuda/common_cuda.h
M       ompi/mca/pml/ob1/pml_ob1_sendreq.c
M       ompi/mca/pml/ob1/pml_ob1_sendreq.h
M       ompi/mca/pml/ob1/pml_ob1_recvreq.c
A       ompi/mca/pml/ob1/pml_ob1_cuda.c
M       ompi/mca/pml/ob1/Makefile.am

 

Rolf

 

rvandevaart@nvidia.com

781-275-5358

 

