
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Advices for parameter tuning for CUDA-aware MPI
From: Rolf vandeVaart (rvandevaart_at_[hidden])
Date: 2014-05-27 16:40:55


>-----Original Message-----
>From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Maxime
>Boissonneault
>Sent: Tuesday, May 27, 2014 4:07 PM
>To: Open MPI Users
>Subject: Re: [OMPI users] Advices for parameter tuning for CUDA-aware MPI
>
>Answers inline too.
>>> 2) Is the absence of btl_openib_have_driver_gdr an indicator of
>>> something missing?
>> Yes, that means that somehow the GPU Direct RDMA is not installed
>correctly. All that check does is make sure that the file
>/sys/kernel/mm/memory_peers/nv_mem/version exists. Does that exist?
>>
>It does not. There is no
>/sys/kernel/mm/memory_peers/
>
>>> 3) Are the default parameters, especially the rdma limits and such,
>>> optimal for our configuration?
>> That is hard to say. GPU Direct RDMA does not work well when the GPU
>and IB card are not "close" on the system. Can you run "nvidia-smi topo -m"
>on your system?
>nvidia-smi topo -m
>gives me the error
>[mboisson_at_login-gpu01 ~]$ nvidia-smi topo -m
>Invalid combination of input arguments. Please run 'nvidia-smi -h' for help.
Sorry, my mistake. That may be a future feature.
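
Going back to the btl_openib_have_driver_gdr check mentioned earlier: it can be reproduced by hand. A minimal shell sketch (the function name is mine; only the sysfs path comes from this thread — the file is created by the nv_peer_mem kernel module shipped with Mellanox OFED):

```shell
# Sketch of the check described above: Open MPI just tests whether
# this sysfs file exists. Hypothetical helper, not Open MPI code.
check_gdr() {
    # $1 = path to the version file; the real path is
    # /sys/kernel/mm/memory_peers/nv_mem/version
    if [ -e "$1" ]; then
        echo "GPU Direct RDMA kernel support present"
    else
        echo "GPU Direct RDMA kernel support NOT present"
    fi
}

check_gdr /sys/kernel/mm/memory_peers/nv_mem/version
```

If the file is absent, installing Mellanox OFED with the nv_peer_mem module is the usual remedy.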

>
>I could not find anything related to topology in the help. However, I can tell
>you the following, which I believe to be true:
>- GPU0 and GPU1 are on PCIe bus 0, socket 0
>- GPU2 and GPU3 are on PCIe bus 1, socket 0
>- GPU4 and GPU5 are on PCIe bus 2, socket 1
>- GPU6 and GPU7 are on PCIe bus 3, socket 1
>
>There is one IB card which I believe is on socket 0.
>
>
>I know that we do not have the Mellanox OFED. We use the Linux RDMA from
>CentOS 6.5. However, should that completely disable GDR within a single
>node? I.e., does GDR _have_ to go through IB? I would assume that our lack
>of Mellanox OFED would result in no GDR inter-node, but GDR intra-node.

Without Mellanox OFED, GPU Direct RDMA is unavailable. However, "GPU Direct" is a somewhat overloaded term, and I think that is where I was getting confused. GPU Direct (also known as CUDA IPC) will work between GPUs that do not cross a QPI connection. That means GPU0,1,2,3 should be able to use GPU Direct among themselves, and GPU4,5,6,7 likewise among themselves. In this case, GPU memory does not need to be staged through host memory when transferring between those GPUs. With Open MPI, there is an MCA parameter you can set to see whether GPU Direct is being used between the GPUs:

--mca btl_smcuda_cuda_ipc_verbose 100
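
For illustration, a launch line using this parameter might look like the following (the process count and application name are placeholders, not from this thread; only the MCA flag itself is from above):

```
mpirun -np 2 --mca btl_smcuda_cuda_ipc_verbose 100 ./your_cuda_app
```

With the verbosity raised, the smcuda BTL should report whether CUDA IPC was enabled or rejected for each pair of peer processes.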

 Rolf

-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information. Any unauthorized review, use, disclosure or distribution
is prohibited. If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------