Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] gpudirect p2p?
From: Rolf vandeVaart (rvandevaart_at_[hidden])
Date: 2011-10-14 09:06:23

>-----Original Message-----
>From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]]
>On Behalf Of Chris Cooper
>Sent: Friday, October 14, 2011 1:28 AM
>To: users_at_[hidden]
>Subject: [OMPI users] gpudirect p2p?
>Are the recent peer to peer capabilities of cuda leveraged by Open MPI when
>eg you're running a rank per gpu on the one workstation?

Currently, no. I am actively working on adding that capability.

>It seems in my testing that I only get on the order of about 1 GB/s,
>whereas nvidia's simpleP2P test indicates ~6 GB/s.
>Also, I ran into a problem just trying to test. It seems you have to do
>cudaSetDevice/cuCtxCreate with the appropriate gpu id which I was wanting
>to derive from the rank. You don't however know the rank until after
>MPI_Init() and you need to initialize cuda before. Not sure if there's a
>standard way to do it? I have a workaround atm.

The recommended way is to first put the GPUs in compute-exclusive mode:

#nvidia-smi -c 1

Then, have this kind of snippet at the beginning of the program. (This uses the
driver API; the runtime API would probably be the better choice.)

res = cuInit(0);
if (CUDA_SUCCESS != res) {
    exit(1);
}
if (CUDA_SUCCESS != cuDeviceGetCount(&cuDevCount)) {
    exit(1);
}
for (device = 0; device < cuDevCount; device++) {
    if (CUDA_SUCCESS != (res = cuDeviceGet(&cuDev, device))) {
        continue;
    }
    if (CUDA_SUCCESS != cuCtxCreate(&ctx, 0, cuDev)) {
        /* Another process must have grabbed it. Go on to the next one. */
    } else {
        /* Context created; this process now owns this GPU. */
        break;
    }
}