Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] GPUDirect v1 issues
From: Sebastian Rinke (s.rinke_at_[hidden])
Date: 2012-01-17 12:08:07


I use CUDA 4.0 with MVAPICH2 1.5.1p1 and Open MPI 1.4.2.

Attached you will find a small test case based on the GPUDirect v1 test case (mpi_pinned.c).
In that program the sender splits a message into chunks and sends them separately to the receiver,
which posts the corresponding recvs. It is a kind of pipelining.

In mpi_pinned.c:141 the offsets into the recv buffer are set.
With the correct offsets, i.e. increasing with each chunk, it blocks with Open MPI.

Using line 142 instead (offset = 0 for every chunk) works.
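
For reference, here is a minimal sketch of the pattern (not the attached test case itself; the chunk count and size below are made up):

#include <mpi.h>
#include <cuda_runtime.h>

#define NCHUNKS    4
#define CHUNK_SIZE (1 << 20)   /* 1 MiB per chunk, arbitrary */

int main(int argc, char **argv)
{
    int rank;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* pinned host buffer, as in mpi_pinned.c */
    cudaMallocHost((void **)&buf, NCHUNKS * CHUNK_SIZE);

    for (int i = 0; i < NCHUNKS; i++) {
        size_t offset = (size_t)i * CHUNK_SIZE;  /* increasing offsets: hangs with Open MPI */
        /* size_t offset = 0; */                 /* offset = 0: works */

        if (rank == 0)
            MPI_Send(buf + offset, CHUNK_SIZE, MPI_CHAR, 1, i, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(buf + offset, CHUNK_SIZE, MPI_CHAR, 0, i, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }

    cudaFreeHost(buf);
    MPI_Finalize();
    return 0;
}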

The attached tarball contains a Makefile in which you will have to adjust

* CUDA_INC_DIR
* CUDA_LIB_DIR

Sebastian

On Jan 17, 2012, at 4:16 PM, Kenneth A. Lloyd wrote:

> Also, which version of MVAPICH2 did you use?
>
> I've been poring over Rolf's OpenMPI CUDA RDMA 3 (using CUDA 4.1 r2) vis-à-vis
> MVAPICH-GPU on a small 3-node cluster. These are wickedly interesting.
>
> Ken
> -----Original Message-----
> From: devel-bounces_at_[hidden] [mailto:devel-bounces_at_[hidden]] On
> Behalf Of Rolf vandeVaart
> Sent: Tuesday, January 17, 2012 7:54 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] GPUDirect v1 issues
>
> I am not aware of any issues. Can you send me a test program so I can try
> it out?
> Which version of CUDA are you using?
>
> Rolf
>
>> -----Original Message-----
>> From: devel-bounces_at_[hidden] [mailto:devel-bounces_at_[hidden]]
>> On Behalf Of Sebastian Rinke
>> Sent: Tuesday, January 17, 2012 8:50 AM
>> To: Open MPI Developers
>> Subject: [OMPI devel] GPUDirect v1 issues
>>
>> Dear all,
>>
>> I'm using GPUDirect v1 with Open MPI 1.4.3 and see blocking
>> MPI_SEND/RECV calls hang forever.
>>
>> With two subsequent MPI_RECVs, it hangs if the recv buffer pointer of the
>> second recv points somewhere into the recv buffer other than its beginning
>> (the buffer was previously allocated with cudaMallocHost()).
>>
>> I tried the same with MVAPICH2 and did not see the problem.
>>
>> Does anybody know about issues with GPUDirect v1 using Open MPI?
>>
>> Thanks for your help,
>> Sebastian
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel