I use CUDA 4.0 with MVAPICH2 1.5.1p1 and Open MPI 1.4.2.
Attached is a small test case based on the GPUDirect v1 example (mpi_pinned.c).
In that program the sender splits a message into chunks and sends them separately to the receiver,
which posts the corresponding receives, i.e. a kind of pipelining.
In mpi_pinned.c:141 the offsets into the recv buffer are set.
With the correct offsets, i.e. increasing ones, it blocks with Open MPI.
Using line 142 instead (offset = 0) works.
The tarball attached contains a Makefile where you will have to adjust
On Jan 17, 2012, at 4:16 PM, Kenneth A. Lloyd wrote:
> Also, which version of MVAPICH2 did you use?
> I've been poring over Rolf's OpenMPI CUDA RDMA 3 (using CUDA 4.1 r2) vis
> MVAPICH-GPU on a small 3 node cluster. These are wickedly interesting.
> -----Original Message-----
> From: devel-bounces_at_[hidden] [mailto:devel-bounces_at_[hidden]] On
> Behalf Of Rolf vandeVaart
> Sent: Tuesday, January 17, 2012 7:54 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] GPUDirect v1 issues
> I am not aware of any issues. Can you send me a test program and I can try
> it out?
> Which version of CUDA are you using?
>> -----Original Message-----
>> From: devel-bounces_at_[hidden] [mailto:devel-bounces_at_[hidden]]
>> On Behalf Of Sebastian Rinke
>> Sent: Tuesday, January 17, 2012 8:50 AM
>> To: Open MPI Developers
>> Subject: [OMPI devel] GPUDirect v1 issues
>> Dear all,
>> I'm using GPUDirect v1 with Open MPI 1.4.3 and see blocking
>> MPI_SEND/MPI_RECV calls hang forever.
>> With two subsequent MPI_RECVs, it hangs if the receive buffer pointer of
>> the second recv points somewhere other than the beginning of the receive
>> buffer (previously allocated with cudaMallocHost()).
>> I tried the same with MVAPICH2 and did not see the problem.
>> Does anybody know about issues with GPUDirect v1 using Open MPI?
>> Thanks for your help,