On Apr 13, 2011, at 14:48, Rolf vandeVaart wrote:
> This work does not depend on GPU Direct. It makes use of the fact that one can malloc memory, register it with IB, and register it with CUDA via the new CUDA 4.0 cuMemHostRegister API. One can then copy device memory into this memory.
Wasn't that the point behind GPUDirect? To allow direct memory copy between the GPU and the network card without external intervention?