Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Existing frameworks for remote device memory exclusive read/write
From: George Bosilca (bosilca_at_[hidden])
Date: 2012-07-23 10:56:38


A while back we investigated the potential of a memcpy module in the OPAL layer. We had a proof of concept, but in the end we did not go forward with it for lack of resources. However, the skeleton of the code is still in the trunk (in opal/mca/memcpy). While I don't think it will cover all the cases described in your email, due to its synchronous nature, it can be a first step.

In Open MPI, we avoid using memcpy directly. Instead, we use the convertor mechanism to handle all memory-to-memory operations, as it hides the complexity of managing the memory layouts defined by MPI datatypes. A few weeks ago, Rolf (our NVIDIA guru) applied a patch allowing asynchronous memcpy in the OB1 PML for the latest version of CUDA. Search the code for the HAVE_CUDA define to see how he achieved asynchronous memcpy.
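
To illustrate why a plain memcpy is not enough once datatypes are involved, here is a minimal sketch of packing a strided layout (similar in spirit to an MPI vector datatype) into a contiguous buffer. This is not the Open MPI convertor API: strided_layout_t, pack_strided and unpack_strided are hypothetical names made up for this example, and the real convertor handles far more general layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout descriptor (not an Open MPI structure):
 * `count` blocks of `blocklen` bytes, with consecutive block starts
 * `stride` bytes apart in the non-contiguous buffer. */
typedef struct {
    size_t count;
    size_t blocklen;
    size_t stride;
} strided_layout_t;

/* Gather the strided blocks of `src` into the contiguous buffer `dst`.
 * Returns the number of bytes written to `dst`. */
size_t pack_strided(void *dst, const void *src, const strided_layout_t *layout)
{
    assert(layout->blocklen <= layout->stride); /* blocks must not overlap */

    unsigned char *out = dst;
    const unsigned char *in = src;
    for (size_t i = 0; i < layout->count; i++) {
        memcpy(out, in + i * layout->stride, layout->blocklen);
        out += layout->blocklen;
    }
    return layout->count * layout->blocklen;
}

/* Scatter the contiguous buffer `src` back into the strided buffer `dst`.
 * Bytes of `dst` that fall in the gaps between blocks are left untouched.
 * Returns the number of bytes read from `src`. */
size_t unpack_strided(void *dst, const void *src, const strided_layout_t *layout)
{
    assert(layout->blocklen <= layout->stride); /* blocks must not overlap */

    unsigned char *out = dst;
    const unsigned char *in = src;
    for (size_t i = 0; i < layout->count; i++) {
        memcpy(out + i * layout->stride, in, layout->blocklen);
        in += layout->blocklen;
    }
    return layout->count * layout->blocklen;
}
```

With a layout of 3 blocks of 2 bytes and a stride of 4, packing a 12-byte source copies bytes 0-1, 4-5 and 8-9 into a 6-byte contiguous buffer. Hiding this bookkeeping behind one interface is what lets the copy itself be swapped out (synchronous, asynchronous, device-aware) without touching the callers.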


On Jul 21, 2012, at 00:27 , Dmitry N. Mikushin wrote:

> Dear OpenMPI developers,
> My question is not directly related to Open MPI, but it may relate to the project's internals and your broad experience.
> Say there is a need to implement transparent read/write access to the internal memory of a PCI-Express device from the host system. Only the software capabilities of the PCI-E device may be used; it can memcpy synchronously and asynchronously in both directions. A memcpy can be initiated by either the host or the device. The host is required to perform its device memory reads/writes in critical sections: no PCI-E code may use the same memory while the operation is in progress.
> Question: could you please point to related projects or subsystems whose code could be reused to implement the described functionality? We are mostly interested in ones that implement multiple memory synchronization strategies, since there could be quite a few, depending, for example, on typical memory access patterns. This subsystem is necessary for our project but is not its primary goal, which is why we would like to borrow existing work as far as possible.
> Thanks and best regards,
> - Dima.