I am working on a CUDA/MPI application. It uses page-locked host buffers allocated with cudaHostAlloc(...,cudaHostAllocDefault), to which data from the device is copied before calling MPI.
The application, a particle simulation, reproducibly crashed or produced undefined behavior at large particle numbers, and I could not explain why this happened.
After considerable debugging (including trying two different MPI libraries, MVAPICH2 1.9a and Open MPI 1.6.1), I discovered Open MPI's mpi_leave_pinned parameter.
Setting mpi_leave_pinned to 0 solved my problem: the crash did not occur again. So far, excellent!
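For anyone hitting the same issue, here is a minimal sketch of two ways the parameter can be set in Open MPI. The executable name ./particle_sim and the process count are placeholders for illustration, not my actual setup.

```shell
# Option 1: pass the MCA parameter on the mpirun command line.
# "./particle_sim" and "-np 4" are hypothetical placeholders.
mpirun --mca mpi_leave_pinned 0 -np 4 ./particle_sim

# Option 2: set it through the environment; Open MPI reads
# MCA parameters from variables named OMPI_MCA_<parameter>.
export OMPI_MCA_mpi_leave_pinned=0
mpirun -np 4 ./particle_sim
```

The command-line form only affects that one run, while the environment variable applies to every mpirun launched from the same shell.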
I do have a request, however. After looking at the output of
$ ompi_info --param mpi all
MCA mpi: parameter "mpi_leave_pinned" (current value: <-1>, data source: default
Whether to use the "leave pinned" protocol or not. Enabling this
setting can help bandwidth performance when repeatedly sending and
receiving large messages with the same buffers over RDMA-based networks
(0 = do not use "leave pinned" protocol, 1 = use "leave pinned"
protocol, -1 = allow network to choose at runtime).
This seems to indicate that, by default, the network layer chooses at runtime whether to enable or disable the "leave pinned" protocol. In my case, this default setting turns out to be disastrous.
Also, the FAQ is somewhat ambiguous about this parameter: in one place it states that mpi_leave_pinned is off by default, but in another that the default is -1 (as above).
Can anyone please explain the intricacies of this parameter, and the ramifications and benefits of this particular default value?