Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mpi_leave_pinned is dangerous
From: Jens Glaser (jglaser_at_[hidden])
Date: 2012-11-07 19:21:23

I am replying to my own post, since no one else replied.

With the help of MVAPICH2 developer S. Potluri, the problem was isolated and fixed. As expected, it was caused by the library not intercepting
the cudaHostAlloc() and cudaFreeHost() calls that allocate and free pinned memory, which the registration cache requires in order to work.
I replaced all of these calls in my code with standard posix_memalign()/cudaHostRegister() calls, and the application now runs fine with both MVAPICH2
and OpenMPI, with the registration cache enabled.
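
For anyone hitting the same problem, here is a minimal sketch of the kind of replacement described above. It is not the code from my application: the helper names pinned_alloc() and pinned_free() are hypothetical, the USE_CUDA macro is an assumption of this sketch (without it the CUDA registration is skipped, so the file also compiles on a machine with no CUDA toolkit), and rounding the size up to a whole number of pages is a conservative choice rather than a documented requirement of current CUDA versions.

```c
/* Sketch: page-locked host buffer via posix_memalign() + cudaHostRegister()
 * instead of cudaHostAlloc()/cudaFreeHost().
 * With CUDA:    cc -DUSE_CUDA pinned.c -lcudart
 * Without CUDA: cc pinned.c   (registration is skipped) */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>
#ifdef USE_CUDA
#include <cuda_runtime.h>
#endif

/* Hypothetical helper: allocate a page-aligned host buffer and pin it.
 * Returns NULL on failure. */
static void *pinned_alloc(size_t bytes)
{
    void *ptr = NULL;
    long page = sysconf(_SC_PAGESIZE);
    size_t pg, rounded;

    assert(bytes > 0);
    if (page <= 0)
        page = 4096;                 /* fallback if the query fails */
    pg = (size_t)page;
    /* Conservative: make the buffer span whole pages. */
    rounded = (bytes + pg - 1) / pg * pg;

    if (posix_memalign(&ptr, pg, rounded) != 0)
        return NULL;
#ifdef USE_CUDA
    /* Page-lock the ordinary allocation for use with the device. */
    if (cudaHostRegister(ptr, rounded, cudaHostRegisterDefault) != cudaSuccess) {
        free(ptr);
        return NULL;
    }
#endif
    return ptr;
}

/* Hypothetical helper: unpin and release a buffer from pinned_alloc(). */
static void pinned_free(void *ptr)
{
    if (ptr == NULL)
        return;
#ifdef USE_CUDA
    cudaHostUnregister(ptr);         /* unpin before handing back to free() */
#endif
    free(ptr);
}
```

Because the memory comes from the regular allocator and is released with free(), the MPI library sees the allocation and deallocation through the usual path, which is what the registration cache relies on.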

It would be desirable for both libraries to intercept the calls to cudaHostAlloc()/cudaFreeHost() (I assume OpenMPI 1.7 will have some level of CUDA support), because
otherwise applications using GPUDirect are not guaranteed to work correctly with them and may exhibit undefined behavior.


On Nov 3, 2012, at 10:41 PM, Jens Glaser wrote:

> Hi,
> I am working on a CUDA/MPI application. It uses page-locked host buffers allocated with cudaHostAlloc(...,cudaHostAllocDefault), to which data from the device is copied before calling MPI.
> The application, a particle simulation, reproducibly crashed or produced undefined behavior at large particle numbers, and I could not explain why this happened.
> After considerable debugging time (trying two different MPI libraries, MVAPICH2 1.9a and OpenMPI 1.6.1), I discovered OpenMPI's mpi_leave_pinned parameter.
> Setting mpi_leave_pinned to 0 solved my problem; the crash did not occur again. So far, excellent!
> I do have a request, however. After looking at the output of
> $ ompi_info --param mpi all
> I get
> MCA mpi: parameter "mpi_leave_pinned" (current value: <-1>, data source: default
> value)
> Whether to use the "leave pinned" protocol or not. Enabling this
> setting can help bandwidth performance when repeatedly sending and
> receiving large messages with the same buffers over RDMA-based networks
> (0 = do not use "leave pinned" protocol, 1 = use "leave pinned"
> protocol, -1 = allow network to choose at runtime).
> This seems to indicate that, by default, the network adapter chooses whether to enable or disable the "leave pinned" protocol. In my case, this default setting turns out to be disastrous.
> Also, the FAQ is somewhat ambiguous about this parameter: it states in one place that mpi_leave_pinned is off by default, but in another that the default is -1 (as above).
> Can anyone please explain to me the intricacies of this parameter, and what are the ramifications/benefits of having this particular default value?
> Thanks
> Jens