We have had some recent experience with this in an Open MPI 1.4.x
version and thought it would be useful to contribute to the discussion.
Please see below.
Jeff Squyres wrote:
> On Nov 29, 2010, at 6:25 PM, George Bosilca wrote:
>> The main problem is that openib require to pin memory pages in order to take advantage of RMA features. There is a major issues with these pinned pages and fork, leading to segmentation fault in some specific cases. However, we only pin the pages on the MPI calls related to data transfers. Therefore, if you call fork __before__ any other MPI data transfer function (but after MPI_Init as you use the process rank), your application should be safe.
> Note that Open MPI also pins some internal memory during MPI_INIT, but that memory is totally internal to libmpi, so you should be safe (i.e., you should never be able to find it and therefore never be able to try to touch it).
This is what we believe happened in our testing:
1. MPI_Init allocated and pinned down some memory. This memory was
64-byte aligned, not page-aligned to 4096 bytes. So an allocation that
ideally should have resulted in 2 pages being pinned actually had 3
pages pinned, with lots of unused memory on the 3rd page.
2. A child process created via popen tried to allocate some memory
(perhaps a byproduct of popen execution itself) and was given memory
on that last page with lots of unused memory. When the child tried to
touch the allocation, there was a seg fault.
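
To make the page arithmetic in step 1 concrete, here is a small C sketch.
The helper name pages_spanned is hypothetical (it is not part of Open MPI
or OFED), and the addresses are made up for illustration; it only counts
how many pages a buffer touches given its start address and length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pages touched by the byte range
 * [addr, addr + len). Every page in that range would be pinned if
 * the range were registered.
 */
static size_t pages_spanned(uintptr_t addr, size_t len, size_t page_size)
{
    assert(page_size > 0);
    if (len == 0)
        return 0;

    uintptr_t first = addr / page_size;             /* page of first byte */
    uintptr_t last  = (addr + len - 1) / page_size; /* page of last byte  */
    return (size_t)(last - first + 1);
}
```

With a 4096-byte page, pages_spanned(0x10000, 8192, 4096) is 2, while
pages_spanned(0x10000 + 64, 8192, 4096) is 3: the same 8192 bytes start
64 bytes into a page and spill onto a third page that is mostly unused.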
We could reduce the probability of this happening by changing the
alignment of MPI allocations to 4096 bytes. But since MPI allocations
are not sized to be a multiple of the page size, this isn't a foolproof
method.
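
For illustration only, this is roughly what "page-aligned and sized to
a multiple of the page size" would look like in C. This is not Open
MPI's allocator; round_up_to_page and alloc_whole_pages are hypothetical
names, and the sketch assumes a POSIX system with posix_memalign and
sysconf(_SC_PAGESIZE). It does not address the fork issue in general; it
only shows how to keep other allocations off the pages the buffer
occupies.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Round len up to the next multiple of page (page must be a power of two). */
static size_t round_up_to_page(size_t len, size_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (len + page - 1) & ~(page - 1);
}

/*
 * Hypothetical helper: allocate a buffer that starts on a page boundary
 * and whose length is padded to whole pages, so the pages it occupies
 * hold nothing else. Returns NULL on failure or when len is 0.
 * On success the padded length is stored in *padded_len (if non-NULL).
 * Release the buffer with free().
 */
static void *alloc_whole_pages(size_t len, size_t *padded_len)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return NULL;

    size_t page = (size_t)ps;
    size_t padded = round_up_to_page(len, page);
    void *buf = NULL;

    if (padded == 0 || posix_memalign(&buf, page, padded) != 0)
        return NULL;

    if (padded_len != NULL)
        *padded_len = padded;
    return buf;
}
```

The cost is wasted memory for small allocations (a 100-byte request
consumes a full page), which is presumably one reason alignment alone
was the change considered above.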
One way (admittedly not ideal) to avoid the potential seg fault is to
set the MCA parameter btl_openib_want_fork_support = 0. But then you are
"trusting" any child processes not to touch, intentionally or as a
result of a bug, the memory regions that have been registered/pinned by the
>>> How can one be sure that the disabling the warning is ok? Could you please elaborate on what makes forks vulnerable? May be that will guide the developers to make an informed decision on whether to disable them or find another alternative.
>> No way to know at 100%. Now for an elaborate answer: Once upon a time ... The fork story is a long and boring one, we would all have preferred to never heard about it (believe me). A quick and compressed version can be found on the QLogic download page (http://filedownloads.qlogic.com/files/driver/70277/release_QLogicIB-Basic_4400_Rev_A.html).
> That's a good summary. The issue is with OFED itself, not with Open MPI.
> Note, too, that calling popen() should also be safe (even though we'll warn about it -- our atfork hook has no way of knowing whether you're calling system, popen, or something else).
Mercury Computer Systems, Inc. (http://www.mc.com)