Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Still bothered / cannot run an application
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-07-13 20:02:49

On Jul 12, 2012, at 12:04 PM, Paul Kapinos wrote:

> a long time ago, I reported about an error in Open MPI:
> Well, in 1.6 the behaviour has changed: the test case doesn't hang forever and block an InfiniBand interface, but seems to run through, and now this error message is printed:
> --------------------------------------------------------------------------
> The OpenFabrics (openib) BTL failed to register memory in the driver.
> Please check /var/log/messages or dmesg for driver specific failure
> reason.

We updated our mechanism, but accidentally left this warning message in (it has since been removed).

Here's what's happening: Mellanox changed the default amount of registered memory that is available -- they dramatically reduced it. We haven't gotten a good answer yet as to *why* this change was made.

You can change some kernel-level parameters to increase it again, and then OMPI should work fine. Here's an IBM article about it:

And here are some comments that Mellanox made on a ticket about this issue (including some corrections/clarifications to that IBM article):
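
For a rough feel of what those kernel-level parameters control, here is a small Python sketch. It assumes the parameters in question are the mlx4_core module's log_num_mtt and log_mtts_per_seg (the mail above does not name them) and that the limit follows the commonly cited formula (2^log_num_mtt) * (2^log_mtts_per_seg) * page size. The function names and the 4096-byte page size are illustrative assumptions, so check the values on your own nodes.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; verify with `getconf PAGE_SIZE`


def max_registerable_bytes(log_num_mtt, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Upper bound on registerable memory implied by the two module parameters."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


def min_log_num_mtt(target_bytes, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Smallest log_num_mtt whose limit is at least target_bytes."""
    bytes_per_entry = (1 << log_mtts_per_seg) * page_size
    entries = -(-target_bytes // bytes_per_entry)  # ceiling division
    return max(entries - 1, 0).bit_length()
```

Under these assumptions, log_num_mtt=20 with log_mtts_per_seg=3 allows 32 GiB to be registered, and covering 128 GiB with the same log_mtts_per_seg would need log_num_mtt=22.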


Basically, what's happening is that OMPI is behaving badly when it runs out of registered memory. We have tried two things to make this better (i.e., still perform *correctly*, albeit at a lower performance level), and we're not sure yet whether they work properly.

1. When OMPI tries to register more memory for an RDMA message transaction and fails, it falls back to send-receive (where we already have pre-registered memory available to use). However, this can still end up hanging because of OMPI's "lazy connection" scheme -- where OMPI doesn't open IB connections between MPI processes until the first time each pair of processes communicates. So if OMPI runs out of registered memory and then tries to open a new IB connection to a new peer -- kaboom.

2. When OMPI starts, it guesstimates how much memory can be registered and divides it equally among all the OMPI processes *in that job* on the same node. We've had mixed reports on whether this works. I made a 1.6.x tarball with this fix in it, if you could give it a whirl (with the default low registered memory kernel parameters, to ensure that you can invoke the "out of registered memory" issue):
    Use the openmpi-1.6.1ticket3131r26612M.tar.bz2 tarball
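
To make the two approaches above concrete, here is a toy Python model. It is only a sketch of the idea, not OMPI's actual code: every name in it (per_process_budget, RegistrationBudget, choose_protocol) is hypothetical, and real registration goes through the verbs layer rather than a simple byte counter.

```python
def per_process_budget(registerable_bytes, local_procs):
    """Split the node-wide estimate equally among this job's local processes."""
    if local_procs < 1:
        raise ValueError("local_procs must be at least 1")
    return registerable_bytes // local_procs


class RegistrationBudget:
    """Tracks how many bytes one process has registered against its share."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used = 0

    def try_register(self, nbytes):
        # Refuse instead of failing later inside the driver.
        if self.used + nbytes > self.limit_bytes:
            return False
        self.used += nbytes
        return True


def choose_protocol(budget, message_bytes):
    """Use RDMA if the buffer can be registered, else fall back to send/recv."""
    return "rdma" if budget.try_register(message_bytes) else "send/recv"
```

For example, a 32 GiB node-wide estimate shared by 16 local processes gives each one a 2 GiB share. Note that this model says nothing about the lazy-connection problem from item 1; it only shows the budget split and the fallback decision.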

#2 is the latest attempt to fix it, but we haven't had good testing of it. Could you give it a whirl and let us know what happens?

Jeff Squyres