
Subject: [OMPI devel] Still bothered / cannot run an application
From: Paul Kapinos (kapinos_at_[hidden])
Date: 2012-07-12 12:04:07


(cross-posted to the 'users' and 'devel' mailing lists)

Dear Open MPI developers,
a while ago I reported an error in Open MPI:
http://www.open-mpi.org/community/lists/users/2012/02/18565.php

Well, in 1.6 the behaviour has changed: the test case no longer hangs forever
and blocks an InfiniBand interface, but seems to run through, and now this
error message is printed:
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to register memory in the driver.
Please check /var/log/messages or dmesg for driver specific failure
reason.
The failure occurred here:

   Local host:
   Device:       mlx4_0
   Function:     openib_reg_mr()
   Errno says:   Cannot allocate memory

You may need to consult with your system administrator to get this
problem fixed.
--------------------------------------------------------------------------
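
(For what it's worth: as far as I can see, openib_reg_mr is a thin wrapper
around the verbs call ibv_reg_mr(). Below is a minimal sketch of mine, not
part of our test case, which provokes the same "Cannot allocate memory" errno
when the driver refuses to register a large region; the 1 GiB size is made up:)
--------------------------------------------------------------------------
/* Sketch: register one large buffer with ibv_reg_mr(), the verbs call
 * behind the openib BTL's registration. Build with: gcc reg.c -libverbs */
#include <infiniband/verbs.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

int main(void)
{
    int num;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) { fprintf(stderr, "no IB devices\n"); return 1; }

    struct ibv_context *ctx = ibv_open_device(devs[0]);   /* e.g. mlx4_0 */
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    size_t len = 1UL << 30;            /* 1 GiB; size is an assumption */
    void *buf = malloc(len);

    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);
    if (!mr)      /* on failure errno is set, e.g. ENOMEM => the above text */
        fprintf(stderr, "ibv_reg_mr failed: %s\n", strerror(errno));
    else
        ibv_dereg_mr(mr);

    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    free(buf);
    return 0;
}
--------------------------------------------------------------------------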

Looking into the FAQ
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
gives no hint about what is wrong. The locked memory is unlimited:
--------------------------------------------------------------------------
pk224850_at_linuxbdc02:~[502]$ cat /etc/security/limits.conf | grep memlock
# - memlock - max locked-in-memory address space (KB)
* hard memlock unlimited
* soft memlock unlimited
--------------------------------------------------------------------------
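
(Note that limits.conf only sets the policy; the limit the MPI processes
actually inherit, e.g. when they are started through a daemon, could in
principle be lower. A small sketch to print the effective limit from inside
a process; it can be run under mpiexec to see what the ranks really get:)
--------------------------------------------------------------------------
/* Print the effective locked-memory limit as seen by this process.
 * Build: gcc checklimit.c -o checklimit
 * Run:   mpiexec -np 2 ./checklimit */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("RLIMIT_MEMLOCK: unlimited\n");
    else
        printf("RLIMIT_MEMLOCK: %llu bytes\n",
               (unsigned long long)rl.rlim_cur);
    return 0;
}
--------------------------------------------------------------------------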

Could it still be an Open MPI issue? Are you interested in reproducing this?

Best,
Paul Kapinos

P.S.: The same test with Intel MPI cannot run using DAPL, but runs fine over
'ofa' (= native verbs, as Open MPI uses it). So I believe the problem is rooted
in the communication pattern of the program: it sends very LARGE messages to a
lot of/all other processes. (The program performs a matrix transposition of a
distributed matrix.)
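
(If it helps for reproducing: the pattern is roughly the following, every rank
exchanging a large block with every other rank. This is a sketch of mine, not
the actual program; the 64 MiB per-peer block size is an assumption:)
--------------------------------------------------------------------------
/* Rough sketch of the communication pattern described above (NOT the
 * actual test case): every rank exchanges a large block with every
 * other rank, as a distributed matrix transposition does.
 * Build: mpicc alltoall.c -o alltoall && mpiexec -np 8 ./alltoall */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* 64 MiB of doubles per peer: large messages to all other ranks.
     * Buffer contents are irrelevant for the communication pattern. */
    size_t per_peer = (64UL << 20) / sizeof(double);
    double *sendbuf = calloc(per_peer * size, sizeof(double));
    double *recvbuf = calloc(per_peer * size, sizeof(double));

    MPI_Alltoall(sendbuf, (int)per_peer, MPI_DOUBLE,
                 recvbuf, (int)per_peer, MPI_DOUBLE,
                 MPI_COMM_WORLD);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
--------------------------------------------------------------------------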

-- 
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915