Hi,

 

This question was answered by Yevgeny Kliteynik from Mellanox on the developers list. The amount of registerable memory should be about twice the size of physical memory because of the way physical memory is registered with InfiniBand HCAs, not because of possible overcommitment. You can read the full description here:

 

http://www.open-mpi.org/community/lists/devel/2012/08/11417.php
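
To make the arithmetic concrete, here is a rough sketch of how the registerable memory limit follows from the mlx4_core module parameters, using the formula given in the Open MPI FAQ: max_reg_mem = (2^log_num_mtt) * (2^log_mtts_per_seg) * PAGE_SIZE. The 4 KiB page size and the specific values log_num_mtt=20 and log_mtts_per_seg=3 are assumptions on my part; only their sum (23) is implied by the 32768 MiB figure in the original message, and the function name is just for illustration.

```python
MIB = 1 << 20
PAGE_SIZE = 4096   # bytes; assumed (typical x86-64 page size)
TOTAL_MIB = 48434  # "Total memory" from the warning quoted below


def registerable_bytes(log_num_mtt, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Illustrative helper: max registerable memory per the Open MPI FAQ formula."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


# Assumed current settings: any pair summing to 23 gives the same result.
current_mib = registerable_bytes(20, 3) // MIB   # 32768 MiB
# Raising either parameter by one doubles the limit.
bumped_mib = registerable_bytes(21, 3) // MIB    # 65536 MiB

print(f"current: {current_mib} MiB, ratio to total: {current_mib / TOTAL_MIB:.2f}")
print(f"bumped:  {bumped_mib} MiB, ratio to total: {bumped_mib / TOTAL_MIB:.2f}")
print(f"warning threshold (3/4 of total): {0.75 * TOTAL_MIB:.1f} MiB")
```

Because the limit can only change in powers of two, a single increment moves it from below the 3/4 threshold to above total physical memory, which is the step the original message was worried about.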

 

Kind regards,

Hristo

--

Hristo Iliev, Ph.D. -- High Performance Computing

RWTH Aachen University, Center for Computing and Communication

Rechen- und Kommunikationszentrum der RWTH Aachen

Seffenter Weg 23,  D 52074  Aachen (Germany)

 

From: users-bounces@open-mpi.org [mailto:users-bounces@open-mpi.org] On Behalf Of Alan Wild
Sent: Thursday, October 18, 2012 5:47 AM
To: users@open-mpi.org
Subject: [OMPI users] openmpi-1.6.2 and registerable memory

 

I recently installed 1.6.2 on our cluster only to be introduced to the new warning messages concerning registerable memory and physical memory.  OpenMPI is indicating:

 

  Registerable memory:     32768 MiB
  Total memory:            48434 MiB

This is clearly less than the "3/4 of total memory" threshold that produces the warning.  However, our systems 1) have swap completely disabled and 2) have the Linux kernel's VM behavior set to disable overcommit (i.e.  /proc/sys/vm/overcommit_memory == 2).  So I'm not sure the guidance of setting registerable memory to twice physical memory makes sense.  Worse still, I don't believe I can increase log_num_mtt (or log_mtts_per_seg), as any increase in these values would cause registerable memory to double (and exceed total memory).

 

OR... am I misunderstanding the situation?  (Maybe it would be okay to have more registerable memory if the drivers will properly handle the failed malloc once they try to allocate memory beyond the physical memory.)

 

So, in light of our VM and swap settings, would it still be appropriate to increase log_num_mtt?  If not, can we at least get a setting to suppress the warning message, or can the 3/4 threshold be lowered slightly (perhaps to 67% of total memory)?

 

Changing the VM or swap behavior is probably out of the question on our systems.  Our system stability improved dramatically when we moved to these settings, because with the Linux defaults our systems would never OOM properly.

 

-Alan

--
alan@madllama.net http://humbleville.blogspot.com