Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi-1.6.2 and registerable memory
From: Iliev, Hristo (iliev_at_[hidden])
Date: 2012-10-18 04:23:32


Hi,

 

This question was answered by Yevgeny Kliteynik from Mellanox on the
developers list. The amount of registerable memory should be about twice the
size of the physical memory because of the way physical memory is
registered with InfiniBand HCAs, not because of possible overcommitment. You
can read the full description here:

 

http://www.open-mpi.org/community/lists/devel/2012/08/11417.php

 

Kind regards,

Hristo

--
Hristo Iliev, Ph.D. -- High Performance Computing
RWTH Aachen University, Center for Computing and Communication
Rechen- und Kommunikationszentrum der RWTH Aachen
Seffenter Weg 23,  D 52074  Aachen (Germany)
 
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On
Behalf Of Alan Wild
Sent: Thursday, October 18, 2012 5:47 AM
To: users_at_[hidden]
Subject: [OMPI users] openmpi-1.6.2 and registerable memory
 
I recently installed 1.6.2 on our cluster only to be introduced to the new
warning messages concerning registerable memory and physical memory.
OpenMPI is indicating:
 
  Registerable memory:     32768 MiB
  Total memory:            48434 MiB
Which is clearly less than the "3/4 total memory" that produces the warning.
However, our systems 1) have swap completely disabled and 2) have the
Linux kernel's VM behavior set to disable overcommit (i.e.
/proc/sys/vm/overcommit_memory == 2).  So I'm not sure the guidance of
setting registerable memory to twice physical memory makes sense.  Worse
still, I don't believe I can increase log_num_mtt (or log_mtts_per_seg),
as any increase in these values would cause registerable memory to
double (and exceed total memory).
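
The doubling described above can be checked with a small sketch. It uses the
formula given in the Open MPI FAQ for mlx4 hardware, max_reg_mem =
(2^log_num_mtt) * (2^log_mtts_per_seg) * PAGE_SIZE. The values
log_num_mtt=20, log_mtts_per_seg=3 and a 4096-byte page are assumptions
chosen only because they reproduce the 32768 MiB reported above; the actual
module parameters on this cluster are not stated in the message. The function
names are hypothetical.

```python
PAGE_SIZE = 4096  # bytes; assumed x86_64 default, not stated in the message


def registerable_mib(log_num_mtt, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Registerable memory in MiB: 2^log_num_mtt * 2^log_mtts_per_seg * page size."""
    total_bytes = (2 ** log_num_mtt) * (2 ** log_mtts_per_seg) * page_size
    return total_bytes // (1024 * 1024)


def below_warning_threshold(reg_mib, total_mib):
    """True if registerable memory is under 3/4 of total memory (the quoted threshold)."""
    return reg_mib * 4 < total_mib * 3


TOTAL_MIB = 48434  # "Total memory" from the warning above

# Assumed current settings, then log_num_mtt raised by one.
for log_num_mtt in (20, 21):
    reg = registerable_mib(log_num_mtt, 3)
    print(log_num_mtt, reg, below_warning_threshold(reg, TOTAL_MIB))
```

Under these assumptions, raising either exponent by one doubles the result,
so the next available value above 32768 MiB is 65536 MiB, which is larger
than the 48434 MiB of total memory.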
 
OR... am I misunderstanding the situation?  (Maybe it would be okay to have
more registerable memory if the drivers will properly handle the failed
malloc once they try to allocate memory beyond the physical memory.)
 
So, in light of our VM and swap settings, would it still be appropriate to
increase log_num_mtt?  If not, can we at least get a setting to suppress the
warning message, or can the 3/4 threshold be lowered slightly (perhaps to 67%
of total memory)?
 
Changing the vm or swap behavior is probably out of the question on our
systems.  Our system stability improved dramatically when we went to these
settings (over the Linux default) as our systems would never OOM properly.
 
-Alan
-- 
alan_at_[hidden] http://humbleville.blogspot.com



