
Open MPI Development Mailing List Archives


From: Rolf vandeVaart (Rolf.Vandevaart_at_[hidden])
Date: 2007-08-27 15:10:39

We are hitting a problem on one of our larger SMPs with the latest
Open MPI v1.2 branch. We are trying to run a job with np=128 within
a single node, and we see the following error:

"SM failed to send message due to shortage of shared memory."

We then increased the allowable maximum size of the shared segment to
2 GB - 1 (2147483647 bytes), which is the maximum allowed for a 32-bit
application. We used the MCA parameter to increase it, as shown here:

-mca mpool_sm_max_size 2147483647

This allowed the program to run to completion. Therefore, we would
like to increase the default maximum from 512 MB to 2 GB - 1.
Does anyone have an objection to this change? Soon we are going to
have larger CPU counts and would like to increase the odds that things
work "out of the box" on these large SMPs.
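
As a quick sanity check on the arithmetic, the value 2147483647 passed to mpool_sm_max_size is 2 GB - 1, i.e. 2^31 - 1. The sketch below assumes the 32-bit ceiling comes from a signed 32-bit size; that is an assumption for illustration, not something confirmed here, and the helper name is hypothetical.

```python
# Sanity check: 2147483647 bytes is 2 GB - 1, i.e. 2**31 - 1.
# Assumption (not confirmed by the message): the 32-bit ceiling comes
# from storing the segment size in a signed 32-bit integer.

GIB = 1024 ** 3          # bytes in one gigabyte (binary)
INT32_MAX = 2 ** 31 - 1  # largest signed 32-bit value


def fits_in_signed_32bit(nbytes: int) -> bool:
    """Hypothetical helper: True if nbytes fits in a signed 32-bit integer."""
    return 0 <= nbytes <= INT32_MAX


requested = 2147483647  # value given to mpool_sm_max_size in the message

print(requested == 2 * GIB - 1)          # is it exactly 2 GB - 1?
print(fits_in_signed_32bit(requested))   # still representable
print(fits_in_signed_32bit(2 * GIB))     # one byte more is not
```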

On a side note, I did a quick comparison of the shared memory needs of
the old Sun ClusterTools to Open MPI and came up with this table.

                              Open MPI
 np  Sun ClusterTools 6  current  suggested
  2                 20M     128M       128M
  4                 20M     128M       128M
  8                 22M     256M       256M
 16                 27M     512M       512M
 32                 48M     512M         1G
 64                133M     512M       2G-1
128                476M     512M       2G-1
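
The Open MPI columns in the table are consistent with a simple clamping rule: a fixed allotment per process, bounded below by a minimum and above by the configured maximum. The sketch below reproduces both columns under that reading. The 32 MB per-process figure and the 128 MB floor are inferred from the table itself, not taken from the Open MPI source, and the function and constant names are hypothetical.

```python
# Sketch: reproduce the "current" and "suggested" Open MPI columns.
# Assumptions inferred from the table (not from Open MPI source code):
#   - each process contributes 32 MB to the shared segment
#   - the segment is never smaller than 128 MB
#   - the segment is capped at the configured maximum

MIB = 1024 ** 2
GIB = 1024 ** 3

PER_PROC = 32 * MIB        # assumed per-process allotment
MIN_SIZE = 128 * MIB       # assumed lower bound
CURRENT_MAX = 512 * MIB    # current default maximum (from the message)
SUGGESTED_MAX = 2 * GIB - 1  # proposed default maximum (from the message)


def segment_size(nprocs: int, max_size: int) -> int:
    """Clamp nprocs * PER_PROC into the range [MIN_SIZE, max_size]."""
    return max(MIN_SIZE, min(max_size, nprocs * PER_PROC))


for nprocs in (2, 4, 8, 16, 32, 64, 128):
    current = segment_size(nprocs, CURRENT_MAX)
    suggested = segment_size(nprocs, SUGGESTED_MAX)
    print(f"{nprocs:>3}  {current:>10}  {suggested:>10}")
```

Under these assumptions the current 512 MB cap is reached at np=16, which fits the observation that the np=128 job ran out of shared memory until the maximum was raised.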