
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] [slurm-dev] slurm-dev Memory accounting issues with mpirun (was Re: Open-MPI build of NAMD launched from srun over 20% slower than with mpirun)
From: Christopher Samuel (samuel_at_[hidden])
Date: 2013-08-07 02:55:56



On 07/08/13 16:19, Christopher Samuel wrote:

> Anyone seen anything similar, or any ideas on what could be going
> on?

Sorry, this was with:

# ACCOUNTING
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=30

Since those initial tests we've started enforcing memory limits (the
system is not yet in full production) and found that this causes jobs
to get killed.

We tried the cgroups gathering method, but jobs still die with mpirun,
and now the numbers don't seem right for either mpirun or srun:
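(For reference, the cgroup-based accounting setup we tried was along
these lines; parameter names are as documented in slurm.conf(5) and
cgroup.conf(5), and the exact values here are illustrative:)

# slurm.conf
JobAcctGatherType=jobacct_gather/cgroup
JobAcctGatherFrequency=30
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup

# cgroup.conf
ConstrainRAMSpace=yes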

mpirun (killed):

[samuel_at_barcoo-test Mem]$ sacct -j 94564 -o JobID,MaxRSS,MaxVMSize
       JobID     MaxRSS  MaxVMSize
------------ ---------- ----------
94564
94564.batch    -523362K          0
94564.0         394525K          0

srun:

[samuel_at_barcoo-test Mem]$ sacct -j 94565 -o JobID,MaxRSS,MaxVMSize
       JobID     MaxRSS  MaxVMSize
------------ ---------- ----------
94565
94565.batch        998K          0
94565.0          88663K          0

All the best,
Chris
--
 Christopher Samuel Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: samuel_at_[hidden] Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/ http://twitter.com/vlsci
