Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] Reviewing MPI_Dims_create
From: Andreas Schäfer (gentryx_at_[hidden])
Date: 2014-02-11 00:24:56


OK, so we were thinking the same thing :-) The optimization you
mention below is the same I've used in my updated patch.

On 02:29 Tue 11 Feb , Christoph Niethammer wrote:
> As mentioned in my former mail I did not touch the factorization
> code.
> But to figure out if a number n is *not* a prime number it is sufficient to check up to \sqrt(n).
> Proof:
> let n = p*q with q > \sqrt{n}
> --> p < \sqrt(n)
> So we have already found factor p before reaching \sqrt(n) and by this n is no prime any more and no need for further checks. ;)
>
>
> The mentioned factorization may indeed include one factor which is larger than \sqrt(n). :)
>
> Proof that at least one prime factor can be larger than \sqrt(n) example:
> 6 = 2*3
> \sqrt(6) = 2.4494897427832... < 3 Q.E.D.
>
>
> Proof that no more than one factor can be larger than \sqrt(n):
> let n = \prod_{i=0}^K p_i with p_i \in N and K > 2
> and assume w.l.o.g. p_0 > \sqrt(n) and p_1 > \sqrt(n)
> --> 1 > \prod_{i=2}^K p_i
> which is a contradiction as all p_i \in N. Q.E.D.
>
>
> So your idea is still applicable with not much effort and we only need prime factors up to sqrt(n) in the factorizer code for an additional optimization. :)
>
> First search all K' factors p_i < \sqrt(n). If then n \ne \prod_{i=0}^{K'} p_i we should be sure that p_{K'+1} = n / \prod_{i=0}^{K'} p_i is a prime. No complication with counts IMHO. I leave this without patch as it is already 2:30 in the morning. :P
>
> Regards
> Christoph
>
> --
>
> Christoph Niethammer
> High Performance Computing Center Stuttgart (HLRS)
> Nobelstrasse 19
> 70569 Stuttgart
>
> Tel: ++49(0)711-685-87203
> email: niethammer_at_[hidden]
> http://www.hlrs.de/people/niethammer
>
> ----- Ursprüngliche Mail -----
> Von: "Andreas Schäfer" <gentryx_at_[hidden]>
> An: "Open MPI Developers" <devel_at_[hidden]>
> Gesendet: Montag, 10. Februar 2014 23:24:24
> Betreff: Re: [OMPI devel] Reviewing MPI_Dims_create
>
> Christoph-
>
> your patch has the same problem as my original patch: indeed there may
> be a prime factor p of n with p > sqrt(n). What's important is that
> there may only be at most one. I've submitted an updated patch (see my
> previous mail) which catches this special case.
>
> Best
> -Andreas
>
>
> On 19:30 Mon 10 Feb , Christoph Niethammer wrote:
> > Hello,
> >
> > I noticed some effort in improving the scalability of
> > MPI_Dims_create(int nnodes, int ndims, int dims[])
> > Unfortunately there were some issues with the first attempt (r30539 and r30540) which were reverted.
> >
> > So I decided to give it a short review based on r30606
> > https://svn.open-mpi.org/trac/ompi/browser/trunk/ompi/mpi/c/dims_create.c?rev=30606
> >
> >
> > 1.) freeprocs is initialized to be nnodes and the subsequent divisions of freeprocs have all positive integers as divisor.
> > So IMHO it would make more sense to check if nnodes > 0 in the MPI_PARAM_CHECK section at the begin instead of the following (see patch 0001):
> >
> > 99 if (freeprocs < 1) {
> > 100 return OMPI_ERRHANDLER_INVOKE(MPI_COMM_WORLD, MPI_ERR_DIMS,
> > 101 FUNC_NAME);
> > 102 }
> >
> >
> > 2.) I rewrote the algorithm stopping at sqrt(n) in getprimes(int num, int *nprimes, int **pprimes)
> > which makes mathematically more sens (as the largest prime factor of any number n cannot exceed \sqrt{n}) - and should produce the right result. ;)
> > (see patch 0002)
> > Here the improvements:
> >
> > module load mpi/openmpi/trunk-gnu.4.7.3
> > $ ./mpi-dims-old 1000000
> > time used for MPI_Dims_create(1000000, 3, {}): 8.104007
> > module swap mpi/openmpi/trunk-gnu.4.7.3 mpi/openmpi/trunk-gnu.4.7.3-testing
> > $ ./mpi-dims-new 1000000
> > time used for MPI_Dims_create(1000000, 3, {}): 0.060400
> >
> >
> > 3.) Memory allocation for the list of prime numbers may be reduced up to a factor of ~6 for 1mio nodes using the result from Dusart 1999 [1]:
> > \pi(x) < x/ln(x)(1+1.2762/ln(x)) for x > 1
> > Unfortunately this saves us only 1.6 MB per process for 1mio nodes as reported by tcmalloc/pprof on a test program - but it may sum up with fatter nodes. :P
> >
> > $ pprof --base=$PWD/primes-old.0001.heap a.out primes-new.0002.heap
> > (pprof) top
> > Total: -1.6 MB
> > 0.3 -18.8% -18.8% 0.3 -18.8% getprimes2
> > 0.0 -0.0% -18.8% -1.6 100.0% __libc_start_main
> > 0.0 -0.0% -18.8% -1.6 100.0% main
> > -1.9 118.8% 100.0% -1.9 118.8% getprimes
> >
> > Find attached patch for it in 0003.
> >
> >
> > If there are no issues I would like to commit this to trunk for further testing (+cmr for 1.7.5?) end of this week.
> >
> > Best regards
> > Christoph
> >
> > [1] http://www.ams.org/journals/mcom/1999-68-225/S0025-5718-99-01037-6/home.html
> >
> >
> >
> > --
> >
> > Christoph Niethammer
> > High Performance Computing Center Stuttgart (HLRS)
> > Nobelstrasse 19
> > 70569 Stuttgart
> >
> > Tel: ++49(0)711-685-87203
> > email: niethammer_at_[hidden]
> > http://www.hlrs.de/people/niethammer
>
> > From e3292b90cac42fad80ed27a555419002ed61ab66 Mon Sep 17 00:00:00 2001
> > From: Christoph Niethammer <christoph.niethammer_at_[hidden]>
> > Date: Mon, 10 Feb 2014 16:44:03 +0100
> > Subject: [PATCH 1/3] Move parameter check into appropriate code section at the
> > begin.
> >
> > ---
> > ompi/mpi/c/dims_create.c | 11 ++++++-----
> > 1 file changed, 6 insertions(+), 5 deletions(-)
> >
> > diff --git a/ompi/mpi/c/dims_create.c b/ompi/mpi/c/dims_create.c
> > index d2c3858..3d0792f 100644
> > --- a/ompi/mpi/c/dims_create.c
> > +++ b/ompi/mpi/c/dims_create.c
> > @@ -71,6 +71,11 @@ int MPI_Dims_create(int nnodes, int ndims, int dims[])
> > return OMPI_ERRHANDLER_INVOKE (MPI_COMM_WORLD,
> > MPI_ERR_DIMS, FUNC_NAME);
> > }
> > +
> > + if (1 > nnodes) {
> > + return OMPI_ERRHANDLER_INVOKE (MPI_COMM_WORLD,
> > + MPI_ERR_DIMS, FUNC_NAME);
> > + }
> > }
> >
> > /* Get # of free-to-be-assigned processes and # of free dimensions */
> > @@ -95,11 +100,7 @@ int MPI_Dims_create(int nnodes, int ndims, int dims[])
> > FUNC_NAME);
> > }
> >
> > - if (freeprocs < 1) {
> > - return OMPI_ERRHANDLER_INVOKE(MPI_COMM_WORLD, MPI_ERR_DIMS,
> > - FUNC_NAME);
> > - }
> > - else if (freeprocs == 1) {
> > + if (freeprocs == 1) {
> > for (i = 0; i < ndims; ++i, ++dims) {
> > if (*dims == 0) {
> > *dims = 1;
> > --
> > 1.8.3.2
> >
>
> > From bc862c47ef8d581a8f6735c51983d6c9eeb95dfd Mon Sep 17 00:00:00 2001
> > From: Christoph Niethammer <christoph.niethammer_at_[hidden]>
> > Date: Mon, 10 Feb 2014 18:50:51 +0100
> > Subject: [PATCH 2/3] Speeding up detection of prime numbers using the fact
> > that the largest prime factor of any number n cannot exceed \sqrt{n}.
> >
> > ---
> > ompi/mpi/c/dims_create.c | 9 ++++++---
> > 1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/ompi/mpi/c/dims_create.c b/ompi/mpi/c/dims_create.c
> > index 3d0792f..1c1c381 100644
> > --- a/ompi/mpi/c/dims_create.c
> > +++ b/ompi/mpi/c/dims_create.c
> > @@ -5,7 +5,7 @@
> > * Copyright (c) 2004-2005 The University of Tennessee and The University
> > * of Tennessee Research Foundation. All rights
> > * reserved.
> > - * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
> > + * Copyright (c) 2004-2014 High Performance Computing Center Stuttgart,
> > * University of Stuttgart. All rights reserved.
> > * Copyright (c) 2004-2005 The Regents of the University of California.
> > * All rights reserved.
> > @@ -20,6 +20,8 @@
> >
> > #include "ompi_config.h"
> >
> > +#include <math.h>
> > +
> > #include "ompi/mpi/c/bindings.h"
> > #include "ompi/runtime/params.h"
> > #include "ompi/communicator/communicator.h"
> > @@ -293,13 +295,14 @@ getprimes(int num, int *pnprime, int **pprimes) {
> > primes[i++] = 2;
> >
> > for (n = 3; n <= num; n += 2) {
> > - for (j = 1; j < i; ++j) {
> > + double nsqrt = sqrt((double) n);
> > + for(j = 0; (double) primes[j] <= nsqrt; j++) {
> > if ((n % primes[j]) == 0) {
> > break;
> > }
> > }
> >
> > - if (j == i) {
> > + if (primes[j] > nsqrt) {
> > if (i >= size) {
> > return MPI_ERR_DIMS;
> > }
> > --
> > 1.8.3.2
> >
>
> > From a012206cfbf7b8b22cef4cc5b811384e5e3ac32c Mon Sep 17 00:00:00 2001
> > From: Christoph Niethammer <christoph.niethammer_at_[hidden]>
> > Date: Mon, 10 Feb 2014 19:02:03 +0100
> > Subject: [PATCH 3/3] Reduce memory usage by a better approximation for the
> > upper limit of the number of primes.
> >
> > ---
> > ompi/mpi/c/dims_create.c | 12 ++++++++++--
> > 1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/ompi/mpi/c/dims_create.c b/ompi/mpi/c/dims_create.c
> > index 1c1c381..8dd3144 100644
> > --- a/ompi/mpi/c/dims_create.c
> > +++ b/ompi/mpi/c/dims_create.c
> > @@ -281,9 +281,17 @@ getprimes(int num, int *pnprime, int **pprimes) {
> > int n;
> > int size;
> > int *primes;
> > + double lognum;
> >
> > - /* Allocate the array of primes */
> > - size = (num / 2) + 1;
> > + /* Allocate the array of primes
> > + * see Pierre Dusart, Math. Comp. 68 (1999), no. 225, 411???415.*/
> > + lognum = log(num);
> > + if(num > 1) {
> > + size = ceil(num/lognum * (1+1.2762/lognum));
> > + }
> > + else {
> > + size = 1;
> > + }
> > primes = (int *) malloc((unsigned) size * sizeof(int));
> > if (NULL == primes) {
> > return MPI_ERR_NO_MEM;
> > --
> > 1.8.3.2
> >
>
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> ==========================================================
> Andreas Schäfer
> HPC and Grid Computing
> Chair of Computer Science 3
> Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
> +49 9131 85-27910
> PGP/GPG key via keyserver
> http://www.libgeodecomp.org
> ==========================================================
>
> (\___/)
> (+'.'+)
> (")_(")
> This is Bunny. Copy and paste Bunny into your
> signature to help him gain world domination!
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
==========================================================
Andreas Schäfer
HPC and Grid Computing
Chair of Computer Science 3
Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
+49 9131 85-27910
PGP/GPG key via keyserver
http://www.libgeodecomp.org
==========================================================
(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your
signature to help him gain world domination!