# Open MPI Development Mailing List Archives

 |   Home   |   Support   |   FAQ   |   all Development mailing list
 Subject: Re: [OMPI devel] Reviewing MPI_Dims_create From: Christoph Niethammer (niethammer_at_[hidden]) Date: 2014-02-10 20:29:47 Next message: Jeff Squyres (jsquyres): "[OMPI devel] oshmem test suite" Previous message: George Bosilca: "Re: [OMPI devel] RFC: optimize probe in ob1" In reply to: Andreas Schäfer: "Re: [OMPI devel] Reviewing MPI_Dims_create" Next in thread: Andreas Schäfer: "Re: [OMPI devel] Reviewing MPI_Dims_create" Reply: Andreas Schäfer: "Re: [OMPI devel] Reviewing MPI_Dims_create" Hi Andreas, As mentioned in my former mail I did not touch the factorization code. But to figure out if a number n is *not* a prime number it is sufficient to check up to \sqrt(n). Proof: let n = p*q with q > \sqrt{n} --> p < \sqrt(n) So we have already found factor p before reaching \sqrt(n) and by this n is no prime any more and no need for further checks. ;) The mentioned factorization may indeed include one factor which is larger than \sqrt(n). :) Proof that at least one prime factor can be larger than \sqrt(n) example: 6 = 2*3 \sqrt(6) = 2.4494897427832... < 3 Q.E.D. Proof that no more than one factor can be larger than \sqrt(n): let n = \prod_{i=0}^K p_i with p_i \in N and K > 2 and assume w.l.o.g. p_0 > \sqrt(n) and p_1 > \sqrt(n) --> 1 > \prod_{i=2}^K p_i which is a contradiction as all p_i \in N. Q.E.D. So your idea is still applicable with not much effort and we only need prime factors up to sqrt(n) in the factorizer code for an additional optimization. :) First search all K' factors p_i < \sqrt(n). If then n \ne \prod_{i=0}^{K'} p_i we should be sure that p_{K'+1} = n / \prod_{i=0}^{K'} p_i is a prime. No complication with counts IMHO. I leave this without patch as it is already 2:30 in the morning. :P Regards Christoph -- Christoph Niethammer High Performance Computing Center Stuttgart (HLRS) Nobelstrasse 19 70569 Stuttgart Tel: ++49(0)711-685-87203 email: niethammer_at_[hidden] http://www.hlrs.de/people/niethammer ----- UrsprÃ¼ngliche Mail ----- Von: "Andreas SchÃ¤fer" An: "Open MPI Developers" Gesendet: Montag, 10. Februar 2014 23:24:24 Betreff: Re: [OMPI devel] Reviewing MPI_Dims_create Christoph- your patch has the same problem as my original patch: indeed there may be a prime factor p of n with p > sqrt(n). What's important is that there may only be at most one. I've submitted an updated patch (see my previous mail) which catches this special case. Best -Andreas On 19:30 Mon 10 Feb , Christoph Niethammer wrote: > Hello, > > I noticed some effort in improving the scalability of > MPI_Dims_create(int nnodes, int ndims, int dims[]) > Unfortunately there were some issues with the first attempt (r30539 and r30540) which were reverted. > > So I decided to give it a short review based on r30606 > https://svn.open-mpi.org/trac/ompi/browser/trunk/ompi/mpi/c/dims_create.c?rev=30606 > > > 1.) freeprocs is initialized to be nnodes and the subsequent divisions of freeprocs have all positive integers as divisor. > So IMHO it would make more sense to check if nnodes > 0 in the MPI_PARAM_CHECK section at the begin instead of the following (see patch 0001): > > 99 if (freeprocs < 1) { > 100 return OMPI_ERRHANDLER_INVOKE(MPI_COMM_WORLD, MPI_ERR_DIMS, > 101 FUNC_NAME); > 102 } > > > 2.) I rewrote the algorithm stopping at sqrt(n) in getprimes(int num, int *nprimes, int **pprimes) > which makes mathematically more sens (as the largest prime factor of any number n cannot exceed \sqrt{n}) - and should produce the right result. ;) > (see patch 0002) > Here the improvements: > > module load mpi/openmpi/trunk-gnu.4.7.3 > $./mpi-dims-old 1000000 > time used for MPI_Dims_create(1000000, 3, {}): 8.104007 > module swap mpi/openmpi/trunk-gnu.4.7.3 mpi/openmpi/trunk-gnu.4.7.3-testing >$ ./mpi-dims-new 1000000 > time used for MPI_Dims_create(1000000, 3, {}): 0.060400 > > > 3.) Memory allocation for the list of prime numbers may be reduced up to a factor of ~6 for 1mio nodes using the result from Dusart 1999 [1]: > \pi(x) < x/ln(x)(1+1.2762/ln(x)) for x > 1 > Unfortunately this saves us only 1.6 MB per process for 1mio nodes as reported by tcmalloc/pprof on a test program - but it may sum up with fatter nodes. :P > > $pprof --base=$PWD/primes-old.0001.heap a.out primes-new.0002.heap > (pprof) top > Total: -1.6 MB > 0.3 -18.8% -18.8% 0.3 -18.8% getprimes2 > 0.0 -0.0% -18.8% -1.6 100.0% __libc_start_main > 0.0 -0.0% -18.8% -1.6 100.0% main > -1.9 118.8% 100.0% -1.9 118.8% getprimes > > Find attached patch for it in 0003. > > > If there are no issues I would like to commit this to trunk for further testing (+cmr for 1.7.5?) end of this week. > > Best regards > Christoph > > [1] http://www.ams.org/journals/mcom/1999-68-225/S0025-5718-99-01037-6/home.html > > > > -- > > Christoph Niethammer > High Performance Computing Center Stuttgart (HLRS) > Nobelstrasse 19 > 70569 Stuttgart > > Tel: ++49(0)711-685-87203 > email: niethammer_at_[hidden] > http://www.hlrs.de/people/niethammer > From e3292b90cac42fad80ed27a555419002ed61ab66 Mon Sep 17 00:00:00 2001 > From: Christoph Niethammer > Date: Mon, 10 Feb 2014 16:44:03 +0100 > Subject: [PATCH 1/3] Move parameter check into appropriate code section at the > begin. > > --- > ompi/mpi/c/dims_create.c | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/ompi/mpi/c/dims_create.c b/ompi/mpi/c/dims_create.c > index d2c3858..3d0792f 100644 > --- a/ompi/mpi/c/dims_create.c > +++ b/ompi/mpi/c/dims_create.c > @@ -71,6 +71,11 @@ int MPI_Dims_create(int nnodes, int ndims, int dims[]) > return OMPI_ERRHANDLER_INVOKE (MPI_COMM_WORLD, > MPI_ERR_DIMS, FUNC_NAME); > } > + > + if (1 > nnodes) { > + return OMPI_ERRHANDLER_INVOKE (MPI_COMM_WORLD, > + MPI_ERR_DIMS, FUNC_NAME); > + } > } > > /* Get # of free-to-be-assigned processes and # of free dimensions */ > @@ -95,11 +100,7 @@ int MPI_Dims_create(int nnodes, int ndims, int dims[]) > FUNC_NAME); > } > > - if (freeprocs < 1) { > - return OMPI_ERRHANDLER_INVOKE(MPI_COMM_WORLD, MPI_ERR_DIMS, > - FUNC_NAME); > - } > - else if (freeprocs == 1) { > + if (freeprocs == 1) { > for (i = 0; i < ndims; ++i, ++dims) { > if (*dims == 0) { > *dims = 1; > -- > 1.8.3.2 > > From bc862c47ef8d581a8f6735c51983d6c9eeb95dfd Mon Sep 17 00:00:00 2001 > From: Christoph Niethammer > Date: Mon, 10 Feb 2014 18:50:51 +0100 > Subject: [PATCH 2/3] Speeding up detection of prime numbers using the fact > that the largest prime factor of any number n cannot exceed \sqrt{n}. > > --- > ompi/mpi/c/dims_create.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/ompi/mpi/c/dims_create.c b/ompi/mpi/c/dims_create.c > index 3d0792f..1c1c381 100644 > --- a/ompi/mpi/c/dims_create.c > +++ b/ompi/mpi/c/dims_create.c > @@ -5,7 +5,7 @@ > * Copyright (c) 2004-2005 The University of Tennessee and The University > * of Tennessee Research Foundation. All rights > * reserved. > - * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart, > + * Copyright (c) 2004-2014 High Performance Computing Center Stuttgart, > * University of Stuttgart. All rights reserved. > * Copyright (c) 2004-2005 The Regents of the University of California. > * All rights reserved. > @@ -20,6 +20,8 @@ > > #include "ompi_config.h" > > +#include > + > #include "ompi/mpi/c/bindings.h" > #include "ompi/runtime/params.h" > #include "ompi/communicator/communicator.h" > @@ -293,13 +295,14 @@ getprimes(int num, int *pnprime, int **pprimes) { > primes[i++] = 2; > > for (n = 3; n <= num; n += 2) { > - for (j = 1; j < i; ++j) { > + double nsqrt = sqrt((double) n); > + for(j = 0; (double) primes[j] <= nsqrt; j++) { > if ((n % primes[j]) == 0) { > break; > } > } > > - if (j == i) { > + if (primes[j] > nsqrt) { > if (i >= size) { > return MPI_ERR_DIMS; > } > -- > 1.8.3.2 > > From a012206cfbf7b8b22cef4cc5b811384e5e3ac32c Mon Sep 17 00:00:00 2001 > From: Christoph Niethammer > Date: Mon, 10 Feb 2014 19:02:03 +0100 > Subject: [PATCH 3/3] Reduce memory usage by a better approximation for the > upper limit of the number of primes. > > --- > ompi/mpi/c/dims_create.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/ompi/mpi/c/dims_create.c b/ompi/mpi/c/dims_create.c > index 1c1c381..8dd3144 100644 > --- a/ompi/mpi/c/dims_create.c > +++ b/ompi/mpi/c/dims_create.c > @@ -281,9 +281,17 @@ getprimes(int num, int *pnprime, int **pprimes) { > int n; > int size; > int *primes; > + double lognum; > > - /* Allocate the array of primes */ > - size = (num / 2) + 1; > + /* Allocate the array of primes > + * see Pierre Dusart, Math. Comp. 68 (1999), no. 225, 411???415.*/ > + lognum = log(num); > + if(num > 1) { > + size = ceil(num/lognum * (1+1.2762/lognum)); > + } > + else { > + size = 1; > + } > primes = (int *) malloc((unsigned) size * sizeof(int)); > if (NULL == primes) { > return MPI_ERR_NO_MEM; > -- > 1.8.3.2 > > _______________________________________________ > devel mailing list > devel_at_[hidden] > http://www.open-mpi.org/mailman/listinfo.cgi/devel -- ========================================================== Andreas SchÃ¤fer HPC and Grid Computing Chair of Computer Science 3 Friedrich-Alexander-UniversitÃ¤t Erlangen-NÃ¼rnberg, Germany +49 9131 85-27910 PGP/GPG key via keyserver http://www.libgeodecomp.org ========================================================== (\___/) (+'.'+) (")_(") This is Bunny. Copy and paste Bunny into your signature to help him gain world domination! _______________________________________________ devel mailing list devel_at_[hidden] http://www.open-mpi.org/mailman/listinfo.cgi/devel  Next message: Jeff Squyres (jsquyres): "[OMPI devel] oshmem test suite" Previous message: George Bosilca: "Re: [OMPI devel] RFC: optimize probe in ob1" In reply to: Andreas Schäfer: "Re: [OMPI devel] Reviewing MPI_Dims_create" Next in thread: Andreas Schäfer: "Re: [OMPI devel] Reviewing MPI_Dims_create" Reply: Andreas Schäfer: "Re: [OMPI devel] Reviewing MPI_Dims_create"