Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with HPL while using OpenMPI 1.3.3
From: Gus Correa (gus_at_[hidden])
Date: 2009-12-28 13:21:44


Hi Ilya

Did you recompile HPL with OpenMPI, or did you just launch the MPICH2
executable with the OpenMPI mpiexec?
You probably know this, but you cannot mix different MPIs at
compile and run time.
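
One way to check which MPI an executable was built against is to look at
the shared libraries it links to. Here is a minimal sketch, assuming a
Linux system with `ldd` available and a dynamically linked binary. The
function names and the list of library-name hints are my own choices for
illustration, not part of HPL or OpenMPI:

```python
import subprocess

# Substrings that commonly appear in MPI shared-library names.
# This list is an assumption; extend it for your own installation.
MPI_HINTS = ("libmpi", "libmpich", "libopen-rte", "libopen-pal")


def mpi_libs_from_ldd(ldd_output):
    """Return the lines of `ldd` output that mention an MPI-looking library."""
    return [
        line.strip()
        for line in ldd_output.splitlines()
        if any(hint in line for hint in MPI_HINTS)
    ]


def mpi_libs_of(executable):
    """Run `ldd` on an executable and keep only the MPI-looking lines."""
    result = subprocess.run(
        ["ldd", executable], capture_output=True, text=True, check=True
    )
    return mpi_libs_from_ldd(result.stdout)
```

Calling `mpi_libs_of("./xhpl")` and looking at the paths in the result
should show whether the binary resolves to the OpenMPI or the MPICH2
libraries. A statically linked executable will show nothing here, so an
empty result is not proof either way.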

Also, the HPL maximum problem size (N) depends on how much
memory/RAM you have.
If you make N too big, the arrays don't fit in RAM and
you get into memory paging, which is no good for MPI.
How much RAM do you have?
N=17920 would require about 3.2GB of RAM in total (the matrix itself is
about 2.6GB), if I did the math right.
A rule of thumb is maximum N = sqrt(0.8 * RAM_in_bytes / 8).
Have you tried smaller values (above 10000, but below 17920)?
For which N does it start to break?
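
The rule of thumb above is easy to put in a few lines of Python. This is
only a sketch: the function names are made up for illustration,
RAM_in_bytes is taken to be the total memory across all nodes in the run,
and rounding N down to a multiple of NB is an extra convenience I added,
not part of the rule itself:

```python
import math

BYTES_PER_DOUBLE = 8  # HPL stores the matrix in double precision


def hpl_matrix_bytes(n):
    """Memory taken by the N x N matrix of doubles, in bytes."""
    return BYTES_PER_DOUBLE * n * n


def hpl_max_n(total_ram_bytes, fraction=0.8, nb=None):
    """Largest N whose matrix fits in `fraction` of the total RAM.

    If `nb` is given, N is rounded down to a multiple of the block size
    (an optional convenience, not part of the rule of thumb).
    """
    n = math.isqrt(int(fraction * total_ram_bytes) // BYTES_PER_DOUBLE)
    if nb:
        n -= n % nb
    return n
```

For example, `hpl_matrix_bytes(17920)` gives 2569011200 bytes, about 2.6GB,
which is 80% of roughly 3.2GB. Going the other way,
`hpl_max_n(3.2e9, nb=80)` gives 17840.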

The HPL TUNING file may help:
http://www.netlib.org/benchmark/hpl/tuning.html

Good luck.

My two cents,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

ilya zelenchuk wrote:
> Good day, everyone!
>
> I have a problem running the HPL benchmark with OpenMPI 1.3.3.
> When the problem size (Ns) is smaller than 10000, all is good. But when I set Ns
> to 17920 (for example), I get errors:
>
> ===
> [ums1:05086] ../../ompi/datatype/datatype_pack.h:37
> Pointer 0xb27752c0 size 4032 is outside [0xb27752c0,0x10aeac8] for
> base ptr 0xb27752c0 count 1 and data
> [ums1:05086] Datatype 0x83a0618[] size 5735048 align 4 id 0 length 244 used 81
> true_lb 0 true_ub 1318295560 (true_extent 1318295560) lb 0 ub
> 1318295560 (extent 1318295560)
> nbElems 716881 loops 0 flags 102 (commited )-c-----GD--[---][---]
> contain MPI_DOUBLE
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x0 (0) extent 8
> (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x11800 (71680)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x23000 (143360)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x34800 (215040)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x46000 (286720)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x57800 (358400)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x69000 (430080)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x7a800 (501760)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x8c000 (573440)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x9d800 (645120)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0xaf000 (716800)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0xc0800 (788480)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0xd2000 (860160)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0xe3800 (931840)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0xf5000 (1003520)
> extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x106800
> (1075200) extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x118000
> (1146880) extent 8 (size 71040)
> --C---P-D--[ C ][FLT] MPI_DOUBLE count 8880 disp 0x129800
> (1218560) extent 8 (size 71040)
> ....
> ===
>
> Here is my HPL.dat:
>
> ===
> HPLinpack benchmark input file
> Innovative Computing Laboratory, University of Tennessee
> HPL.out      output file name (if any)
> 6            device out (6=stdout,7=stderr,file)
> 1            # of problems sizes (N)
> 17920        Ns
> 1            # of NBs
> 80           NBs
> 0            PMAP process mapping (0=Row-,1=Column-major)
> 1            # of process grids (P x Q)
> 2            Ps
> 14           Qs
> 16.0         threshold
> 1            # of panel fact
> 2            PFACTs (0=left, 1=Crout, 2=Right)
> 1            # of recursive stopping criterium
> 4            NBMINs (>= 1)
> 1            # of panels in recursion
> 2            NDIVs
> 1            # of recursive panel fact.
> 2            RFACTs (0=left, 1=Crout, 2=Right)
> 1            # of broadcast
> 2            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
> 1            # of lookahead depth
> 1            DEPTHs (>=0)
> 2            SWAP (0=bin-exch,1=long,2=mix)
> 64           swapping threshold
> 0            L1 in (0=transposed,1=no-transposed) form
> 0            U  in (0=transposed,1=no-transposed) form
> 1            Equilibration (0=no,1=yes)
> 8            memory alignment in double (> 0)
> ===
>
> I've run HPL with this HPL.dat using MPICH2, and it works well.