Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Segmentation fault (11) Address not mapped (1)
From: Mehdi Bozzo-Rey (mbozzore_at_[hidden])
Date: 2008-08-15 15:04:41


Hi,

 

Can you give more info about the compilation steps? I just recompiled it
(using the internal libraries except for fftw) and was able to run an
example (output below). Did I miss something?
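
For reference, a rough outline of that kind of build is below. The FFT_LIBS
variable and the fftw path are assumptions on my side, so please check the QE
install notes for your version rather than taking these commands literally:

  # Rough outline only: build QE with the Open MPI wrappers and an external FFTW.
  # FFT_LIBS and the fftw path are assumptions - adjust them to your setup.
  export PATH=/home/mbozzore/openmpi/bin:$PATH   # pick up mpif90 / mpicc from this Open MPI
  cd espresso-4.0.1
  ./configure FFT_LIBS="-L/path/to/fftw-2.1.5/lib -lfftw"
  make pw                                        # builds bin/pw.x used in the run further down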

 

I recompiled and ran it on a Platform OCS 5 cluster (based on RHEL 5) with
InfiniBand support (OFED).

 

Partial ompi_info output:

 

                Open MPI: 1.2.6

   Open MPI SVN revision: r17946

                Open RTE: 1.2.6

   Open RTE SVN revision: r17946

                    OPAL: 1.2.6

       OPAL SVN revision: r17946

                  Prefix: /home/mbozzore/openmpi

 Configured architecture: x86_64-unknown-linux-gnu

           Configured by: mbozzore

           Configured on: Mon Aug 11 00:29:15 EDT 2008

          Configure host: tyan04.lsf.platform.com

                Built by: mbozzore

                Built on: Mon Aug 11 00:33:54 EDT 2008

              Built host: tyan04.lsf.platform.com

              C bindings: yes

            C++ bindings: yes

      Fortran77 bindings: yes (all)

      Fortran90 bindings: yes

 Fortran90 bindings size: small

              C compiler: gcc

     C compiler absolute: /usr/bin/gcc

            C++ compiler: g++

   C++ compiler absolute: /usr/bin/g++

      Fortran77 compiler: gfortran

  Fortran77 compiler abs: /usr/bin/gfortran

      Fortran90 compiler: gfortran

  Fortran90 compiler abs: /usr/bin/gfortran

 

 

 

 [mbozzore_at_tyan04 tests]$ mpirun -np 4 --machinefile ./hosts -x LD_LIBRARY_PATH --mca btl openib,self ../bin/pw.x < scf.in

 

     Program PWSCF v.4.0.1 starts ...

     Today is 15Aug2008 at 14:51:18

 

     Parallel version (MPI)

 

     Number of processors in use: 4

     R & G space division: proc/pool = 4

 

     For Norm-Conserving or Ultrasoft (Vanderbilt) Pseudopotentials or PAW

 

     Current dimensions of program pwscf are:

     Max number of different atomic species (ntypx) = 10

     Max number of k-points (npk) = 40000

     Max angular momentum in pseudopotentials (lmaxx) = 3

 

     Iterative solution of the eigenvalue problem

 

     a parallel distributed memory algorithm will be used,

     eigenstates matrixes will be distributed block like on

     ortho sub-group = 2* 2 procs

 

 

     Planes per process (thick) : nr3 = 16 npp = 4 ncplane = 256

 

     Proc/  planes  cols     G    planes  cols     G    columns    G
     Pool       (dense grid)        (smooth grid)       (wavefct grid)
       1       4     41    366       4     41    366      13      70
       2       4     41    366       4     41    366      14      71
       3       4     40    362       4     40    362      14      71
       4       4     41    365       4     41    365      14      71
     tot      16    163   1459      16    163   1459      55     283

 

 

 

     bravais-lattice index = 2

     lattice parameter (a_0) = 10.2000 a.u.

     unit-cell volume = 265.3020 (a.u.)^3

     number of atoms/cell = 2

     number of atomic types = 1

     number of electrons = 8.00

     number of Kohn-Sham states= 4

     kinetic-energy cutoff = 12.0000 Ry

     charge density cutoff = 48.0000 Ry

     convergence threshold = 1.0E-06

     mixing beta = 0.7000

     number of iterations used = 8 plain mixing

     Exchange-correlation = SLA PZ NOGX NOGC (1100)

 

     celldm(1)= 10.200000 celldm(2)= 0.000000 celldm(3)= 0.000000

     celldm(4)= 0.000000 celldm(5)= 0.000000 celldm(6)= 0.000000

 

     crystal axes: (cart. coord. in units of a_0)

               a(1) = ( -0.500000 0.000000 0.500000 )

               a(2) = ( 0.000000 0.500000 0.500000 )

               a(3) = ( -0.500000 0.500000 0.000000 )

 

     reciprocal axes: (cart. coord. in units 2 pi/a_0)

               b(1) = ( -1.000000 -1.000000 1.000000 )

               b(2) = ( 1.000000 1.000000 1.000000 )

               b(3) = ( -1.000000 1.000000 -1.000000 )

 

 

     PseudoPot. # 1 for Si read from file Si.vbc.UPF

     Pseudo is Norm-conserving, Zval = 4.0

     Generated by new atomic code, or converted to UPF format

     Using radial grid of 431 points, 2 beta functions with:

                l(1) = 0

                l(2) = 1

 

     atomic species valence mass pseudopotential

        Si 4.00 28.08600 Si( 1.00)

 

     48 Sym.Ops. (with inversion)

 

 

   Cartesian axes

 

     site n. atom positions (a_0 units)

         1   Si  tau( 1) = ( 0.0000000  0.0000000  0.0000000 )
         2   Si  tau( 2) = ( 0.2500000  0.2500000  0.2500000 )

 

     number of k points= 2

                       cart. coord. in units 2pi/a_0

        k( 1) = ( 0.2500000  0.2500000  0.2500000), wk = 0.5000000
        k( 2) = ( 0.2500000  0.2500000  0.7500000), wk = 1.5000000

 

     G cutoff = 126.4975 ( 1459 G-vectors) FFT grid: ( 16, 16, 16)

 

     Largest allocated arrays est. size (Mb) dimensions

        Kohn-Sham Wavefunctions 0.00 Mb ( 51, 4)

        NL pseudopotentials 0.01 Mb ( 51, 8)

        Each V/rho on FFT grid 0.02 Mb ( 1024)

        Each G-vector array 0.00 Mb ( 366)

        G-vector shells 0.00 Mb ( 42)

     Largest temporary arrays est. size (Mb) dimensions

        Auxiliary wavefunctions 0.01 Mb ( 51, 16)

        Each subspace H/S matrix 0.00 Mb ( 16, 16)

        Each <psi_i|beta_j> matrix 0.00 Mb ( 8, 4)

        Arrays for rho mixing 0.13 Mb ( 1024, 8)

 

     Initial potential from superposition of free atoms

 

     starting charge 7.99901, renormalised to 8.00000

     Starting wfc are 8 atomic wfcs

 

     total cpu time spent up to now is 0.10 secs

 

     per-process dynamical memory: 21.9 Mb

 

     Self-consistent Calculation

 

     iteration # 1 ecut= 12.00 Ry beta=0.70

     Davidson diagonalization with overlap

     ethr = 1.00E-02, avg # of iterations = 2.0

 

     Threshold (ethr) on eigenvalues was too large:

     Diagonalizing with lowered threshold

 

     Davidson diagonalization with overlap

     ethr = 7.93E-04, avg # of iterations = 1.0

 

     total cpu time spent up to now is 0.13 secs

 

     total energy = -15.79103983 Ry

     Harris-Foulkes estimate = -15.81239602 Ry

     estimated scf accuracy < 0.06375741 Ry

 

     iteration # 2 ecut= 12.00 Ry beta=0.70

     Davidson diagonalization with overlap

     ethr = 7.97E-04, avg # of iterations = 1.0

 

     total cpu time spent up to now is 0.15 secs

 

     total energy = -15.79409517 Ry

     Harris-Foulkes estimate = -15.79442220 Ry

     estimated scf accuracy < 0.00230261 Ry

 

     iteration # 3 ecut= 12.00 Ry beta=0.70

     Davidson diagonalization with overlap

     ethr = 2.88E-05, avg # of iterations = 2.0

 

     total cpu time spent up to now is 0.17 secs

 

     total energy = -15.79447768 Ry

     Harris-Foulkes estimate = -15.79450039 Ry

     estimated scf accuracy < 0.00006345 Ry

 

     iteration # 4 ecut= 12.00 Ry beta=0.70

     Davidson diagonalization with overlap

     ethr = 7.93E-07, avg # of iterations = 2.0

 

     total cpu time spent up to now is 0.19 secs

 

     total energy = -15.79449472 Ry

     Harris-Foulkes estimate = -15.79449644 Ry

     estimated scf accuracy < 0.00000455 Ry

 

     iteration # 5 ecut= 12.00 Ry beta=0.70

     Davidson diagonalization with overlap

     ethr = 5.69E-08, avg # of iterations = 2.5

 

     total cpu time spent up to now is 0.21 secs

 

     End of self-consistent calculation

 

          k = 0.2500 0.2500 0.2500 ( 180 PWs) bands (ev):

 

    -4.8701 2.3792 5.5371 5.5371

 

          k = 0.2500 0.2500 0.7500 ( 186 PWs) bands (ev):

 

    -2.9165 -0.0653 2.6795 4.0355

 

! total energy = -15.79449556 Ry

     Harris-Foulkes estimate = -15.79449558 Ry

     estimated scf accuracy < 0.00000005 Ry

 

     The total energy is the sum of the following terms:

 

     one-electron contribution = 4.83378726 Ry

     hartree contribution = 1.08428951 Ry

     xc contribution = -4.81281375 Ry

     ewald contribution = -16.89975858 Ry

 

     convergence has been achieved in 5 iterations

 

 

     entering subroutine stress ...

 

          total   stress  (Ry/bohr**3)                   (kbar)     P=  -30.30
  -0.00020597   0.00000000   0.00000000        -30.30      0.00      0.00
   0.00000000  -0.00020597   0.00000000          0.00    -30.30      0.00
   0.00000000   0.00000000  -0.00020597          0.00      0.00    -30.30

 

 

     Writing output data file pwscf.save

 

     PWSCF : 0.28s CPU time, 0.39s wall time

 

     init_run : 0.05s CPU

     electrons : 0.11s CPU

     stress : 0.00s CPU

 

     Called by init_run:

     wfcinit : 0.01s CPU

     potinit : 0.00s CPU

 

     Called by electrons:

     c_bands : 0.09s CPU ( 6 calls, 0.015 s avg)

     sum_band : 0.01s CPU ( 6 calls, 0.001 s avg)

     v_of_rho : 0.00s CPU ( 6 calls, 0.001 s avg)

     mix_rho : 0.00s CPU ( 6 calls, 0.000 s avg)

 

     Called by c_bands:

     init_us_2 : 0.00s CPU ( 28 calls, 0.000 s avg)

     cegterg : 0.09s CPU ( 12 calls, 0.007 s avg)

 

     Called by *egterg:

     h_psi : 0.01s CPU ( 35 calls, 0.000 s avg)

     g_psi : 0.00s CPU ( 21 calls, 0.000 s avg)

     cdiaghg : 0.06s CPU ( 31 calls, 0.002 s avg)

 

     Called by h_psi:

     add_vuspsi : 0.00s CPU ( 35 calls, 0.000 s avg)

 

     General routines

     calbec : 0.00s CPU ( 37 calls, 0.000 s avg)

     cft3s : 0.02s CPU ( 354 calls, 0.000 s avg)

     davcio : 0.00s CPU ( 40 calls, 0.000 s avg)

 

     Parallel routines

     fft_scatter : 0.01s CPU ( 354 calls, 0.000 s avg)

 

 

Mehdi Bozzo-Rey <mbozzore_at_[hidden]>

Open Source Solution Developer

Platform OCS5
<http://www.platform.com/Products/platform-open-cluster-stack5>

Platform Computing

Phone: +1 905 948 4649

 

 

 

 

From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On
Behalf Of C.Y. Lee
Sent: August-15-08 1:03 PM
To: users_at_[hidden]
Subject: [OMPI users] Segmentation fault (11) Address not mapped (1)

 

All,

 

I had a similar problem to the one James described in an earlier message:
http://www.open-mpi.org/community/lists/users/2008/07/6204.php

While he was able to solve the problem by recompiling Open MPI, I had no
luck on my Red Hat Enterprise Linux 5 system.
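
For reference, a standard Open MPI source rebuild looks roughly like the
sketch below (the version and install prefix here are examples only):

  # Standard Open MPI source build (sketch; version and prefix are examples)
  tar xjf openmpi-1.2.6.tar.bz2
  cd openmpi-1.2.6
  ./configure --prefix=$HOME/openmpi
  make all install
  # afterwards, rebuild the application with the mpif90/mpicc wrappers from $HOME/openmpi/bin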

Here are two other threads with similar Open MPI issues on Ubuntu and OS X
that were solved:
https://bugs.launchpad.net/ubuntu/+source/binutils/+bug/234837

http://www.somewhereville.com/?cat=55

 

Now...

Here is my story:

I had Quantum Espresso (QE) running without problems using Open MPI.

However, when I recompiled QE against a freshly rebuilt fftw-2.1.5, the build
completed without any errors, but when I ran QE it gave me the error below:

 

*** Process received signal ***
Signal: Segmentation fault (11)

Signal code: Address not mapped (1)
Failing at address: 0x22071b70
[ 0] /lib64/libpthread.so.0 [0x352420de70]
[ 1] /usr/lib64/liblapack.so.3(dsytf2_+0xc43) [0x2aaaaac9f5e3]
[ 2] /usr/lib64/liblapack.so.3(dsytrf_+0x407) [0x2aaaaaca0567]
[ 3] /opt/espresso-4.0.1/bin/pw.x(mix_rho_+0x828) [0x5044b8]
[ 4] /opt/espresso-4.0.1/bin/pw.x(electrons_+0xb37) [0x4eae47]
[ 5] /opt/espresso-4.0.1/bin/pw.x(MAIN__+0xbf) [0x42b3af]
[ 6] /opt/espresso-4.0.1/bin/pw.x(main+0xe) [0x6aad5e]
[ 7] /lib64/libc.so.6(__libc_start_main+0xf4) [0x352361d8a4]
[ 8] /opt/espresso-4.0.1/bin/pw.x [0x42b239]
 *** End of error message ***

 

From what I read in the above links, it seems to be a bug in Open MPI.
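
For completeness, a quick way to see which LAPACK/BLAS/FFTW/MPI libraries the
binary actually resolves at run time (the frames above are in
/usr/lib64/liblapack.so.3 and pw.x itself) would be something like this, with
the binary path taken from the backtrace:

  # Which shared libraries does pw.x pick up locally?
  ldd /opt/espresso-4.0.1/bin/pw.x | egrep 'lapack|blas|fftw|mpi'

  # Repeat under mpirun (with your usual machinefile/options) so the compute
  # nodes are checked with the forwarded LD_LIBRARY_PATH as well
  mpirun -np 1 ldd /opt/espresso-4.0.1/bin/pw.x | egrep 'lapack|blas|fftw|mpi'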

Please share your thoughts on this, thank you!

 

CY