Hi,

 

Can you give more info about the compilation steps? I just recompiled it (using the internal libraries for everything except FFTW) and was able to run an example (output below). Did I miss something?
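For reference, a rough sketch of the kind of build I mean — the paths and configure variables below are illustrative examples, not my exact command line:

```shell
# Illustrative only: build QE 4.0.1 with its internal BLAS/LAPACK
# but pointing the configure script at an external fftw-2.1.5.
cd espresso-4.0.1
./configure MPIF90=mpif90 FFT_LIBS="-L$HOME/fftw-2.1.5/lib -lfftw"
make pw
```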

 

I recompiled and ran it on a Platform OCS 5 cluster (based on RHEL 5), with InfiniBand support (OFED).

 

Partial ompi_info output:

 

                Open MPI: 1.2.6

   Open MPI SVN revision: r17946

                Open RTE: 1.2.6

   Open RTE SVN revision: r17946

                    OPAL: 1.2.6

       OPAL SVN revision: r17946

                  Prefix: /home/mbozzore/openmpi

 Configured architecture: x86_64-unknown-linux-gnu

           Configured by: mbozzore

           Configured on: Mon Aug 11 00:29:15 EDT 2008

          Configure host: tyan04.lsf.platform.com

                Built by: mbozzore

                Built on: Mon Aug 11 00:33:54 EDT 2008

              Built host: tyan04.lsf.platform.com

              C bindings: yes

            C++ bindings: yes

      Fortran77 bindings: yes (all)

      Fortran90 bindings: yes

 Fortran90 bindings size: small

              C compiler: gcc

     C compiler absolute: /usr/bin/gcc

            C++ compiler: g++

   C++ compiler absolute: /usr/bin/g++

      Fortran77 compiler: gfortran

  Fortran77 compiler abs: /usr/bin/gfortran

      Fortran90 compiler: gfortran

  Fortran90 compiler abs: /usr/bin/gfortran

 

 

 

 [mbozzore@tyan04 tests]$ mpirun -np 4 --machinefile ./hosts -x LD_LIBRARY_PATH --mca btl openib,self ../bin/pw.x < scf.in

 

     Program PWSCF     v.4.0.1  starts ...

     Today is 15Aug2008 at 14:51:18

 

     Parallel version (MPI)

 

     Number of processors in use:       4

     R & G space division:  proc/pool =    4

 

     For Norm-Conserving or Ultrasoft (Vanderbilt) Pseudopotentials or PAW

 

     Current dimensions of program pwscf are:

     Max number of different atomic species (ntypx) = 10

     Max number of k-points (npk) =  40000

     Max angular momentum in pseudopotentials (lmaxx) =  3

 

     Iterative solution of the eigenvalue problem

 

     a parallel distributed memory algorithm will be used,

     eigenstates matrixes will be distributed block like on

     ortho sub-group =    2*   2 procs

 

 

     Planes per process (thick) : nr3 = 16 npp =   4 ncplane =  256

 

     Proc/  planes cols     G    planes cols    G      columns  G

     Pool       (dense grid)       (smooth grid)      (wavefct grid)

       1      4     41      366    4     41      366     13       70

       2      4     41      366    4     41      366     14       71

       3      4     40      362    4     40      362     14       71

       4      4     41      365    4     41      365     14       71

     tot     16    163     1459   16    163     1459     55      283

 

 

 

     bravais-lattice index     =            2

     lattice parameter (a_0)   =      10.2000  a.u.

     unit-cell volume          =     265.3020 (a.u.)^3

     number of atoms/cell      =            2

     number of atomic types    =            1

     number of electrons       =         8.00

     number of Kohn-Sham states=            4

     kinetic-energy cutoff     =      12.0000  Ry

     charge density cutoff     =      48.0000  Ry

     convergence threshold     =      1.0E-06

     mixing beta               =       0.7000

     number of iterations used =            8  plain     mixing

     Exchange-correlation      =  SLA  PZ   NOGX NOGC (1100)

 

     celldm(1)=  10.200000  celldm(2)=   0.000000  celldm(3)=   0.000000

     celldm(4)=   0.000000  celldm(5)=   0.000000  celldm(6)=   0.000000

 

     crystal axes: (cart. coord. in units of a_0)

               a(1) = ( -0.500000  0.000000  0.500000 )

               a(2) = (  0.000000  0.500000  0.500000 )

               a(3) = ( -0.500000  0.500000  0.000000 )

 

     reciprocal axes: (cart. coord. in units 2 pi/a_0)

               b(1) = ( -1.000000 -1.000000  1.000000 )

               b(2) = (  1.000000  1.000000  1.000000 )

               b(3) = ( -1.000000  1.000000 -1.000000 )

 

 

     PseudoPot. # 1 for Si read from file Si.vbc.UPF

     Pseudo is Norm-conserving, Zval =  4.0

     Generated by new atomic code, or converted to UPF format

     Using radial grid of  431 points,  2 beta functions with:

                l(1) =   0

                l(2) =   1

 

     atomic species   valence    mass     pseudopotential

        Si             4.00    28.08600     Si( 1.00)

 

     48 Sym.Ops. (with inversion)

 

 

   Cartesian axes

 

     site n.     atom                  positions (a_0 units)

         1           Si  tau(  1) = (   0.0000000   0.0000000   0.0000000  )

         2           Si  tau(  2) = (   0.2500000   0.2500000   0.2500000  )

 

     number of k points=    2

                       cart. coord. in units 2pi/a_0

        k(    1) = (   0.2500000   0.2500000   0.2500000), wk =   0.5000000

        k(    2) = (   0.2500000   0.2500000   0.7500000), wk =   1.5000000

 

     G cutoff =  126.4975  (   1459 G-vectors)     FFT grid: ( 16, 16, 16)

 

     Largest allocated arrays     est. size (Mb)     dimensions

        Kohn-Sham Wavefunctions         0.00 Mb     (     51,   4)

        NL pseudopotentials             0.01 Mb     (     51,   8)

        Each V/rho on FFT grid          0.02 Mb     (   1024)

        Each G-vector array             0.00 Mb     (    366)

        G-vector shells                 0.00 Mb     (     42)

     Largest temporary arrays     est. size (Mb)     dimensions

        Auxiliary wavefunctions         0.01 Mb     (     51,  16)

        Each subspace H/S matrix        0.00 Mb     (     16,  16)

        Each <psi_i|beta_j> matrix      0.00 Mb     (      8,   4)

        Arrays for rho mixing           0.13 Mb     (   1024,   8)

 

     Initial potential from superposition of free atoms

 

     starting charge    7.99901, renormalised to    8.00000

     Starting wfc are    8 atomic wfcs

 

     total cpu time spent up to now is      0.10 secs

 

     per-process dynamical memory:    21.9 Mb

 

     Self-consistent Calculation

 

     iteration #  1     ecut=    12.00 Ry     beta=0.70

     Davidson diagonalization with overlap

     ethr =  1.00E-02,  avg # of iterations =  2.0

 

     Threshold (ethr) on eigenvalues was too large:

     Diagonalizing with lowered threshold

 

     Davidson diagonalization with overlap

     ethr =  7.93E-04,  avg # of iterations =  1.0

 

     total cpu time spent up to now is      0.13 secs

 

     total energy              =   -15.79103983 Ry

     Harris-Foulkes estimate   =   -15.81239602 Ry

     estimated scf accuracy    <     0.06375741 Ry

 

     iteration #  2     ecut=    12.00 Ry     beta=0.70

     Davidson diagonalization with overlap

     ethr =  7.97E-04,  avg # of iterations =  1.0

 

     total cpu time spent up to now is      0.15 secs

 

     total energy              =   -15.79409517 Ry

     Harris-Foulkes estimate   =   -15.79442220 Ry

     estimated scf accuracy    <     0.00230261 Ry

 

     iteration #  3     ecut=    12.00 Ry     beta=0.70

     Davidson diagonalization with overlap

     ethr =  2.88E-05,  avg # of iterations =  2.0

 

     total cpu time spent up to now is      0.17 secs

 

     total energy              =   -15.79447768 Ry

     Harris-Foulkes estimate   =   -15.79450039 Ry

     estimated scf accuracy    <     0.00006345 Ry

 

     iteration #  4     ecut=    12.00 Ry     beta=0.70

     Davidson diagonalization with overlap

     ethr =  7.93E-07,  avg # of iterations =  2.0

 

     total cpu time spent up to now is      0.19 secs

 

     total energy              =   -15.79449472 Ry

     Harris-Foulkes estimate   =   -15.79449644 Ry

     estimated scf accuracy    <     0.00000455 Ry

 

     iteration #  5     ecut=    12.00 Ry     beta=0.70

     Davidson diagonalization with overlap

     ethr =  5.69E-08,  avg # of iterations =  2.5

 

     total cpu time spent up to now is      0.21 secs

 

     End of self-consistent calculation

 

          k = 0.2500 0.2500 0.2500 (   180 PWs)   bands (ev):

 

    -4.8701   2.3792   5.5371   5.5371

 

          k = 0.2500 0.2500 0.7500 (   186 PWs)   bands (ev):

 

    -2.9165  -0.0653   2.6795   4.0355

 

!    total energy              =   -15.79449556 Ry

     Harris-Foulkes estimate   =   -15.79449558 Ry

     estimated scf accuracy    <     0.00000005 Ry

 

     The total energy is the sum of the following terms:

 

     one-electron contribution =     4.83378726 Ry

     hartree contribution      =     1.08428951 Ry

     xc contribution           =    -4.81281375 Ry

     ewald contribution        =   -16.89975858 Ry

 

     convergence has been achieved in   5 iterations

 

 

     entering subroutine stress ...

 

          total   stress  (Ry/bohr**3)                   (kbar)     P=  -30.30

  -0.00020597   0.00000000   0.00000000        -30.30      0.00      0.00

   0.00000000  -0.00020597   0.00000000          0.00    -30.30      0.00

   0.00000000   0.00000000  -0.00020597          0.00      0.00    -30.30

 

 

     Writing output data file pwscf.save

 

     PWSCF        :     0.28s CPU time,    0.39s wall time

 

     init_run     :     0.05s CPU

     electrons    :     0.11s CPU

     stress       :     0.00s CPU

 

     Called by init_run:

     wfcinit      :     0.01s CPU

     potinit      :     0.00s CPU

 

     Called by electrons:

     c_bands      :     0.09s CPU (       6 calls,   0.015 s avg)

     sum_band     :     0.01s CPU (       6 calls,   0.001 s avg)

     v_of_rho     :     0.00s CPU (       6 calls,   0.001 s avg)

     mix_rho      :     0.00s CPU (       6 calls,   0.000 s avg)

 

     Called by c_bands:

     init_us_2    :     0.00s CPU (      28 calls,   0.000 s avg)

     cegterg      :     0.09s CPU (      12 calls,   0.007 s avg)

 

     Called by *egterg:

     h_psi        :     0.01s CPU (      35 calls,   0.000 s avg)

     g_psi        :     0.00s CPU (      21 calls,   0.000 s avg)

     cdiaghg      :     0.06s CPU (      31 calls,   0.002 s avg)

 

     Called by h_psi:

     add_vuspsi   :     0.00s CPU (      35 calls,   0.000 s avg)

 

     General routines

     calbec       :     0.00s CPU (      37 calls,   0.000 s avg)

     cft3s        :     0.02s CPU (     354 calls,   0.000 s avg)

     davcio       :     0.00s CPU (      40 calls,   0.000 s avg)

 

     Parallel routines

     fft_scatter  :     0.01s CPU (     354 calls,   0.000 s avg)

 

 

Mehdi Bozzo-Rey

Open Source Solution Developer

Platform OCS5

Platform Computing

Phone: +1 905 948 4649

 

 

 

 

From: users-bounces@open-mpi.org [mailto:users-bounces@open-mpi.org] On Behalf Of C.Y. Lee
Sent: August-15-08 1:03 PM
To: users@open-mpi.org
Subject: [OMPI users] Segmentation fault (11) Address not mapped (1)

 

All,

 

I had a problem similar to the one James described in an earlier message: http://www.open-mpi.org/community/lists/users/2008/07/6204.php

While he was able to solve the problem by recompiling Open MPI, I had no luck on my Red Hat Enterprise Linux 5 system.

Here are two other threads with similar Open MPI issues, on Ubuntu and OS X, that were solved: https://bugs.launchpad.net/ubuntu/+source/binutils/+bug/234837

http://www.somewhereville.com/?cat=55

 

Now...

Here is my story:

I had Quantum ESPRESSO (QE) running without problems using Open MPI.

However, when I recompiled QE against a freshly built fftw-2.1.5, the build completed without any errors, but running QE gave me the error below:

 

*** Process received signal ***
Signal: Segmentation fault (11)

Signal code: Address not mapped (1)
Failing at address: 0x22071b70
[ 0] /lib64/libpthread.so.0 [0x352420de70]
[ 1] /usr/lib64/liblapack.so.3(dsytf2_+0xc43) [0x2aaaaac9f5e3]
[ 2] /usr/lib64/liblapack.so.3(dsytrf_+0x407) [0x2aaaaaca0567]
[ 3] /opt/espresso-4.0.1/bin/pw.x(mix_rho_+0x828) [0x5044b8]
[ 4] /opt/espresso-4.0.1/bin/pw.x(electrons_+0xb37) [0x4eae47]
[ 5] /opt/espresso-4.0.1/bin/pw.x(MAIN__+0xbf) [0x42b3af]
[ 6] /opt/espresso-4.0.1/bin/pw.x(main+0xe) [0x6aad5e]
[ 7] /lib64/libc.so.6(__libc_start_main+0xf4) [0x352361d8a4]
[ 8] /opt/espresso-4.0.1/bin/pw.x [0x42b239]
 *** End of error message ***

 

From what I read in the links above, it seems to be a bug in Open MPI.
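In case it helps: the crash in the backtrace above happens inside the system LAPACK (/usr/lib64/liblapack.so.3), so one thing worth checking is which LAPACK and FFTW shared libraries pw.x actually resolves at run time. A sketch of the check (the binary path is taken from my backtrace above):

```shell
# List the shared libraries pw.x resolves, filtering for LAPACK/FFTW;
# a mismatch between the libraries used at build and run time could
# explain the segfault in dsytrf_/dsytf2_.
ldd /opt/espresso-4.0.1/bin/pw.x | grep -Ei 'lapack|fftw'
```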

Please share your thoughts on this, thank you!

 

CY