
Open MPI Development Mailing List Archives


Subject: [OMPI devel] Error Running Executable Linking C++, C, F77 and F90
From: Si Hammond (simon.hammond_at_[hidden])
Date: 2008-02-25 16:13:55


Hi Guys,

We have a very large executable written in C++, C, F77 and F90 (and we
use all of these compilers!). Our code compiles and links fine, but when
we run it on our cluster (under PBSPro) we get the errors at the bottom
of this email.
I wondered if you guys could shed any light on this? It seems odd that
MPI_COMM_WORLD would be an invalid communicator. Do you think it's a
hardware fault or a compilation issue? For reference, we're using Open MPI
1.2.5 with InfiniBand connected via a Voltaire switch. Processors are
Intel Dual Core. Compilers are GNU C (gcc), C++ (g++) and gfortran.

[node207:12109] *** An error occurred in MPI_Allreduce
[node109:11337] *** An error occurred in MPI_Allreduce
[node109:11337] *** on communicator MPI_COMM_WORLD
[node109:11337] *** MPI_ERR_COMM: invalid communicator
[node109:11337] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node117:11236] *** An error occurred in MPI_Allreduce
[node117:11236] *** on communicator MPI_COMM_WORLD
[node117:11236] *** MPI_ERR_COMM: invalid communicator
[node117:11236] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node113:11288] *** An error occurred in MPI_Allreduce
[node113:11288] *** on communicator MPI_COMM_WORLD
[node113:11288] *** MPI_ERR_COMM: invalid communicator
[node113:11288] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node111:11295] *** An error occurred in MPI_Allreduce
[node111:11295] *** on communicator MPI_COMM_WORLD
[node111:11295] *** MPI_ERR_COMM: invalid communicator
[node111:11295] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node110:11295] *** An error occurred in MPI_Allreduce
[node110:11295] *** on communicator MPI_COMM_WORLD
[node110:11295] *** MPI_ERR_COMM: invalid communicator
[node110:11295] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node115:11496] *** An error occurred in MPI_Allreduce
[node115:11496] *** on communicator MPI_COMM_WORLD
[node115:11496] *** MPI_ERR_COMM: invalid communicator
[node115:11496] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node118:11239] *** An error occurred in MPI_Allreduce
[node118:11239] *** on communicator MPI_COMM_WORLD
[node118:11239] *** MPI_ERR_COMM: invalid communicator
[node118:11239] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node116:11249] *** An error occurred in MPI_Allreduce
[node116:11249] *** on communicator MPI_COMM_WORLD
[node116:11249] *** MPI_ERR_COMM: invalid communicator
[node116:11249] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node119:11239] *** An error occurred in MPI_Allreduce
[node119:11239] *** on communicator MPI_COMM_WORLD
[node119:11239] *** MPI_ERR_COMM: invalid communicator
[node119:11239] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node207:12109] *** on communicator MPI_COMM_WORLD
[node207:12109] *** MPI_ERR_COMM: invalid communicator
[node207:12109] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node114:11261] *** An error occurred in MPI_Allreduce
[node114:11261] *** on communicator MPI_COMM_WORLD
[node114:11261] *** MPI_ERR_COMM: invalid communicator
[node114:11261] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node206:12030] *** An error occurred in MPI_Allreduce
[node206:12030] *** on communicator MPI_COMM_WORLD
[node206:12030] *** MPI_ERR_COMM: invalid communicator
[node206:12030] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node117:11237] *** An error occurred in MPI_Allreduce
[node113:11287] *** An error occurred in MPI_Allreduce
[node111:11293] *** An error occurred in MPI_Allreduce
[node110:11293] *** An error occurred in MPI_Allreduce
[node110:11293] *** on communicator MPI_COMM_WORLD
[node110:11293] *** MPI_ERR_COMM: invalid communicator
[node110:11293] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node115:11495] *** An error occurred in MPI_Allreduce
[node118:11237] *** An error occurred in MPI_Allreduce
[node118:11237] *** on communicator MPI_COMM_WORLD
[node118:11237] *** MPI_ERR_COMM: invalid communicator
[node118:11237] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node116:11247] *** An error occurred in MPI_Allreduce
[node116:11247] *** on communicator MPI_COMM_WORLD
[node116:11247] *** MPI_ERR_COMM: invalid communicator
[node116:11247] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node119:11238] *** An error occurred in MPI_Allreduce
[node114:11262] *** An error occurred in MPI_Allreduce
[node206:12029] *** An error occurred in MPI_Allreduce
[node206:12029] *** on communicator MPI_COMM_WORLD
[node206:12029] *** MPI_ERR_COMM: invalid communicator
[node206:12029] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node117:11238] *** An error occurred in MPI_Allreduce
[node113:11289] *** An error occurred in MPI_Allreduce
[node111:11294] *** An error occurred in MPI_Allreduce
[node110:11294] *** An error occurred in MPI_Allreduce
[node110:11294] *** on communicator MPI_COMM_WORLD
[node110:11294] *** MPI_ERR_COMM: invalid communicator
[node110:11294] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node115:11497] *** An error occurred in MPI_Allreduce
[node115:11497] *** on communicator MPI_COMM_WORLD
[node118:11238] *** An error occurred in MPI_Allreduce
[node118:11238] *** on communicator MPI_COMM_WORLD
[node118:11238] *** MPI_ERR_COMM: invalid communicator
[node118:11238] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node116:11248] *** An error occurred in MPI_Allreduce
[node116:11248] *** on communicator MPI_COMM_WORLD
[node116:11248] *** MPI_ERR_COMM: invalid communicator
[node116:11248] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node119:11240] *** An error occurred in MPI_Allreduce
[node114:11263] *** An error occurred in MPI_Allreduce
[node114:11263] *** on communicator MPI_COMM_WORLD
[node114:11263] *** MPI_ERR_COMM: invalid communicator
[node114:11263] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node206:12031] *** An error occurred in MPI_Allreduce
[node206:12031] *** on communicator MPI_COMM_WORLD
[node206:12031] *** MPI_ERR_COMM: invalid communicator
[node206:12031] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node117:11237] *** on communicator MPI_COMM_WORLD
[node117:11237] *** MPI_ERR_COMM: invalid communicator
[node117:11237] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node113:11287] *** on communicator MPI_COMM_WORLD
[node113:11287] *** MPI_ERR_COMM: invalid communicator
[node113:11287] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node111:11293] *** on communicator MPI_COMM_WORLD
[node111:11293] *** MPI_ERR_COMM: invalid communicator
[node111:11293] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node115:11495] *** on communicator MPI_COMM_WORLD
[node115:11495] *** MPI_ERR_COMM: invalid communicator
[node115:11495] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node119:11238] *** on communicator MPI_COMM_WORLD
[node119:11238] *** MPI_ERR_COMM: invalid communicator
[node119:11238] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node114:11262] *** on communicator MPI_COMM_WORLD
[node114:11262] *** MPI_ERR_COMM: invalid communicator
[node114:11262] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node117:11238] *** on communicator MPI_COMM_WORLD
[node117:11238] *** MPI_ERR_COMM: invalid communicator
[node117:11238] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node113:11289] *** on communicator MPI_COMM_WORLD
[node113:11289] *** MPI_ERR_COMM: invalid communicator
[node113:11289] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node111:11294] *** on communicator MPI_COMM_WORLD
[node111:11294] *** MPI_ERR_COMM: invalid communicator
[node111:11294] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node115:11497] *** MPI_ERR_COMM: invalid communicator
[node115:11497] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node119:11240] *** on communicator MPI_COMM_WORLD
[node119:11240] *** MPI_ERR_COMM: invalid communicator
[node119:11240] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node109:11335] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[node109:11335] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_tm_module.c at line 572
[node109:11335] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[node109:11335] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[node109:11335] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_tm_module.c at line 603
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job.
Returned value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
[node117:11235] OOB: Connection to HNP lost
[node113:11286] OOB: Connection to HNP lost
[node111:11292] OOB: Connection to HNP lost
[node115:11494] OOB: Connection to HNP lost
[node119:11237] OOB: Connection to HNP lost
[node116:11246] OOB: Connection to HNP lost
[node206:12028] OOB: Connection to HNP lost
[node114:11260] OOB: Connection to HNP lost

----------------------------------------------------------------------------------------------------------

OMPI Info Output

                 Open MPI: 1.2.5
    Open MPI SVN revision: r16989
                 Open RTE: 1.2.5
    Open RTE SVN revision: r16989
                     OPAL: 1.2.5
        OPAL SVN revision: r16989
                   Prefix: /opt/ompi/1.2.5/gnu/64
  Configured architecture: x86_64-unknown-linux-gnu
            Configured by: root
            Configured on: Sun Jan 20 13:29:39 GMT 2008
           Configure host: mg1
                 Built by: root
                 Built on: Sun Jan 20 13:37:14 GMT 2008
               Built host: mg1
               C bindings: yes
             C++ bindings: yes
       Fortran77 bindings: yes (all)
       Fortran90 bindings: yes
  Fortran90 bindings size: small
               C compiler: gcc
      C compiler absolute: /usr/bin/gcc
             C++ compiler: g++
    C++ compiler absolute: /usr/bin/g++
       Fortran77 compiler: gfortran
   Fortran77 compiler abs: /usr/bin/gfortran
       Fortran90 compiler: gfortran
   Fortran90 compiler abs: /usr/bin/gfortran
              C profiling: yes
            C++ profiling: yes
      Fortran77 profiling: yes
      Fortran90 profiling: yes
           C++ exceptions: no
           Thread support: posix (mpi: no, progress: no)
   Internal debug support: no
      MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
          libltdl support: yes
    Heterogeneous support: yes
  mpirun default --prefix: no
            MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2.5)
               MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.5)
            MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.5)
            MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.5)
            MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.2.5)
                MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.5)
          MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.5)
          MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.5)
            MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
            MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
                 MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA coll: self (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.5)
                   MCA io: romio (MCA v1.0, API v1.0, Component v1.2.5)
                MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.5)
                MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.5)
               MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA btl: openib (MCA v1.0, API v1.0.1, Component v1.2.5)
                  MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.5)
                  MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.5)
                  MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
                  MCA mtl: psm (MCA v1.0, API v1.0, Component v1.2.5)
                 MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.5)
               MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.5)
               MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.5)
               MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.5)
                   MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.5)
                   MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.5)
                  MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
                  MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.5)
                 MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.5)
                 MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.5)
                  MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.5)
                  MCA sds: env (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.5)
                  MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2.5)

-- 
Si Hammond
Performance Prediction and Analysis Lab,
High Performance Systems Group,
University of Warwick, UK