Open MPI User's Mailing List Archives

From: Patrick Jessee (pj_at_[hidden])
Date: 2006-06-26 16:10:37


Hello. This may be a usage issue, but we did not have this problem with
version 1.0.2.

When starting a parallel job (using TCP) with mpirun, the following
message is repeated many times:

[devi01:24440] mca_oob_tcp_accept: accept() failed with errno 9.
[devi01:24440] mca_oob_tcp_accept: accept() failed with errno 9.
[devi01:24440] mca_oob_tcp_accept: accept() failed with errno 9.
[devi01:24440] mca_oob_tcp_accept: accept() failed with errno 9.
[devi01:24440] mca_oob_tcp_accept: accept() failed with errno 9.
[devi01:24440] mca_oob_tcp_accept: accept() failed with errno 9.
:
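(For reference, errno 9 on Linux is EBADF, "Bad file descriptor", so accept() is apparently being called on a listener socket descriptor that is no longer valid. The following is just a minimal stand-alone sketch, not Open MPI code, showing how accept() produces exactly this errno when the descriptor has been closed or corrupted:)

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>

int main(void)
{
    /* create a TCP socket, then deliberately invalidate the descriptor */
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    close(fd);

    /* accept() on a closed descriptor fails with errno 9 (EBADF) */
    if (accept(fd, NULL, NULL) < 0) {
        fprintf(stderr, "accept: %s (errno %d)\n", strerror(errno), errno);
    }
    return 0;
}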

The job is started with the following command:

mpirun --prefix <path_to_openmpi> --x LD_LIBRARY_PATH --mca btl sm,self,tcp --np 2 --host devi01 <program_name>

(note: <path_to_openmpi> and <program_name> are placeholders, not literal values)
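(For what it's worth, <program_name> is an ordinary MPI application; a minimal stand-in such as the following, not our actual code, is the kind of program being launched, and the messages appear during job startup rather than during application communication:)

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    /* minimal MPI program: initialize, report rank/size, finalize */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}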

ompi_info gives the following output:

> ompi_info
             Open MPI: 1.1
Open MPI SVN revision: r10477
             Open RTE: 1.1
Open RTE SVN revision: r10477
                 OPAL: 1.1
    OPAL SVN revision: r10477
               Prefix: <path_to_openmpi>
Configured architecture: x86_64-unknown-linux-gnu
        Configured by: devuser
        Configured on: Mon Jun 26 15:00:16 EDT 2006
       Configure host: cello
             Built by: devuser
             Built on: Mon Jun 26 15:09:30 EDT 2006
           Built host: cello
           C bindings: yes
         C++ bindings: no
   Fortran77 bindings: no
   Fortran90 bindings: no
Fortran90 bindings size: na
           C compiler: gcc
  C compiler absolute: /usr/bin/gcc
         C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
   Fortran77 compiler: g77
Fortran77 compiler abs: /usr/bin/g77
   Fortran90 compiler: none
Fortran90 compiler abs: none
          C profiling: yes
        C++ profiling: yes
  Fortran77 profiling: no
  Fortran90 profiling: no
       C++ exceptions: no
       Thread support: posix (mpi: no, progress: no)
Internal debug support: no
  MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
      libltdl support: yes
           MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1)
        MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1)
        MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1)
        MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.1)
            MCA timer: linux (MCA v1.0, API v1.0, Component v1.1)
        MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
        MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
             MCA coll: basic (MCA v1.0, API v1.0, Component v1.1)
             MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1)
             MCA coll: self (MCA v1.0, API v1.0, Component v1.1)
             MCA coll: sm (MCA v1.0, API v1.0, Component v1.1)
             MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1)
               MCA io: romio (MCA v1.0, API v1.0, Component v1.1)
            MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1)
              MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1)
              MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1)
           MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1)
              MCA btl: self (MCA v1.0, API v1.0, Component v1.1)
              MCA btl: sm (MCA v1.0, API v1.0, Component v1.1)
              MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
             MCA topo: unity (MCA v1.0, API v1.0, Component v1.1)
              MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
              MCA gpr: null (MCA v1.0, API v1.0, Component v1.1)
              MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1)
              MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1)
              MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1)
              MCA iof: svc (MCA v1.0, API v1.0, Component v1.1)
               MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1)
               MCA ns: replica (MCA v1.0, API v1.0, Component v1.1)
              MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
              MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1)
              MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1)
              MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1)
              MCA ras: slurm (MCA v1.0, API v1.0, Component v1.1)
              MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1)
              MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1)
            MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1)
             MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1)
             MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1)
              MCA rml: oob (MCA v1.0, API v1.0, Component v1.1)
              MCA pls: fork (MCA v1.0, API v1.0, Component v1.1)
              MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1)
              MCA pls: slurm (MCA v1.0, API v1.0, Component v1.1)
              MCA sds: env (MCA v1.0, API v1.0, Component v1.1)
              MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1)
              MCA sds: seed (MCA v1.0, API v1.0, Component v1.1)
              MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1)
              MCA sds: slurm (MCA v1.0, API v1.0, Component v1.1)

-----------

The problem happens regardless of whether only the local node or remote nodes are involved.
Any idea what the issue is? Again, we saw no problems like this with 1.0.2.
Thanks,

-Patrick