
Open MPI User's Mailing List Archives


From: Jeffrey B. Layton (laytonjb_at_[hidden])
Date: 2006-10-19 15:31:46


A small update. I was looking through the error file a bit more
(it was 159 MB) and found the following error message sequence:

[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
[o4:11242] [0,1,4]-[0,0,0] mca_oob_tcp_peer_recv_blocking: recv() failed
with errno=104
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
...
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
[o3:32205] [0,1,2]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection
failed (errno=111) - retrying (pid=32205)
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
[o3:32206] [0,1,3]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection
failed (errno=111) - retrying (pid=32206)
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
...

I don't know if this changes things (my Google searches didn't
turn up much information).

Jeff

> Good afternoon,
>
> I really hate to post asking for help with a problem, but
> my own efforts have not worked out well (probably operator
> error).
> Anyway, I'm trying to run a code that was built with PGI 6.1
> and Open MPI 1.1.1. The mpirun command looks like:
>
> mpirun --hostfile machines.${PBS_JOBID} --np ${NP} -mca btl self,sm,tcp
> ./${EXE} ${CASEPROJ} >> OUTPUT
>
> I get the following error in the PBS error file:
>
> [o1:22559] mca_oob_tcp_accept: accept() failed with errno 9.
> ...
>
> and it keeps repeating (for a long time).
>
> ompi_info gives the following output:
>
> > ompi_info
> Open MPI: 1.1.1
> Open MPI SVN revision: r11473
> Open RTE: 1.1.1
> Open RTE SVN revision: r11473
> OPAL: 1.1.1
> OPAL SVN revision: r11473
> Prefix: /usr/x86_64-pgi-6.1/openmpi-1.1.1
> Configured architecture: x86_64-suse-linux-gnu
> Configured by: root
> Configured on: Mon Oct 16 20:51:34 MDT 2006
> Configure host: lo248
> Built by: root
> Built on: Mon Oct 16 21:02:00 MDT 2006
> Built host: lo248
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: yes (all)
> Fortran90 bindings: yes
> Fortran90 bindings size: small
> C compiler: pgcc
> C compiler absolute: /opt/pgi/linux86-64/6.1/bin/pgcc
> C++ compiler: pgCC
> C++ compiler absolute: /opt/pgi/linux86-64/6.1/bin/pgCC
> Fortran77 compiler: pgf77
> Fortran77 compiler abs: /opt/pgi/linux86-64/6.1/bin/pgf77
> Fortran90 compiler: pgf90
> Fortran90 compiler abs: /opt/pgi/linux86-64/6.1/bin/pgf90
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: yes
> Fortran90 profiling: yes
> C++ exceptions: yes
> Thread support: posix (mpi: no, progress: no)
> Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1.1)
> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1.1)
> MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.1.1)
> MCA timer: linux (MCA v1.0, API v1.0, Component v1.1.1)
> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
> MCA coll: basic (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: self (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1.1)
> MCA io: romio (MCA v1.0, API v1.0, Component v1.1.1)
> MCA mpool: gm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: gm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: self (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA topo: unity (MCA v1.0, API v1.0, Component v1.1.1)
> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
> MCA gpr: null (MCA v1.0, API v1.0, Component v1.1.1)
> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1.1)
> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA iof: svc (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ns: replica (MCA v1.0, API v1.0, Component v1.1.1)
> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rml: oob (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pls: fork (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: env (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: seed (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1.1)
>
>
>
> I found this link via google:
>
> http://www.open-mpi.org/community/lists/users/2006/06/1486.php
>
> But to be honest I'm not sure how to apply this to fix my problem.
>
> Thanks!
>
> Jeff
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>