Open MPI User's Mailing List Archives

From: Jeffrey B. Layton (laytonjb_at_[hidden])
Date: 2006-10-19 15:31:46


A small update. I was looking through the error file a bit more
(it was 159MB). I found the following error message sequence:

[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
[o4:11242] [0,1,4]-[0,0,0] mca_oob_tcp_peer_recv_blocking: recv() failed with errno=104
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
...
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
[o3:32205] [0,1,2]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying (pid=32205)
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
[o3:32206] [0,1,3]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying (pid=32206)
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
...
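
With an error file this large, it can help to collapse the repeats and see which hosts emit which messages. The following is only a minimal sketch: the helper name `summarize` and the assumption that every message starts with a `[host:pid]` tag are mine, inferred from the excerpt above, not from anything Open MPI documents.

```python
import re
from collections import Counter

# Matches a leading "[host:pid] " tag such as "[o1:22805] ".
# The tag layout is an assumption based on the excerpt above.
PREFIX = re.compile(r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s*")

def summarize(lines):
    """Count occurrences of each (host, message) pair, ignoring the pid."""
    counts = Counter()
    for line in lines:
        m = PREFIX.match(line)
        if not m:
            continue  # skip "..." and wrapped continuation lines
        counts[(m.group("host"), line[m.end():].strip())] += 1
    return counts

# Three lines copied from the excerpt above, used as sample input.
sample = [
    "[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.",
    "[o4:11242] [0,1,4]-[0,0,0] mca_oob_tcp_peer_recv_blocking: recv() failed with errno=104",
    "[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.",
]
counts = summarize(sample)
for (host, msg), n in counts.most_common():
    print(n, host, msg)
```

For a real file, passing an open file object (for example `summarize(open(path, errors="replace"))`, with `path` pointing at the PBS error file) reads one line at a time, so the whole 159MB never has to sit in memory.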

I don't know if this changes things (my Google searches didn't
really turn up much information).
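
The bare errno numbers in these messages are easier to search for once they are translated into symbolic names. Here is a small sketch that does so with Python's standard `errno` and `os` modules; `describe` is a made-up helper name. The numbering is platform-specific, so this should be run on the same kind of system as the cluster (Linux here). On Linux, 9, 104, and 111 correspond to EBADF, ECONNRESET, and ECONNREFUSED.

```python
import errno
import os

def describe(code):
    """Return (symbolic name, message) for an errno number on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)

# The three codes that appear in the error file above.
for code in (9, 104, 111):
    name, text = describe(code)
    print(f"errno {code}: {name} ({text})")
```

This only names the codes; it does not say why the accept() call keeps failing.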

Jeff

> Good afternoon,
>
> I really hate to post asking for help with a problem, but
> my own efforts have not worked out well (probably operator
> error).
> Anyway, I'm trying to run a code that was built with PGI 6.1
> and OpenMPI-1.1.1. The mpirun command looks like:
>
> mpirun --hostfile machines.${PBS_JOBID} --np ${NP} -mca btl self,sm,tcp ./${EXE} ${CASEPROJ} >> OUTPUT
>
> I get the following error in the PBS error file:
>
> [o1:22559] mca_oob_tcp_accept: accept() failed with errno 9.
> ...
>
> and it keeps repeating (for a long time).
>
> ompi_info gives the following output:
>
> > ompi_info
> Open MPI: 1.1.1
> Open MPI SVN revision: r11473
> Open RTE: 1.1.1
> Open RTE SVN revision: r11473
> OPAL: 1.1.1
> OPAL SVN revision: r11473
> Prefix: /usr/x86_64-pgi-6.1/openmpi-1.1.1
> Configured architecture: x86_64-suse-linux-gnu
> Configured by: root
> Configured on: Mon Oct 16 20:51:34 MDT 2006
> Configure host: lo248
> Built by: root
> Built on: Mon Oct 16 21:02:00 MDT 2006
> Built host: lo248
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: yes (all)
> Fortran90 bindings: yes
> Fortran90 bindings size: small
> C compiler: pgcc
> C compiler absolute: /opt/pgi/linux86-64/6.1/bin/pgcc
> C++ compiler: pgCC
> C++ compiler absolute: /opt/pgi/linux86-64/6.1/bin/pgCC
> Fortran77 compiler: pgf77
> Fortran77 compiler abs: /opt/pgi/linux86-64/6.1/bin/pgf77
> Fortran90 compiler: pgf90
> Fortran90 compiler abs: /opt/pgi/linux86-64/6.1/bin/pgf90
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: yes
> Fortran90 profiling: yes
> C++ exceptions: yes
> Thread support: posix (mpi: no, progress: no)
> Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1.1)
> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1.1)
> MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.1.1)
> MCA timer: linux (MCA v1.0, API v1.0, Component v1.1.1)
> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
> MCA coll: basic (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: self (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1.1)
> MCA io: romio (MCA v1.0, API v1.0, Component v1.1.1)
> MCA mpool: gm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: gm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: self (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA topo: unity (MCA v1.0, API v1.0, Component v1.1.1)
> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
> MCA gpr: null (MCA v1.0, API v1.0, Component v1.1.1)
> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1.1)
> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA iof: svc (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ns: replica (MCA v1.0, API v1.0, Component v1.1.1)
> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rml: oob (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pls: fork (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: env (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: seed (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1.1)
>
>
>
> I found this link via Google:
>
> http://www.open-mpi.org/community/lists/users/2006/06/1486.php
>
> But to be honest I'm not sure how to apply this to fix my problem.
>
> Thanks!
>
> Jeff