You can eliminate the "[n17:30019] odls_bproc: openpty failed, using
pipes instead" message by configuring OMPI with the
--disable-pty-support flag. The message is caused by a bug in BProc.
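
For reference, a minimal sketch of such a rebuild. It assumes you are at the top of an unpacked Open MPI 1.2.x source tree; PREFIX is only a placeholder install location (not a path from this thread), and any other configure options you normally use would go on the same line.

```shell
# Sketch only. PREFIX is a placeholder install location, not a path
# taken from this thread; override it in the environment if needed.
PREFIX="${PREFIX:-$HOME/openmpi-1.2.1}"

if [ -x ./configure ]; then
    # Rebuild with pty support disabled, so odls_bproc never tries openpty.
    ./configure --prefix="$PREFIX" --disable-pty-support && make all install
else
    # Not in a source tree: just show the command that would be run.
    echo "run from the Open MPI source tree:" >&2
    echo "  ./configure --prefix=$PREFIX --disable-pty-support" >&2
fi
```
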
-david
--
David Gunter
HPC-4: HPC Environments: Parallel Tools Team
Los Alamos National Laboratory
On Apr 26, 2007, at 2:06 PM, Daniel Gruner wrote:
> Hi
>
> I have been testing OpenMPI 1.2, and now 1.2.1, on several BProc-
> based clusters, and I have found some problems/issues. All my
> clusters have standard ethernet interconnects, either 100Base/T or
> Gigabit, on standard switches.
>
> The clusters are all running Clustermatic 5 (BProc 4.x), and range
> from 32-bit Athlon, to 32-bit Xeon, to 64-bit Opteron. In all cases
> the same problems occur, identically. I attach here the results
> from "ompi_info --all" and the config.log, for my latest build on
> an Opteron cluster, using the Pathscale compilers. I had exactly
> the same problems when using the vanilla GNU compilers.
>
> Now for a description of the problem:
>
> When running an MPI code (cpi.c, from the standard MPI examples, also
> attached), using the mpirun defaults (e.g. -byslot), with a single
> process:
>
> sonoma:dgruner{134}> mpirun -n 1 ./cpip
> [n17:30019] odls_bproc: openpty failed, using pipes instead
> Process 0 on n17
> pi is approximately 3.1415926544231341, Error is 0.0000000008333410
> wall clock time = 0.000199
>
> However, if one tries to run more than one process, this bombs:
>
> sonoma:dgruner{134}> mpirun -n 2 ./cpip
> .
> .
> .
> [n21:30029] OOB: Connection to HNP lost
> [n21:30029] OOB: Connection to HNP lost
> [n21:30029] OOB: Connection to HNP lost
> [n21:30029] OOB: Connection to HNP lost
> [n21:30029] OOB: Connection to HNP lost
> [n21:30029] OOB: Connection to HNP lost
> .
> . ad infinitum
>
> If one uses the option "-bynode", things work:
>
> sonoma:dgruner{145}> mpirun -bynode -n 2 ./cpip
> [n17:30055] odls_bproc: openpty failed, using pipes instead
> Process 0 on n17
> Process 1 on n21
> pi is approximately 3.1415926544231318, Error is 0.0000000008333387
> wall clock time = 0.010375
>
>
> Note that there is always the message about "openpty failed, using
> pipes instead".
>
> If I run more processes (on my 3-node cluster, with 2 cpus per
> node), the openpty message appears repeatedly for the first node:
>
> sonoma:dgruner{146}> mpirun -bynode -n 6 ./cpip
> [n17:30061] odls_bproc: openpty failed, using pipes instead
> [n17:30061] odls_bproc: openpty failed, using pipes instead
> Process 0 on n17
> Process 2 on n49
> Process 1 on n21
> Process 5 on n49
> Process 3 on n17
> Process 4 on n21
> pi is approximately 3.1415926544231239, Error is 0.0000000008333307
> wall clock time = 0.050332
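
The placement in the output above matches a round-robin deal of ranks across nodes. As a rough illustration of how the two mapping options differ, here is a small sketch. The node names and the two-slots-per-node figure come from the output above; the functions node_at, byslot and bynode are hypothetical helpers written for this note, not Open MPI code.

```shell
# Illustration only: models rank placement, does not launch anything.
NODES="n17 n21 n49"   # node order as seen in the output above
SLOTS=2               # "2 cpus per node"

# Print the node name at a 0-based index, wrapping past the last node.
node_at() {
    idx=$1
    set -- $NODES
    shift "$(( idx % $# ))"
    echo "$1"
}

# -byslot: fill every slot on a node before moving to the next node.
byslot() {
    rank=0
    while [ "$rank" -lt "$1" ]; do
        echo "Process $rank on $(node_at $(( rank / SLOTS )))"
        rank=$(( rank + 1 ))
    done
}

# -bynode: hand out ranks one node at a time, round-robin.
bynode() {
    rank=0
    while [ "$rank" -lt "$1" ]; do
        echo "Process $rank on $(node_at "$rank")"
        rank=$(( rank + 1 ))
    done
}

echo "byslot, 6 processes:"; byslot 6
echo "bynode, 6 processes:"; bynode 6
```

In this model, "bynode 6" reproduces the rank-to-node pairs shown above, while "byslot 2" puts both ranks on n17. That may or may not be related to the -byslot failure; nothing in this thread confirms the cause.
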
>
>
> Should I worry about the openpty failure? I suspect that
> communications may be slower this way. Using the -byslot option
> always fails, so this is a bug. The same occurs for all the codes
> that I have tried, both simple and complex.
>
> Thanks for your attention to this.
> Regards,
> Daniel
> --
>
> Dr. Daniel Gruner dgruner_at_[hidden]
> Dept. of Chemistry daniel.gruner_at_[hidden]
> University of Toronto phone: (416)-978-8689
> 80 St. George Street fax: (416)-978-5325
> Toronto, ON M5S 3H6, Canada finger for PGP public key
> <cpi.c.gz>
> <config.log.gz>
> <ompiinfo.gz>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users