Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [EXTERNAL] 1.7.4rc2r30031 - OpenBSD-5 mpirun hangs
From: Paul Hargrove (phhargrove_at_[hidden])
Date: 2013-12-20 17:48:03


Brian,

Of course, I should have thought of that myself.
See below for backtrace from a singleton run.

I'm starting an --enable-debug build to maybe get some line number info too.

-Paul

(gdb) where
#0 0x00000406457a9e3a in nanosleep () at <stdin>:2
#1 0x000004063947e2d4 in nanosleep (rqtp=0x7f7ffffeca30, rmtp=0x0)
    at /usr/src/lib/librthread/rthread_cancel.c:274
#2 0x0000040644a5a89b in orte_routed_base_register_sync ()
   from /home/phargrov/OMPI/openmpi-1.7-latest-openbsd5-amd64/INST/lib/libopen-rte.so.7.0
#3 0x00000406490d943c in init_routes ()
   from /home/phargrov/OMPI/openmpi-1.7-latest-openbsd5-amd64/INST/lib/openmpi/mca_routed_binomial.so
#4 0x0000040644a3c37f in orte_ess_base_app_setup ()
   from /home/phargrov/OMPI/openmpi-1.7-latest-openbsd5-amd64/INST/lib/libopen-rte.so.7.0
#5 0x000004063eb1797d in rte_init ()
   from /home/phargrov/OMPI/openmpi-1.7-latest-openbsd5-amd64/INST/lib/openmpi/mca_ess_env.so
#6 0x0000040644a1a3fe in orte_init ()
   from /home/phargrov/OMPI/openmpi-1.7-latest-openbsd5-amd64/INST/lib/libopen-rte.so.7.0
#7 0x00000406482c7976 in ompi_mpi_init ()
   from /home/phargrov/OMPI/openmpi-1.7-latest-openbsd5-amd64/INST/lib/libmpi.so.4.0
#8 0x00000406482eac92 in PMPI_Init ()
   from /home/phargrov/OMPI/openmpi-1.7-latest-openbsd5-amd64/INST/lib/libmpi.so.4.0
#9 0x0000040438c01093 in main (argc=1, argv=0x7f7ffffece60) at ring_c.c:19
Current language: auto; currently asm
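
For anyone reading the trace without gdb at hand, the call chain can be pulled out mechanically. The snippet below is only a rough sketch: `parse_frames` and `call_chain` are hypothetical helper names made up for this illustration, and the pattern assumes the frame lines look like the ones pasted above (it ignores the "from <library>" and "at <file>" continuation lines).

```python
import re

# Frame lines copied from the backtrace above. The "from <library>" lines are
# left out because the pattern below only looks at lines that start with "#".
BACKTRACE = """\
#0 0x00000406457a9e3a in nanosleep () at <stdin>:2
#1 0x000004063947e2d4 in nanosleep (rqtp=0x7f7ffffeca30, rmtp=0x0)
#2 0x0000040644a5a89b in orte_routed_base_register_sync ()
#3 0x00000406490d943c in init_routes ()
#4 0x0000040644a3c37f in orte_ess_base_app_setup ()
#5 0x000004063eb1797d in rte_init ()
#6 0x0000040644a1a3fe in orte_init ()
#7 0x00000406482c7976 in ompi_mpi_init ()
#8 0x00000406482eac92 in PMPI_Init ()
#9 0x0000040438c01093 in main (argc=1, argv=0x7f7ffffece60) at ring_c.c:19
"""

# "#<n> [<address> in ]<function> (" ; the address part is optional because
# gdb omits it for some frames.
FRAME = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\(")


def parse_frames(text):
    """Return a list of (frame_number, function_name) tuples."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def call_chain(text):
    """Join function names from the outermost frame (main) to the innermost."""
    frames = sorted(parse_frames(text), reverse=True)
    return " -> ".join(name for _, name in frames)


if __name__ == "__main__":
    print(call_chain(BACKTRACE))
```

Read outermost to innermost, the chain shows that the call to MPI_Init at ring_c.c:19 has not returned, and that the process is sleeping in nanosleep underneath orte_routed_base_register_sync.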

On Fri, Dec 20, 2013 at 2:38 PM, Barrett, Brian W <bwbarre_at_[hidden]> wrote:

> Paul -
>
> Any chance you could grab a stack trace from the MPI app? That's probably
> the fastest next step.
>
> Brian
>
> -----Original Message-----
> *From: *Paul Hargrove [phhargrove_at_[hidden]]
> *Sent: *Friday, December 20, 2013 03:33 PM Mountain Standard Time
> *To: *Open MPI Developers
> *Subject: *[EXTERNAL] [OMPI devel] 1.7.4rc2r30031 - OpenBSD-5 mpirun hangs
>
> With plenty of help from Jeff and Ralph's bug fixes in the past 24 hours,
> I can now build OMPI for OpenBSD. However, running even a simple example
> fails:
>
> Having set PATH and LD_LIBRARY_PATH:
> $ mpirun -np 1 examples/ring_c
> just hangs
>
> Output from "top" shows idle procs:
>   PID USERNAME PRI NICE  SIZE   RES STATE   WAIT     TIME   CPU COMMAND
> 31841 phargrov  10    0 2140K 3960K sleep/1 nanosle  0:00 0.00% ring_c
> 13490 phargrov   2    0 2540K 4892K sleep/1 poll     0:00 0.00% orterun
>
> Distrusting the env vars and relying instead on the auto-prefix
> behavior:
> $ /home/phargrov/OMPI/openmpi-1.7-latest-openbsd5-amd64/INST/bin/mpirun -np 1 examples/ring_c
> also hangs
>
> Not sure exactly what to infer from this, but a "bogus" btl doesn't
> produce any complaint, which may indicate how far startup got:
> $ mpirun -mca btl bogus -np 1 examples/ring_c
> Still hangs, and no complaint about the btl selection
>
> All three cases above are singleton (-np 1) runs, but the behavior with
> "-np 2" is the same.
>
> This does NOT appear to be an ORTE problem:
> -bash-4.2$ orterun -np 1 date
> Fri Dec 20 14:11:42 PST 2013
> -bash-4.2$ orterun -np 2 date
> Fri Dec 20 14:11:45 PST 2013
> Fri Dec 20 14:11:45 PST 2013
>
> Let me know what sort of verbose mca parameters to set and I'll collect
> the info.
> Compressed output of "ompi_info --all" is attached.
>
> -Paul
>
> --
> Paul H. Hargrove PHHargrove_at_[hidden]
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900