Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] 1.7.4rc3 static link failure on Solaris
From: Paul Hargrove (phhargrove_at_[hidden])
Date: 2014-02-03 18:24:07


Fix confirmed.

I've confirmed that:
a) the v1.7 branch no longer fails to build due to the undefined symbol MB
b) there is no VT issue; Nathan's patch fixed the root cause of the
problem

-Paul

On Mon, Feb 3, 2014 at 10:18 AM, Ralph Castain <rhc_at_[hidden]> wrote:

> Thanks Nathan - I fixed this in 1.7.4, and I've sent a note to the VT guys
> about the problem there. However, I won't hold things up just for the VT
> fix - we can catch it for 1.7.5
>
> Thanks Paul!
> Ralph
>
> On Feb 3, 2014, at 9:07 AM, Nathan Hjelm <hjelmn_at_[hidden]> wrote:
>
> > basesmuma is calling MB directly instead of calling
> > opal_atomic_[rw]mb. I fixed this in trunk, and the same thing could be
> > done in 1.7 with a simple query-replace MB -> opal_atomic_wmb. ORNL was
> > using MB because opal_atomic_[rw]mb is a no-op on some platforms. I
> > don't think this should be an issue, since memory access should already
> > be ordered wherever opal_atomic_[rw]mb is a no-op. If not, we should fix
> > that in opal.
> >
> > -Nathan
> >
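
A minimal sketch of the kind of change described above, not the actual basesmuma patch: a bare MB() call, which is an undeclared identifier on platforms where the macro is missing, is replaced by a named write barrier. The names sketch_atomic_wmb, sketch_atomic_rmb, struct ctrl, publish and consume are hypothetical, and the C11 fences only stand in for opal_atomic_wmb / opal_atomic_rmb, whose real definitions in OPAL are platform-specific and are not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for OPAL's write and read barriers.
 * Assumption: C11 fences are used here purely for illustration. */
static inline void sketch_atomic_wmb(void)
{
    atomic_thread_fence(memory_order_release);
}

static inline void sketch_atomic_rmb(void)
{
    atomic_thread_fence(memory_order_acquire);
}

/* Hypothetical shared-memory control block: a payload plus a ready flag. */
struct ctrl {
    int payload;
    atomic_int flag;
};

/* Writer: store the payload, then order it before the flag update. */
static void publish(struct ctrl *c, int value)
{
    assert(c != NULL);
    c->payload = value;
    sketch_atomic_wmb();            /* previously a bare MB(); call */
    atomic_store_explicit(&c->flag, 1, memory_order_relaxed);
}

/* Reader: wait for the flag, then order the payload read after it. */
static int consume(struct ctrl *c)
{
    assert(c != NULL);
    while (atomic_load_explicit(&c->flag, memory_order_relaxed) == 0) {
        /* spin until the writer has published */
    }
    sketch_atomic_rmb();
    return c->payload;
}
```

Because the barrier is an ordinary declared function rather than a macro that may be absent, the compiler never falls back to an implicit declaration, so no unresolved MB symbol can reach the linker.
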
> > On Sun, Feb 02, 2014 at 01:33:41PM -0800, Paul Hargrove wrote:
> >> Following up on my previous reports and using 1.7.4rc3:
> >> The error I see only occurs with --enable-static.
> >> When I do enable static libs, I get a build failure when linking
> >> otfmerge-mpi, due to undefined symbol "MB".
> >>
> >> When building with gcc:
> >>   CCLD otfmerge-mpi
> >>   gcc: unrecognized option `-pthread'
> >>   Undefined                       first referenced
> >>    symbol                             in file
> >>   MB                                  /home/hargrove/OMPI/openmpi-1.7.4rc3-solaris10-sparcT2-gcc346-v9/BLD/ompi/contrib/vt/vt/../../../.libs/libmpi.so
> >>   ld: fatal: Symbol referencing errors. No output written to .libs/otfmerge-mpi
> >>   collect2: ld returned 1 exit status
> >>   *** Error code 1
> >>
> >> When building with Solaris Studio 12.3 compilers:
> >>   CCLD otfmerge-mpi
> >>   Undefined                       first referenced
> >>    symbol                             in file
> >>   MB                                  /home/hargrove/OMPI/openmpi-1.7.4rc2-solaris10-sparcT2-ss12u3-v9/BLD/ompi/contrib/vt/vt/../../../.libs/libmpi.so
> >>   ld: fatal: Symbol referencing errors. No output written to .libs/otfmerge-mpi
> >>   *** Error code 2
> >>
> >> This is independent of ABI (v9 vs v8plus).
> >> If I avoid otfmerge-mpi by configuring with --disable-vt, then the link
> >> failure occurs building ompi_info instead.
> >> So, I don't think this is a vt-specific problem. Consistent with that, I
> >> found the following warnings in the make output:
> >>
> >>   "/home/hargrove/OMPI/openmpi-1.7.4rc2-solaris10-sparcT2-ss12u3-v9/openmpi-1.7.4rc2/ompi/mca/bcol/basesmuma/bcol_basesmuma_bcast.c", line 183: warning: implicit function declaration: MB
> >>   "/home/hargrove/OMPI/openmpi-1.7.4rc2-solaris10-sparcT2-ss12u3-v9/openmpi-1.7.4rc2/ompi/mca/bcol/basesmuma/bcol_basesmuma_fanin.c", line 66: warning: implicit function declaration: MB
> >>   "/home/hargrove/OMPI/openmpi-1.7.4rc2-solaris10-sparcT2-ss12u3-v9/openmpi-1.7.4rc2/ompi/mca/bcol/basesmuma/bcol_basesmuma_fanout.c", line 64: warning: implicit function declaration: MB
> >>   "/home/hargrove/OMPI/openmpi-1.7.4rc2-solaris10-sparcT2-ss12u3-v9/openmpi-1.7.4rc2/ompi/mca/bcol/basesmuma/bcol_basesmuma_rk_barrier.c", line 97: warning: implicit function declaration: MB
> >>   "/home/hargrove/OMPI/openmpi-1.7.4rc2-solaris10-sparcT2-ss12u3-v9/openmpi-1.7.4rc2/ompi/mca/bcol/basesmuma/bcol_basesmuma_rd_nb_barrier.c", line 75: warning: implicit function declaration: MB
> >>   "/home/hargrove/OMPI/openmpi-1.7.4rc2-solaris10-sparcT2-ss12u3-v9/openmpi-1.7.4rc2/ompi/mca/bcol/basesmuma/bcol_basesmuma_bcast_prime.c", line 156: warning: implicit function declaration: MB
> >>
> >> Those are all the warnings I see regarding MB (all in bcol/basesmuma).
> >> -Paul
> >>
> >> On Wed, Jan 29, 2014 at 2:17 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
> >>
> >> On Wed, Jan 29, 2014 at 9:19 AM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
> >>
> >> For Solaris-10 with the Solaris Studio 12.3 compilers on SPARC I have
> >> encountered a link failure when configured with "--enable-static
> >> --enable-shared" (fine w/o "--enable-static"). I have not yet tried
> >> this configuration with gcc. I have started builds of 1.7.3 to
> >> determine if this is a regression or not before investing more deeply.
> >> I hope to be able to report more tonight.
> >>
> >> The problem is also present in 1.7.3 and thus NOT a (recent) regression.
> >> More information will follow eventually, but knowing that this problem
> >> isn't new significantly reduces the urgency (at least for me).
> >> -Paul
> >> --
> >> Paul H. Hargrove PHHargrove_at_[hidden]
> >> Future Technologies Group
> >> Computer and Data Sciences Department Tel: +1-510-495-2352
> >> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> >>
> >
> >> _______________________________________________
> >> devel mailing list
> >> devel_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900