Open MPI Development Mailing List Archives

Subject: [OMPI devel] 1.4.5rc2 Solaris results [libtool problem]
From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2012-01-20 02:21:41


As promised earlier today, here are results from my Solaris platforms.
Note that there are libtool-related failures below that may be worth
pursuing.
If necessary, access to most of my machines can be arranged for
qualified persons.

== GNU compilers with {C,CXX,F77,FC}FLAGS=-mcpu=v9 on SPARCs, and -m64
on amd64

PASS:
     solaris-10 s10_69/sun4u (w/ g77, no FC)
     solaris-10 Generic_142901-03/i386 (w/ Sun's f77 and f95, both dated
April 2009)
     solaris-11 snv_151a/amd64 [including ofud, openib and dapl] (w/
g77, no FC)

FAIL:
     solaris-10 Generic_137111-07/sun4v with default GNU compilers
Using the system default gcc, which is actually Sun's gccfss-4.0.4, there
are assertion failures seen in the atomics in "make check". I can
provide details if anybody cares, but I know from past experience that
support for gcc-style inline asm is marginal in this compiler.

== Sun Studio 12.2 compilers w/ {C,CXX,F77,FC}FLAGS=-m64 on SPARCs and amd64

Both of my SPARC systems appear to have an out-of-date libmtsk.so, which
both prevents the Sun f77 and f90 compilers from running at all, and
additionally leads to failures like the following when building OpenMP
support in VT:
> /bin/bash ../../libtool --tag=CXX --mode=link sunCC -xopenmp
> -DVT_OMP -m64 -xopenmp -o vtfilter vtfilter-vt_filter.o
> vtfilter-vt_filthandler.o vtfilter-vt_otfhandler.o
> vtfilter-vt_tracefilter.o ../../util/util.o -L../../extlib/otf/otflib
> -L../../extlib/otf/otflib/.libs -lotf -lz -lsocket -lnsl -lrt -lm
> -lthread
> libtool: link: sunCC -xopenmp -DVT_OMP -m64 -xopenmp -o vtfilter
> vtfilter-vt_filter.o vtfilter-vt_filthandler.o
> vtfilter-vt_otfhandler.o vtfilter-vt_tracefilter.o ../../util/util.o
> -L/home/hargrove/OMPI/openmpi-1.4.5rc2-solaris10-sparcT2-ss12u2/BLD/ompi/contrib/vt/vt/extlib/otf/otflib/.libs
> -L/home/hargrove/OMPI/openmpi-1.4.5rc2-solaris10-sparcT2-ss12u2/BLD/ompi/contrib/vt/vt/extlib/otf/otflib
> /home/hargrove/OMPI/openmpi-1.4.5rc2-solaris10-sparcT2-ss12u2/BLD/ompi/contrib/vt/vt/extlib/otf/otflib/.libs/libotf.a
> -lz -lsocket -lnsl -lrt -lm -lthread
> CC: Warning: Optimizer level changed from 0 to 3 to support
> parallelized code.
> Undefined first referenced
> symbol in file
> __mt_MasterFunction_cxt_ vtfilter-vt_tracefilter.o
> ld: fatal: Symbol referencing errors. No output written to vtfilter
> *** Error code 2
This is due to a lack of required Solaris patches and is NOT an ompi or vt
problem to be solved.
However, as a result my two SPARC platforms are configured with
    --disable-mpi-f77 --disable-mpi-f90
--with-contrib-vt-flags="--disable-omp --disable-hyb"
[It took a bit of work to figure out how to disable OMP and not just VT in
its entirety.]
I report this info just to note that my SPARC testing is "narrower" than
on my x86 and amd64 machines.

The one "real" problem I found appears to be libtool related and
impacted all 4 platforms:
     solaris-10 s10_69/sun4u
     solaris-10 Generic_142901-03/i386
     solaris-11 snv_151a/amd64 [including ofud, openib and dapl]
     solaris-10 Generic_137111-07/sun4v
No problem with "make all" or with "make check", but "make install"
fails with:
> Making install in mpi/cxx
> make[2]: Entering directory
> `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
> make[3]: Entering directory
> `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
> test -z
> "/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib"
> || /usr/gnu/bin/mkdir -p
> "/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib"
> /bin/sh ../../../libtool --mode=install /usr/bin/ginstall -c
> 'libmpi_cxx.la'
> '/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib/libmpi_cxx.la'
> libtool: install: warning: relinking `libmpi_cxx.la'
> libtool: install: (cd
> /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx;
> /bin/sh
> /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/libtool
> --tag CXX --mode=relink sunCC -O -DNDEBUG -m64 -version-info 0:1:0
> -export-dynamic -o libmpi_cxx.la -rpath
> /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib
> mpicxx.lo intercepts.lo comm.lo datatype.lo win.lo file.lo
> ../../../ompi/libmpi.la -lsocket -lnsl -lm -lthread )
> mv: cannot stat `libmpi_cxx.so.0.0.1': No such file or directory
> libtool: install: error: relink `libmpi_cxx.la' with the above command
> before installing it
> make[3]: *** [install-libLTLIBRARIES] Error 1
> make[3]: Leaving directory
> `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
> make[2]: *** [install-am] Error 2
> make[2]: Leaving directory
> `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
> make[1]: *** [install-recursive] Error 1
> make[1]: Leaving directory
> `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi'
> make: *** [install-recursive] Error 1
No such problem was seen w/ the GNU compilers on the same 4 systems.
This looks to be a libtool bug in support for the Sun C++ compiler,
especially since configuring with "--enable-static --disable-shared"
eliminates this problem (but is undesirable for obvious reasons).
A quick peek appears to show some dangling symlinks:
> $ ls -l ompi/mpi/cxx/.libs/
> total 869
> -rw-r--r-- 1 phargrov staff 116944 2012-01-19 18:09 comm.o
> -rw-r--r-- 1 phargrov staff 41644 2012-01-19 18:09 datatype.o
> -rw-r--r-- 1 phargrov staff 17240 2012-01-19 18:09 file.o
> -rw-r--r-- 1 phargrov staff 222592 2012-01-19 18:09 intercepts.o
> lrwxrwxrwx 1 phargrov staff 16 2012-01-19 18:09 libmpi_cxx.la ->
> ../libmpi_cxx.la
> -rw-r--r-- 1 phargrov staff 1262 2012-01-19 18:09 libmpi_cxx.lai
> lrwxrwxrwx 1 phargrov staff 19 2012-01-19 18:09 libmpi_cxx.so ->
> libmpi_cxx.so.0.0.1
> lrwxrwxrwx 1 phargrov staff 19 2012-01-19 18:09 libmpi_cxx.so.0 ->
> libmpi_cxx.so.0.0.1
> -rw-r--r-- 1 phargrov staff 267364 2012-01-19 18:09 mpicxx.o
> -rw-r--r-- 1 phargrov staff 46660 2012-01-19 18:09 win.o
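A quick way to confirm which of the links in a listing like the one above
are dangling, as a minimal sketch: the function name
list_dangling_symlinks is hypothetical (it is not part of libtool or
Open MPI), and a POSIX-style shell such as bash or ksh is assumed.

```shell
# Print every symlink directly inside a directory whose target is missing.
# Hypothetical helper for illustration only.
list_dangling_symlinks() {
    dir=${1:-.}
    for entry in "$dir"/*; do
        # -h: the entry itself is a symlink
        # ! -e: following the link does not reach an existing file
        if [ -h "$entry" ] && [ ! -e "$entry" ]; then
            echo "$entry"
        fi
    done
}
```

It would be run from the build tree as, for example,
"list_dangling_symlinks ompi/mpi/cxx/.libs".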
It is possible that the library build failed in "make all" w/o
terminating the build (I've seen such things before).
The initial evidence in the "make all" output does suggest no shared lib
was built:
> /bin/sh ../../../libtool --tag=CXX --mode=link sunCC -O -DNDEBUG
> -m64 -version-info 0:1:0 -export-dynamic -o libmpi_cxx.la -rpath
> /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-ss12u2/INST/lib
> mpicxx.lo intercepts.lo comm.lo datatype.lo win.lo file.lo
> ../../../ompi/libmpi.la -lsocket -lnsl -lm -lthread
> libtool: link: (cd ".libs" && rm -f "libmpi_cxx.so.0" && ln -s
> "libmpi_cxx.so.0.0.1" "libmpi_cxx.so.0")
> libtool: link: (cd ".libs" && rm -f "libmpi_cxx.so" && ln -s
> "libmpi_cxx.so.0.0.1" "libmpi_cxx.so")
> libtool: link: ( cd ".libs" && rm -f "libmpi_cxx.la" && ln -s
> "../libmpi_cxx.la" "libmpi_cxx.la" )
Note the lack of any suncc or sunCC command between the libtool command
line and the "rm && ln" commands.
Inspecting the configure-generated libtool confirms what looks like
improper config for tag=CXX:
> $ grep -e "BEGIN LIBTOOL TAG CONFIG: [FC]" -e ^archive_cmds libtool
> archive_cmds="\$CC -G\${allow_undefined_flag} -h \$soname -o \$lib
> \$libobjs \$deplibs \$compiler_flags"
> # ### BEGIN LIBTOOL TAG CONFIG: CXX
> archive_cmds=""
> # ### BEGIN LIBTOOL TAG CONFIG: F77
> archive_cmds="\$CC -G\${allow_undefined_flag} -h \$soname -o \$lib
> \$libobjs \$deplibs \$compiler_flags"
> # ### BEGIN LIBTOOL TAG CONFIG: FC
> archive_cmds="\$CC -G\${allow_undefined_flag} -h \$soname -o \$lib
> \$libobjs \$deplibs \$compiler_flags"
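The same check can be scripted so that it names the affected tag
directly. This is only a sketch: report_empty_archive_cmds is a
hypothetical helper, it assumes a POSIX awk (nawk on Solaris), and it
relies solely on the "BEGIN LIBTOOL TAG CONFIG" markers and the
archive_cmds lines shown in the grep output above.

```shell
# Report which sections of a configure-generated libtool script have an
# empty archive_cmds. Hypothetical helper for illustration only.
# $1: path to the generated libtool script.
report_empty_archive_cmds() {
    awk '
        # Remember the most recent tag marker; lines seen before the
        # first marker belong to the default (untagged) configuration.
        /^# ### BEGIN LIBTOOL TAG CONFIG: / { tag = $NF }
        # An empty archive_cmds means no shared-library link command.
        /^archive_cmds=""$/ {
            if (tag == "") print "default"; else print tag
        }
    ' "$1"
}
```

Run as "report_empty_archive_cmds ./libtool" from the top of the build
directory; any tag it prints has no command for linking a shared library.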
I'll be happy to provide all or part of config.log to Ralf W. or other
interested persons to debug this.

I suppose I could have tried w/o C++ bindings instead of disabling
shared libraries, but with F77 and F90 bindings already disabled on the
SPARCs that didn't feel to me like a very good use of my time.

An additional annoyance on one platform:
     solaris-10 Generic_142901-03/i386
This platform is additionally unhappy with the atomics, yielding the
following warnings for every file that includes atomic.h:
> "/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h",
> line 170: warning: impossible constraint for "%1" asm operand
> "/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h",
> line 170: warning: parameter in inline asm statement unused: %2
> "/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h",
> line 187: warning: impossible constraint for "%1" asm operand
> "/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h",
> line 187: warning: parameter in inline asm statement unused: %2
This is annoying, but "make check" passes all tests.

-Paul

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900