Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] SOLVED: 1.4.5rc2 Solaris results [libtool problem]
From: TERRY DONTJE (terry.dontje_at_[hidden])
Date: 2012-01-26 05:10:26


This is awesome Paul, thanks a lot! I'll put some verbiage into the
README and submit a CMR.

--td

On 1/26/2012 2:49 AM, Paul H. Hargrove wrote:
> I am pleased to report that w/ help from Terry I can now build nearly
> everything w/ the Solaris Studio 12.2 and 12.3 compilers.
> Upon comparing our build environments we discovered that CXX=CC works
> but CXX=sunCC does not, even though they are both symlinks to the same
> compiler executable. I still don't know the root cause (though
> libtool and associated configure logic is still the obvious suspect),
> but the workaround is simple:
> When using the Solaris Studio compilers on Solaris, one must set
> CXX=CC rather than CXX=sunCC.
>
> So I am following that advice, and have additionally:
> + gotten up-to-date patches applied to resolve my FORTRAN and OMP
> issues on the SPARC-T2 system.
> + installed both 12.2 and 12.3 compilers on Linux/x86-64
>
> So, I can now report the following ALL work (defined as "make all
> check install") with BOTH 12.2 and 12.3 Solaris Studio compilers.
> The only configure flags are --prefix, setting the CC, CXX, F77 and FC
> variables, and (when appropriate) setting *FLAGS=-m64.
> solaris-10 s10_69/sun4u (w/ -m64)
> solaris-10 Generic_137111-07/sun4v (w/ -m64)
> solaris-11 snv_151a/amd64 [including ofud, openib and dapl] (w/ -m64)
> linux/x86-64 (no -m64 needed because it is the default)
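
A minimal sketch of the configure line described above, for anyone who
wants to reproduce it. It only prints the command rather than running
it. The function name and the prefix path are hypothetical, the driver
names cc, f77 and f90 are assumed (only CXX=CC is stated in the
report), and "*FLAGS" is assumed to mean the standard Autoconf
variables CFLAGS, CXXFLAGS, FFLAGS and FCFLAGS.

```shell
#!/bin/sh
# Sketch only: builds and prints a configure command, does not run it.
# ompi_configure_cmd is a hypothetical helper, not part of Open MPI.
ompi_configure_cmd() {
    prefix=$1      # install prefix (any writable path)
    abi_flag=$2    # e.g. -m64, or empty where 64-bit is the default

    # CXX=CC rather than CXX=sunCC is the workaround reported above.
    # CC=cc, F77=f77 and FC=f90 are assumed Studio driver names.
    cmd="./configure --prefix=$prefix CC=cc CXX=CC F77=f77 FC=f90"

    if [ -n "$abi_flag" ]; then
        # Assumption: *FLAGS expands to these four Autoconf variables.
        cmd="$cmd CFLAGS=$abi_flag CXXFLAGS=$abi_flag"
        cmd="$cmd FFLAGS=$abi_flag FCFLAGS=$abi_flag"
    fi

    printf '%s\n' "$cmd"
}

# Solaris (SPARC or amd64): pass -m64.
ompi_configure_cmd /opt/openmpi-1.4.5rc2 -m64
# Linux/x86-64: no -m64 needed because it is the default.
ompi_configure_cmd /opt/openmpi-1.4.5rc2 ""
```

Printing the command first makes it easy to compare two build
environments before committing to a full "make all check install".
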
>
> The following works w/ the 12.2 compilers:
> solaris-10 Generic_142901-03/i386
> However, the f77/f90 compilers in 12.3 are generating code using SSE2
> instructions even when passed -xarch=pentium_pro.
> So this machine cannot run the resulting executables. So, I had to
> --disable-mpi-f77 to get things to work.
> That, however, is NOT an OMPI problem.
>
> -Paul
>
> On 1/19/2012 11:21 PM, Paul H. Hargrove wrote:
>> As promised earlier today, here are results from my Solaris platforms.
>> Note that there are libtool-related failures below that may be worth
>> pursuing.
>> If necessary, access to most of my machines can be arranged for
>> qualified persons.
>>
>> == GNU compilers with {C,CXX,F77,FC}FLAGS=-mcpu=v9 on SPARCs, and
>> -m64 on amd64
>>
>> PASS:
>> solaris-10 s10_69/sun4u (w/ g77, no FC)
>> solaris-10 Generic_142901-03/i386 (w/ Sun's f77 and f95, both
>> dated April 2009)
>> solaris-11 snv_151a/amd64 [including ofud, openib and dapl] (w/
>> g77, no FC)
>>
>> FAIL:
>> solaris-10 Generic_137111-07/sun4v with default GNU compilers
>> Using system default gcc, which is actually Sun's gccfss-4.0.4, there
>> are assertion failures seen in the atomics in "make check". I can
>> provide details if anybody cares, but I know from past experience
>> that support for gcc-style inline asm is marginal in this compiler.
>>
>> == Sun Studio 12.2 compilers w/ {C,CXX,F77,FC}FLAGS=-m64 on SPARCs and amd64
>>
>> Both of my SPARC systems appear to have an out-of-date libmtsk.so,
>> which both prevents the Sun f77 and f90 compilers from running at
>> all, and additionally leads to failure like the following when
>> building OpenMP support in VT:
>>> /bin/bash ../../libtool --tag=CXX --mode=link sunCC -xopenmp
>>> -DVT_OMP -m64 -xopenmp -o vtfilter vtfilter-vt_filter.o
>>> vtfilter-vt_filthandler.o vtfilter-vt_otfhandler.o
>>> vtfilter-vt_tracefilter.o ../../util/util.o
>>> -L../../extlib/otf/otflib -L../../extlib/otf/otflib/.libs -lotf -lz
>>> -lsocket -lnsl -lrt -lm -lthread
>>> libtool: link: sunCC -xopenmp -DVT_OMP -m64 -xopenmp -o vtfilter
>>> vtfilter-vt_filter.o vtfilter-vt_filthandler.o
>>> vtfilter-vt_otfhandler.o vtfilter-vt_tracefilter.o
>>> ../../util/util.o
>>> -L/home/hargrove/OMPI/openmpi-1.4.5rc2-solaris10-sparcT2-ss12u2/BLD/ompi/contrib/vt/vt/extlib/otf/otflib/.libs
>>> -L/home/hargrove/OMPI/openmpi-1.4.5rc2-solaris10-sparcT2-ss12u2/BLD/ompi/contrib/vt/vt/extlib/otf/otflib
>>> /home/hargrove/OMPI/openmpi-1.4.5rc2-solaris10-sparcT2-ss12u2/BLD/ompi/contrib/vt/vt/extlib/otf/otflib/.libs/libotf.a
>>> -lz -lsocket -lnsl -lrt -lm -lthread
>>> CC: Warning: Optimizer level changed from 0 to 3 to support
>>> parallelized code.
>>> Undefined first referenced
>>> symbol in file
>>> __mt_MasterFunction_cxt_ vtfilter-vt_tracefilter.o
>>> ld: fatal: Symbol referencing errors. No output written to vtfilter
>>> *** Error code 2
>> This is a lack of required Solaris patches and NOT an ompi or vt
>> problem to be solved.
>> However, as a result my two SPARC platforms are configured with
>> --disable-mpi-f77 --disable-mpi-f90
>> --with-contrib-vt-flags="--disable-omp --disable-hyb"
>> [It took a bit of work to figure out how to disable OMP and not just VT
>> in its entirety.]
>> I report this info just to note that my SPARC testing is "narrower"
>> than on my x86 and amd64 machines.
>>
>> The one "real" problem I found appears to be libtool related and
>> impacted all 4 platforms:
>> solaris-10 s10_69/sun4u
>> solaris-10 Generic_142901-03/i386
>> solaris-11 snv_151a/amd64 [including ofud, openib and dapl]
>> solaris-10 Generic_137111-07/sun4v
>> No problem with "make all" or with "make check", but "make install"
>> fails with:
>>> Making install in mpi/cxx
>>> make[2]: Entering directory
>>> `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
>>> make[3]: Entering directory
>>> `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
>>> test -z
>>> "/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib"
>>> || /usr/gnu/bin/mkdir -p
>>> "/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib"
>>> /bin/sh ../../../libtool --mode=install /usr/bin/ginstall -c
>>> 'libmpi_cxx.la'
>>> '/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib/libmpi_cxx.la'
>>> libtool: install: warning: relinking `libmpi_cxx.la'
>>> libtool: install: (cd
>>> /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx;
>>> /bin/sh
>>> /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/libtool
>>> --tag CXX --mode=relink sunCC -O -DNDEBUG -m64 -version-info 0:1:0
>>> -export-dynamic -o libmpi_cxx.la -rpath
>>> /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib
>>> mpicxx.lo intercepts.lo comm.lo datatype.lo win.lo file.lo
>>> ../../../ompi/libmpi.la -lsocket -lnsl -lm -lthread )
>>> mv: cannot stat `libmpi_cxx.so.0.0.1': No such file or directory
>>> libtool: install: error: relink `libmpi_cxx.la' with the above
>>> command before installing it
>>> make[3]: *** [install-libLTLIBRARIES] Error 1
>>> make[3]: Leaving directory
>>> `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
>>> make[2]: *** [install-am] Error 2
>>> make[2]: Leaving directory
>>> `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
>>> make[1]: *** [install-recursive] Error 1
>>> make[1]: Leaving directory
>>> `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi'
>>> make: *** [install-recursive] Error 1
>> No such problem was seen w/ the GNU compilers on the same 4 systems.
>> This looks to be a libtool bug in support for the Sun C++ compiler,
>> especially since configuring with "--enable-static --disable-shared"
>> eliminates this problem (but is undesirable for obvious reasons).
>> A quick peek appears to show some dangling symlinks:
>>> $ ls -l ompi/mpi/cxx/.libs/
>>> total 869
>>> -rw-r--r-- 1 phargrov staff 116944 2012-01-19 18:09 comm.o
>>> -rw-r--r-- 1 phargrov staff 41644 2012-01-19 18:09 datatype.o
>>> -rw-r--r-- 1 phargrov staff 17240 2012-01-19 18:09 file.o
>>> -rw-r--r-- 1 phargrov staff 222592 2012-01-19 18:09 intercepts.o
>>> lrwxrwxrwx 1 phargrov staff 16 2012-01-19 18:09 libmpi_cxx.la ->
>>> ../libmpi_cxx.la
>>> -rw-r--r-- 1 phargrov staff 1262 2012-01-19 18:09 libmpi_cxx.lai
>>> lrwxrwxrwx 1 phargrov staff 19 2012-01-19 18:09 libmpi_cxx.so ->
>>> libmpi_cxx.so.0.0.1
>>> lrwxrwxrwx 1 phargrov staff 19 2012-01-19 18:09 libmpi_cxx.so.0
>>> -> libmpi_cxx.so.0.0.1
>>> -rw-r--r-- 1 phargrov staff 267364 2012-01-19 18:09 mpicxx.o
>>> -rw-r--r-- 1 phargrov staff 46660 2012-01-19 18:09 win.o
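
The listing above shows libmpi_cxx.so and libmpi_cxx.so.0 pointing at a
libmpi_cxx.so.0.0.1 that is not present. A minimal sketch for spotting
such links without reading "ls -l" output by eye follows; the function
name is hypothetical and it assumes a POSIX-style find.

```shell
#!/bin/sh
# Sketch only: list symlinks under a directory whose target is missing.
# list_dangling_symlinks is a hypothetical helper name.
list_dangling_symlinks() {
    # -type l selects the symlinks themselves; "test -e" follows the
    # link, so it fails exactly when the target does not exist.
    find "$1" -type l ! -exec test -e {} \; -print
}

# Example, run from the top of the build tree:
#   list_dangling_symlinks ompi/mpi/cxx/.libs
```

Run right after "make all", any output for a .libs directory would
suggest that the shared library was never linked.
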
>> It is possible that the library build failed in "make all" w/o
>> terminating the build (I've seen such things before).
>> The initial evidence in the "make all" output does suggest no shared
>> lib was built:
>>> /bin/sh ../../../libtool --tag=CXX --mode=link sunCC -O -DNDEBUG
>>> -m64 -version-info 0:1:0 -export-dynamic -o libmpi_cxx.la -rpath
>>> /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-ss12u2/INST/lib mpicxx.lo
>>> intercepts.lo comm.lo datatype.lo win.lo file.lo
>>> ../../../ompi/libmpi.la -lsocket -lnsl -lm -lthread
>>> libtool: link: (cd ".libs" && rm -f "libmpi_cxx.so.0" && ln -s
>>> "libmpi_cxx.so.0.0.1" "libmpi_cxx.so.0")
>>> libtool: link: (cd ".libs" && rm -f "libmpi_cxx.so" && ln -s
>>> "libmpi_cxx.so.0.0.1" "libmpi_cxx.so")
>>> libtool: link: ( cd ".libs" && rm -f "libmpi_cxx.la" && ln -s
>>> "../libmpi_cxx.la" "libmpi_cxx.la" )
>> Note the lack of any suncc or sunCC command between the libtool
>> command line and the "rm && ln" commands.
>> Inspecting the configure-generated libtool confirms what looks like
>> improper config for tag=CXX:
>>> $ grep -e "BEGIN LIBTOOL TAG CONFIG: [FC]" -e ^archive_cmds libtool
>>> archive_cmds="\$CC -G\${allow_undefined_flag} -h \$soname -o \$lib
>>> \$libobjs \$deplibs \$compiler_flags"
>>> # ### BEGIN LIBTOOL TAG CONFIG: CXX
>>> archive_cmds=""
>>> # ### BEGIN LIBTOOL TAG CONFIG: F77
>>> archive_cmds="\$CC -G\${allow_undefined_flag} -h \$soname -o \$lib
>>> \$libobjs \$deplibs \$compiler_flags"
>>> # ### BEGIN LIBTOOL TAG CONFIG: FC
>>> archive_cmds="\$CC -G\${allow_undefined_flag} -h \$soname -o \$lib
>>> \$libobjs \$deplibs \$compiler_flags"
>> I'll be happy to provide all or part of config.log to Ralf W. or
>> other interested persons to debug this.
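
The grep above can be turned into a check that names the affected tag
directly. This is a sketch only: the function name is hypothetical, it
assumes a POSIX awk (nawk or gawk, not the oldest /usr/bin/awk), and it
assumes the generated libtool script uses the BEGIN/END tag markers in
the form shown in the grep output.

```shell
#!/bin/sh
# Sketch only: report libtool tag sections whose archive_cmds is empty.
# empty_archive_cmds_tags is a hypothetical helper name.
empty_archive_cmds_tags() {
    awk '
        # Remember which tag section we are in; the tag name is the
        # last field of the marker line.
        /^# ### BEGIN LIBTOOL TAG CONFIG: / { tag = $NF }
        /^# ### END LIBTOOL TAG CONFIG: /   { tag = "" }
        # An empty archive_cmds means libtool has no command for
        # building a shared library in that section.
        /^archive_cmds=""$/ {
            if (tag == "") print "(default)"; else print tag
        }
    ' "$1"
}

# Example, run from the top of the build tree after configure:
#   empty_archive_cmds_tags ./libtool
```

For the libtool script quoted above this would be expected to report
the CXX tag only, which matches the missing link step in "make all".
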
>>
>> I suppose I could have tried w/o C++ bindings instead of disabling
>> libtool, but with F77 and F90 bindings already disabled on the SPARCs
>> that didn't feel to me like a very good use of my time.
>>
>>
>> An additional annoyance on one platform:
>> solaris-10 Generic_142901-03/i386
>> Is additionally unhappy with the atomics, yielding the following
>> warnings for every file that includes atomic.h:
>>> "/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h",
>>> line 170: warning: impossible constraint for "%1" asm operand
>>> "/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h",
>>> line 170: warning: parameter in inline asm statement unused: %2
>>> "/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h",
>>> line 187: warning: impossible constraint for "%1" asm operand
>>> "/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h",
>>> line 187: warning: parameter in inline asm statement unused: %2
>> This is annoying, but "make check" passes all tests.
>>
>>
>> -Paul
>>
>>
>

-- 
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.dontje_at_[hidden] <mailto:terry.dontje_at_[hidden]>