On 1/29/2012 7:40 PM, Paul Hargrove wrote:
I can additionally report success w/ ILP32 builds with both SS12.2 and 12.3 compilers on x86-64 and sun4v systems running Solaris and x86-64/Linux:
   solaris-10 Generic_137111-07/sun4v (*FLAGS="-m32 -xarch=sparc" for v8plus ABI)
   solaris-11 snv_151a/amd64 [incl. ofud, openib and dapl]  (*FLAGS=-m32)
   linux/x86-64 (*FLAGS=-m32)

On Linux I had to set "LD_LIBRARY_PATH=:/lib32:/usr/lib32", but that seems to be a Solaris Studio issue, rather than an OMPI or libtool one.  That was NOT necessary to get an ILP32 build using GCC.

This sounds more like a runpath (mis)setting to me.  Can you send me your config.log and a copy of your make output?  Did you run into the same issue with -m64?
My sun4u (single-CPU UltraSparcIII) system is just too painfully slow to test yet again.

I'd imagine so :-).

Thanks,

--td
-Paul

On Wed, Jan 25, 2012 at 11:49 PM, Paul H. Hargrove <PHHargrove@lbl.gov> wrote:
I am pleased to report that w/ help from Terry I can now build nearly everything w/ the Solaris Studio 12.2 and 12.3 compilers.
Upon comparing our build environments we discovered that CXX=CC works but CXX=sunCC does not, even though they are both symlinks to the same compiler executable.  I still don't know the root cause (though libtool and its associated configure logic are still the obvious suspects), but the workaround is simple:
   When using the Solaris Studio compilers on Solaris, one must set CXX=CC rather than  CXX=sunCC.

So I am following that advice, and have additionally:
+ gotten up-to-date patches applied to resolve my FORTRAN and OMP issues on the SPARC-T2 system.
+ installed both 12.2 and 12.3 compilers on Linux/x86-64

So, I can now report the following ALL work (defined as "make all check install") with BOTH 12.2 and 12.3 Solaris Studio compilers.
The only configure flags are --prefix, setting the CC, CXX, F77 and FC variables, and (when appropriate) setting *FLAGS=-m64.
   solaris-10 s10_69/sun4u (w/ -m64)
   solaris-10 Generic_137111-07/sun4v (w/ -m64)
   solaris-11 snv_151a/amd64 [including ofud, openib and dapl] (w/ -m64)
   linux/x86-64 (no -m64 needed because it is the default)
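As a rough sketch of the configure line described above, the helper below only prints a command rather than running it. The function name studio_configure_cmd, the example prefix, and the cc/f77/f90 driver names are assumptions for illustration; only CXX=CC and the -m64/-m32 flags come from this thread. It also assumes FFLAGS is the variable that carries the F77 flags.

```shell
# Sketch: compose (but do not run) a configure command for the Studio compilers.
# studio_configure_cmd is a hypothetical helper; CC=cc, F77=f77 and FC=f90
# are assumed driver names. CXX=CC is the setting reported to work.
studio_configure_cmd() {
    # $1: install prefix; $2: optional ABI flag such as -m64 or -m32
    cmd="./configure --prefix=$1 CC=cc CXX=CC F77=f77 FC=f90"
    if [ -n "${2:-}" ]; then
        # Apply the same ABI flag to every language's flags variable
        for v in CFLAGS CXXFLAGS FFLAGS FCFLAGS; do
            cmd="$cmd $v=$2"
        done
    fi
    printf '%s\n' "$cmd"
}
```

For example, `studio_configure_cmd /opt/ompi -m64` prints a line that can be reviewed and then pasted into the build directory.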

The following works w/ the 12.2 compilers:
   solaris-10 Generic_142901-03/i386
However, the f77/f90 compilers in 12.3 are generating code using SSE2 instructions even when passed -xarch=pentium_pro.
So this machine cannot run the resulting executables.  So, I had to --disable-mpi-f77 to get things to work.
That, however, is NOT an OMPI problem.

-Paul

On 1/19/2012 11:21 PM, Paul H. Hargrove wrote:
As promised earlier today, here are results from my Solaris platforms.
Note that there are libtool-related failures below that may be worth pursuing.
If necessary, access to most of my machines can be arranged for qualified persons.

== GNU compilers with {C,CXX,F77,FC}FLAGS=-mcpu=v9 on SPARCs, and -m64 on amd64

PASS:
   solaris-10 s10_69/sun4u (w/ g77, no FC)
   solaris-10 Generic_142901-03/i386 (w/ Sun's f77 and f95, both dated April 2009)
   solaris-11 snv_151a/amd64 [including ofud, openib and dapl] (w/ g77, no FC)

FAIL:
   solaris-10 Generic_137111-07/sun4v with default GNU compilers
Using the system default gcc, which is actually Sun's gccfss-4.0.4, there are assertion failures seen in the atomics in "make check".  I can provide details if anybody cares, but I know from past experience that support for gcc-style inline asm is marginal in this compiler.

== Sun Studio 12.2 compilers w/ {C,CXX,F77,FC}FLAGS=-m64 on SPARCs and amd64

Both of my SPARC systems appear to have an out-of-date libmtsk.so, which both prevents the Sun f77 and f90 compilers from running at all, and additionally leads to failures like the following when building OpenMP support in VT:
/bin/bash ../../libtool --tag=CXX    --mode=link sunCC -xopenmp -DVT_OMP  -m64 -xopenmp  -o vtfilter vtfilter-vt_filter.o  vtfilter-vt_filthandler.o  vtfilter-vt_otfhandler.o  vtfilter-vt_tracefilter.o ../../util/util.o  -L../../extlib/otf/otflib -L../../extlib/otf/otflib/.libs -lotf  -lz -lsocket -lnsl  -lrt -lm -lthread
libtool: link: sunCC -xopenmp -DVT_OMP -m64 -xopenmp -o vtfilter vtfilter-vt_filter.o vtfilter-vt_filthandler.o vtfilter-vt_otfhandler.o vtfilter-vt_tracefilter.o ../../util/util.o  -L/home/hargrove/OMPI/openmpi-1.4.5rc2-solaris10-sparcT2-ss12u2/BLD/ompi/contrib/vt/vt/extlib/otf/otflib/.libs -L/home/hargrove/OMPI/openmpi-1.4.5rc2-solaris10-sparcT2-ss12u2/BLD/ompi/contrib/vt/vt/extlib/otf/otflib /home/hargrove/OMPI/openmpi-1.4.5rc2-solaris10-sparcT2-ss12u2/BLD/ompi/contrib/vt/vt/extlib/otf/otflib/.libs/libotf.a -lz -lsocket -lnsl -lrt -lm -lthread
CC: Warning: Optimizer level changed from 0 to 3 to support parallelized code.
Undefined                       first referenced
 symbol                             in file
__mt_MasterFunction_cxt_            vtfilter-vt_tracefilter.o
ld: fatal: Symbol referencing errors. No output written to vtfilter
*** Error code 2
This is a lack of required Solaris patches and NOT an ompi or vt problem to be solved.
However, as a result my two SPARC platforms are configured with
  --disable-mpi-f77 --disable-mpi-f90 --with-contrib-vt-flags="--disable-omp --disable-hyb"
[It took a bit of work to figure out how to disable OMP and not just VT in its entirety.]
I report this info just to note that my SPARC testing is "narrower" than on my x86 and amd64 machines.

The one "real" problem I found appears to be libtool related and impacted all 4 platforms:
   solaris-10 s10_69/sun4u
   solaris-10 Generic_142901-03/i386
   solaris-11 snv_151a/amd64 [including ofud, openib and dapl]
   solaris-10 Generic_137111-07/sun4v
No problem with "make all" or with "make check", but "make install" fails with:
Making install in mpi/cxx
make[2]: Entering directory `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
make[3]: Entering directory `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
test -z "/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib" || /usr/gnu/bin/mkdir -p "/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib"
 /bin/sh ../../../libtool   --mode=install /usr/bin/ginstall -c  'libmpi_cxx.la' '/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib/libmpi_cxx.la'
libtool: install: warning: relinking `libmpi_cxx.la'
libtool: install: (cd /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx; /bin/sh /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/libtool  --tag CXX --mode=relink sunCC -O -DNDEBUG -m64 -version-info 0:1:0 -export-dynamic -o libmpi_cxx.la -rpath /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/INST/lib mpicxx.lo intercepts.lo comm.lo datatype.lo win.lo file.lo ../../../ompi/libmpi.la -lsocket -lnsl -lm -lthread )
mv: cannot stat `libmpi_cxx.so.0.0.1': No such file or directory
libtool: install: error: relink `libmpi_cxx.la' with the above command before installing it
make[3]: *** [install-libLTLIBRARIES] Error 1
make[3]: Leaving directory `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
make[2]: *** [install-am] Error 2
make[2]: Leaving directory `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi/mpi/cxx'
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory `/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-suncc/BLD/ompi'
make: *** [install-recursive] Error 1
No such problem was seen w/ the GNU compilers on the same 4 systems.
This looks to be a libtool bug in support for the Sun C++ compiler, especially since configuring with "--enable-static --disable-shared" eliminates this problem (but is undesirable for obvious reasons).
A quick peek appears to show some dangling symlinks:
$ ls -l ompi/mpi/cxx/.libs/
total 869
-rw-r--r-- 1 phargrov staff 116944 2012-01-19 18:09 comm.o
-rw-r--r-- 1 phargrov staff  41644 2012-01-19 18:09 datatype.o
-rw-r--r-- 1 phargrov staff  17240 2012-01-19 18:09 file.o
-rw-r--r-- 1 phargrov staff 222592 2012-01-19 18:09 intercepts.o
lrwxrwxrwx 1 phargrov staff     16 2012-01-19 18:09 libmpi_cxx.la -> ../libmpi_cxx.la
-rw-r--r-- 1 phargrov staff   1262 2012-01-19 18:09 libmpi_cxx.lai
lrwxrwxrwx 1 phargrov staff     19 2012-01-19 18:09 libmpi_cxx.so -> libmpi_cxx.so.0.0.1
lrwxrwxrwx 1 phargrov staff     19 2012-01-19 18:09 libmpi_cxx.so.0 -> libmpi_cxx.so.0.0.1
-rw-r--r-- 1 phargrov staff 267364 2012-01-19 18:09 mpicxx.o
-rw-r--r-- 1 phargrov staff  46660 2012-01-19 18:09 win.o
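The dangling links in the listing above can be confirmed mechanically. This is a minimal sketch that assumes a POSIX-conforming shell (the `-L` and `-e` tests may not be available in an older Bourne /bin/sh); find_dangling_links is a hypothetical helper name, not something shipped with libtool or OMPI.

```shell
# Sketch: print symlinks in a directory whose targets do not exist.
# find_dangling_links is a hypothetical helper name.
find_dangling_links() {
    # $1: directory to scan, e.g. ompi/mpi/cxx/.libs
    for f in "$1"/*; do
        # -L is true for a symlink; -e follows the link and fails
        # when the target is missing, so both together mean "dangling"
        if [ -L "$f" ] && [ ! -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running `find_dangling_links ompi/mpi/cxx/.libs` in the build tree should list libmpi_cxx.so and libmpi_cxx.so.0 if libmpi_cxx.so.0.0.1 was never created.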
It is possible that the library build failed in "make all" w/o terminating the build (I've seen such things before).
The initial evidence in the "make all" output does suggest no shared lib was built:
/bin/sh ../../../libtool --tag=CXX   --mode=link sunCC  -O -DNDEBUG -m64  -version-info 0:1:0 -export-dynamic   -o libmpi_cxx.la -rpath /home/phargrov/OMPI/openmpi-1.4.5rc2-solaris11-x64-ib-ss12u2/INST/lib mpicxx.lo intercepts.lo comm.lo datatype.lo win.lo file.lo ../../../ompi/libmpi.la -lsocket -lnsl  -lm -lthread
libtool: link: (cd ".libs" && rm -f "libmpi_cxx.so.0" && ln -s "libmpi_cxx.so.0.0.1" "libmpi_cxx.so.0")
libtool: link: (cd ".libs" && rm -f "libmpi_cxx.so" && ln -s "libmpi_cxx.so.0.0.1" "libmpi_cxx.so")
libtool: link: ( cd ".libs" && rm -f "libmpi_cxx.la" && ln -s "../libmpi_cxx.la" "libmpi_cxx.la" )
Note the lack of any suncc or sunCC command between the libtool command line and the "rm && ln" commands.
Inspecting the configure-generated libtool confirms what looks like improper config for tag=CXX:
$ grep -e "BEGIN LIBTOOL TAG CONFIG: [FC]" -e ^archive_cmds libtool
archive_cmds="\$CC -G\${allow_undefined_flag} -h \$soname -o \$lib \$libobjs \$deplibs \$compiler_flags"
# ### BEGIN LIBTOOL TAG CONFIG: CXX
archive_cmds=""
# ### BEGIN LIBTOOL TAG CONFIG: F77
archive_cmds="\$CC -G\${allow_undefined_flag} -h \$soname -o \$lib \$libobjs \$deplibs \$compiler_flags"
# ### BEGIN LIBTOOL TAG CONFIG: FC
archive_cmds="\$CC -G\${allow_undefined_flag} -h \$soname -o \$lib \$libobjs \$deplibs \$compiler_flags"
I'll be happy to provide all or part of config.log to Ralf W. or other interested persons to debug this.
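The grep check above can be turned into a small scan that names each tag whose archive_cmds is empty. This is only a sketch: check_archive_cmds is a hypothetical helper, and it assumes the generated libtool script uses the "# ### BEGIN LIBTOOL TAG CONFIG: <tag>" markers exactly as shown in the output above.

```shell
# Sketch: report libtool tag sections whose archive_cmds is empty.
# check_archive_cmds is a hypothetical helper name, not part of libtool.
check_archive_cmds() {
    # $1: path to the configure-generated libtool script
    awk '
        # Remember the most recent tag name (last field of the marker line)
        /^# ### BEGIN LIBTOOL TAG CONFIG: / { tag = $NF }
        # An empty archive_cmds means no shared-library link command exists
        /^archive_cmds=""$/ {
            print (tag == "" ? "default" : tag)
        }
    ' "$1"
}
```

With the libtool script quoted above, `check_archive_cmds libtool` would be expected to print only CXX; a line reading "default" would indicate the untagged (C) section is affected as well.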

I suppose I could have tried w/o C++ bindings instead of disabling shared libraries, but with F77 and F90 bindings already disabled on the SPARCs that didn't feel to me like a very good use of my time.


An additional annoyance on one platform:
   solaris-10 Generic_142901-03/i386
is additionally unhappy with the atomics, yielding the following warnings for every file that includes atomic.h:
"/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h", line 170: warning: impossible constraint for "%1" asm operand
"/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h", line 170: warning: parameter in inline asm statement unused: %2
"/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h", line 187: warning: impossible constraint for "%1" asm operand
"/export/home/phargrov/OMPI/openmpi-1.4.5rc2-solaris10-x86-ss12u2//openmpi-1.4.5rc2/opal/include/opal/sys/ia32/atomic.h", line 187: warning: parameter in inline asm statement unused: %2
This is annoying, but "make check" passes all tests.


-Paul



--
Paul H. Hargrove                          PHHargrove@lbl.gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900

_______________________________________________
devel mailing list
devel@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.dontje@oracle.com