Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] r27078 and OMPI build
From: Paul Hargrove (phhargrove_at_[hidden])
Date: 2012-08-24 19:55:50


OK, I have a vanilla configure+make running on both SPARC/Solaris-10 and
AMD64/Solaris-11.
I am using the 12.3 Oracle compilers in both cases to match the original
report.
I'll post the results when they complete.

In the meantime, I took a quick look at the code and have a pretty
reasonable guess as to the cause.
Looking at ompi/mca/coll/ml/coll_ml.h I see:

   827 int mca_coll_ml_memsync_intra(mca_coll_ml_module_t *module, int bank_index);
[...]
   996 static inline __opal_attribute_always_inline__
   997 int mca_coll_ml_buffer_recycling(mca_coll_ml_collective_operation_progress_t *ml_request)
   998 {
[...]
  1023 rc = mca_coll_ml_memsync_intra(ml_module, ml_memblock->memsync_counter);
[...]
  1041 }

Based on past experience with the Sun/Oracle compilers on another project
(see http://bugzilla.hcs.ufl.edu/cgi-bin/bugzilla3/show_bug.cgi?id=193 ), I
suspect that the compiler emits this static-inline-always function into
every object that includes this header, even objects that never call it.
The call on line 1023 then produces the undefined reference to
mca_coll_ml_memsync_intra. In short, with these compilers it is not safe for
an inline function in a header to call an extern function unless that
function is available to every object that includes the header, REGARDLESS
of whether the object invokes the inline function.
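
To make the pattern concrete, here is a minimal sketch. All names in it
(demo.h, demo.c, demo_memsync, demo_buffer_recycling) are hypothetical and
are not taken from the OMPI tree, and the two "files" are concatenated into
one translation unit only so the sketch compiles on its own. It shows the
suspect pattern next to one possible way around it, which is to keep only a
declaration in the header and put the body in the same source file as the
extern function it calls. I am not claiming this is the right fix for
coll/ml, only that it avoids the dependency described above.

```c
/* ---- demo.h: hypothetical header included by many objects ---- */

/* Extern function that only one component actually defines. */
int demo_memsync(int bank_index);

/* Suspect pattern: if the compiler emits this body into every object that
 * includes the header, each such object gains a reference to
 * demo_memsync(), whether or not it ever calls this function. */
static inline int demo_buffer_recycling_inline(int counter)
{
    return demo_memsync(counter);
}

/* Possible alternative: the header carries only a declaration, so including
 * it creates no reference to demo_memsync(). */
int demo_buffer_recycling(int counter);

/* ---- demo.c: the one component that provides demo_memsync() ---- */

int demo_memsync(int bank_index)
{
    return bank_index + 1;   /* stand-in for the real work */
}

/* The body lives beside the function it calls, so the reference is always
 * resolved within this object. */
int demo_buffer_recycling(int counter)
{
    return demo_memsync(counter);
}
```

The cost of the out-of-line version is a real function call where the
inline version might have been expanded in place.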

-Paul

On Fri, Aug 24, 2012 at 4:40 PM, Ralph Castain <rhc_at_[hidden]> wrote:

> Oracle uses an abysmally complicated configure line, but nearly all of it
> is irrelevant to the problem here. For this, I would suggest just doing a
> vanilla ./configure - if the component gets pulled into libmpi, then we
> know there is a problem.
>
> Thanks!
>
> Just FYI: here is their actual configure line, just in case you spot
> something problematic:
>
> CC=cc CXX=CC F77=f77 FC=f90 --with-openib --enable-openib-connectx-xrc --without-udapl
> --disable-openib-ibcm --enable-btl-openib-failover --without-dtrace --enable-heterogeneous
> --enable-cxx-exceptions --enable-shared --enable-orterun-prefix-by-default --with-sge
> --enable-mpi-f90 --with-mpi-f90-size=small --disable-peruse --disable-state
> --disable-mpi-thread-multiple --disable-debug --disable-mem-debug --disable-mem-profile
> CFLAGS="-xtarget=ultra3 -m32 -xarch=sparcvis2 -xprefetch -xprefetch_level=2 -xvector=lib -Qoption
> cg -xregs=no%appl -xdepend=yes -xbuiltin=%all -xO5" CXXFLAGS="-xtarget=ultra3 -m32
> -xarch=sparcvis2 -xprefetch -xprefetch_level=2 -xvector=lib -Qoption cg -xregs=no%appl -xdepend=yes
> -xbuiltin=%all -xO5 -Bstatic -lCrun -lCstd -Bdynamic" FFLAGS="-xtarget=ultra3 -m32 -xarch=sparcvis2
> -xprefetch -xprefetch_level=2 -xvector=lib -Qoption cg -xregs=no%appl -stackvar -xO5"
> FCFLAGS="-xtarget=ultra3 -m32 -xarch=sparcvis2 -xprefetch -xprefetch_level=2 -xvector=lib -Qoption
> cg -xregs=no%appl -stackvar -xO5"
> --prefix=/workspace/euloh/hpc/mtt-scratch/burl-ct-t2k-3/ompi-tarball-testing/installs/JA08/install
> --mandir=${prefix}/man --bindir=${prefix}/bin --libdir=${prefix}/lib
> --includedir=${prefix}/include --with-tm=/ws/ompi-tools/orte/torque/current/shared-install32
> --enable-contrib-no-build=vt --with-package-string="Oracle Message Passing Toolkit "
> --with-ident-string="@(#)RELEASE VERSION 1.9openmpi-1.5.4-r1.9a1r27092"
>
>
> and the error he gets is:
>
> make[2]: Entering directory
> `/workspace/euloh/hpc/mtt-scratch/burl-ct-t2k-3/ompi-tarball-testing/mpi-install/s3rI/src/openmpi-1.9a1r27092/ompi/tools/ompi_info'
> CCLD ompi_info
> Undefined                       first referenced
>  symbol                             in file
> mca_coll_ml_memsync_intra           ../../../ompi/.libs/libmpi.so
> ld: fatal: symbol referencing errors. No output written to .libs/ompi_info
> make[2]: *** [ompi_info] Error 2
> make[2]: Leaving directory
> `/workspace/euloh/hpc/mtt-scratch/burl-ct-t2k-3/ompi-tarball-testing/mpi-install/s3rI/src/openmpi-1.9a1r27092/ompi/tools/ompi_info'
> make[1]: *** [install-recursive] Error 1
> make[1]: Leaving directory
> `/workspace/euloh/hpc/mtt-scratch/burl-ct-t2k-3/ompi-tarball-testing/mpi-install/s3rI/src/openmpi-1.9a1r27092/ompi'
> make: *** [install-recursive] Error 1
>
>
> On Aug 24, 2012, at 4:30 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>
> I have access to a few different Solaris machines and can offer to build
> the trunk if somebody tells me what configure flags are desired.
>
> -Paul
>
> On Fri, Aug 24, 2012 at 8:54 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>
>> Eugene - can you confirm that this is only happening on the one Solaris
>> system? In other words, is this a general issue or something specific to
>> that one machine?
>>
>> I'm wondering because if it is just the one machine, then it might be
>> something strange about how it is set up - perhaps the version of Solaris,
>> or it is configuring --enable-static, or...
>>
>> Just trying to assess how general a problem this might be, and thus if
>> this should be a blocker or not.
>>
>> On Aug 24, 2012, at 8:00 AM, Eugene Loh <eugene.loh_at_[hidden]> wrote:
>>
>> > On 08/24/12 09:54, Shamis, Pavel wrote:
>> >> Maybe there is a chance to get direct access to this system ?
>> > No.
>> >
>> > But I'm attaching compressed log files from configure/make.
>> >
>> >
>> > <tarball-of-log-files.tar.bz2>
>> > _______________________________________________
>> > devel mailing list
>> > devel_at_[hidden]
>> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>
>
>
> --
> Paul H. Hargrove PHHargrove_at_[hidden]
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>
>

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900