Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Building Error
From: Larry Baker (baker_at_[hidden])
Date: 2011-08-16 17:10:09


Matthew,

orte_odls is a global variable defined in odls_base_open.c.

I used your configure options, but did not override the compiler or
compiler flags options. configure used gcc. odls_base_open.c gets
compiled and then the object gets inserted into libmca_odls.a. Later,
it looks like it also gets inserted into libopen-rte.0.dylib. The
link step to create orte-clean references libopen-rte.dylib:

> gcc:
>
> /bin/sh ../../../libtool --tag=CC --mode=link gcc -O3 -DNDEBUG
> -finline-functions -fno-strict-aliasing -fvisibility=hidden
> -export-dynamic -o orte-clean orte-clean.o ../../../orte/libopen-rte.la
> -lutil
>
> libtool: link: gcc -O3 -DNDEBUG -finline-functions
> -fno-strict-aliasing -fvisibility=hidden -o .libs/orte-clean
> orte-clean.o ../../../orte/.libs/libopen-rte.dylib
> /Users/baker/Desktop/Software/OpenMPI/1.4.3/openmpi-1.4.3/opal/.libs/libopen-pal.dylib
> -lutil

Your link step does not; it references a static version of libopen-rte:

> pgcc:
>
> /bin/sh ../../../libtool --tag=CC --mode=link
> /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend -V
> -export-dynamic -o orte-clean orte-clean.o ../../../orte/libopen-rte.la
> -lutil
>
> libtool: link: /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2
> -Msignextend -V -o orte-clean orte-clean.o
> ../../../orte/.libs/libopen-rte.a
> /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
>
> Undefined symbols for architecture x86_64:
> "_orte_odls", referenced from:
> _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
> ld: symbol(s) not found for architecture x86_64
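
A quick way to see that difference at a glance is to tag each library on a link line by its extension. This is only a sketch: list_link_libs is a hypothetical helper name, and the example line is shortened from the pgcc link step quoted above.

```shell
#!/bin/sh
# list_link_libs (hypothetical helper): read a link command on stdin and
# print each library file it names, tagged "static" (.a) or "shared"
# (.dylib or .so). Flags such as -lutil are ignored.
list_link_libs() {
    tr -s ' ' '\n' | while IFS= read -r word; do
        case "$word" in
            *.a)           printf 'static %s\n' "$word" ;;
            *.dylib|*.so)  printf 'shared %s\n' "$word" ;;
        esac
    done
}

# Shortened from the pgcc link step in this thread.
echo "pgcc -o orte-clean orte-clean.o ../../../orte/.libs/libopen-rte.a -lutil" |
    list_link_libs
```

Fed the pgcc line, this reports libopen-rte.a as static; fed the gcc line, it would report the .dylib files as shared.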

I will try to configure my setup to use static libraries and see what
changes.
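
A minimal sketch of that experiment, assuming the standard libtool switches --enable-static and --disable-shared are enough to reproduce the static link. The other options are the ones from the original report quoted further down; CONFIGURE_ARGS is just an illustrative variable name.

```shell
#!/bin/sh
# Options from the original report, plus the standard libtool switches
# that request static archives only. Whether this reproduces the failing
# libopen-rte.a link with gcc is an assumption to be tested.
CONFIGURE_ARGS="--prefix=/usr/local/openmpi_pg --enable-mpi-f77 --enable-mpi-f90 --with-memory-manager=none --enable-static --disable-shared"

# Print the command for review; run it by hand from the top of the
# Open MPI source tree.
echo "./configure $CONFIGURE_ARGS"
```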

I think the experiment with -search_paths_first was a red herring.
More likely, either odls_base_open.o is not in libopen-rte.a for some
reason, or the external name that gets defined in odls_base_open.c is
not the same as the external name being referenced in errmgr_base_fns.c.
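
Both possibilities can be checked directly on the archive. The sketch below assumes ar and nm are available and that the archive sits at orte/.libs/libopen-rte.a in the build tree, as in the link lines above. symbol_types is a hypothetical helper, and the sample input is made up to show nm's layout; it is not output from a real build.

```shell
#!/bin/sh
# Is the object file a member of the archive at all?
#   ar -t orte/.libs/libopen-rte.a | grep odls_base_open
# Which names containing "odls" does the archive define or reference?
#   nm orte/.libs/libopen-rte.a | grep -i odls

# symbol_types SYMBOL (hypothetical helper): read nm output on stdin and
# print the distinct type letters recorded for SYMBOL, one per line.
# U = undefined reference, C = common, T/D/B/S = defined in some section.
symbol_types() {
    awk -v sym="$1" '$NF == sym { print $(NF-1) }' | sort -u
}

# Made-up sample in nm's "address type name" layout, for illustration only.
symbol_types _demo_global <<'EOF'
                 U _demo_global
0000000000000010 T _demo_function
EOF
```

With the real archive, the call would be `nm orte/.libs/libopen-rte.a | symbol_types _orte_odls`. If only U comes back, nothing in the archive defines the symbol under that name; a C would mean it is recorded as a common (uninitialized) symbol rather than in a data section.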

Larry Baker
US Geological Survey
650-329-5608
baker_at_[hidden]

On 16 Aug 2011, at 11:53 AM, Matthew Russell wrote:

> Hi Larry,
>
> Thank you for your interest.
>
> I believe your solution is the right one; however, I think there are
> some other issues causing problems too.
>
> When I add the search_paths_first flag to my configure, the command
> that breaks in the Makefile is,
>
> libtool: link: /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2
> -Msignextend -V -search_paths_first -o orte-clean orte-clean.o
> ../../../orte/.libs/libopen-rte.a
> /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
> pgcc-Error-Unknown switch: -search_paths_first
>
> pgcc 10.9-0 64-bit target on Apple OS/X -tp nehalem-64
> Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
> Copyright 2000-2010, STMicroelectronics, Inc. All Rights Reserved.
> make: *** [orte-clean] Error 1
>
> The problem there is that libtool isn't passing the "-Wl," prefix
> along with the -search_paths_first flag, so it isn't getting to the
> linker. If I try to manually build it, I still have missing symbols:
>
> matt_at_pontus:orte-clean$ pgcc -DNDEBUG -O2 -Msignextend -V
> -Wl,-search_paths_first -o orte-clean orte-clean.o
> ../../../orte/.libs/libopen-rte.a
> /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
>
> pgcc 10.9-0 64-bit target on Apple OS/X -tp nehalem-64
> Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
> Copyright 2000-2010, STMicroelectronics, Inc. All Rights Reserved.
> Undefined symbols for architecture x86_64:
> "_orte_odls", referenced from:
> _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
> ld: symbol(s) not found for architecture x86_64
>
>
>
> On Tue, Aug 16, 2011 at 2:46 PM, Larry Baker <baker_at_[hidden]> wrote:
> Matthew,
>
> What configure options did you use?
>
> I can try to replicate your findings, as best I can, using the Intel
> compiler on my desktop Mac (Leopard). One thing I want to
> investigate is which libutil is supposed to be linked. There is no
> -L in the failing link step. Is that possibly the error?
>
> I have PGI and about five other compilers on our cluster. I'll get
> to OpenMPI 1.4.3 with all those as soon as I fetch the latest
> versions and reinstall my cluster software (Rocks just came out with
> 5.4.3).
>
> Larry Baker
> US Geological Survey
> 650-329-5608
> baker_at_[hidden]
>
> On 16 Aug 2011, at 9:44 AM, Matthew Russell wrote:
>
>> Hmm, I tried the recommendation above, adding
>> -Wl,-search_paths_first, and I still ran into the same issue. I suspect
>> it is an issue with PGI.
>>
>> Meanwhile, I've been able to get my applications (CMAQ) working
>> with MPICH2, so for now at least I am going to continue with that.
>>
>> Thanks for the responses!
>>
>> On Mon, Aug 15, 2011 at 8:43 PM, Ralph Castain <rhc_at_[hidden]>
>> wrote:
>> FWIW: I build OMPI on Mac OS-X (Snow Leopard) every day, without
>> adding any extra flags, without problem. The citation below relates
>> to something from a long time ago, I believe - haven't seen that
>> problem in quite some time.
>>
>> I do not, however, use PGI. We regularly have problems with PGI on
>> a variety of systems, and I suspect you are hitting one here - but
>> can't confirm it as we don't have PGI licenses to use for testing.
>>
>> The Xgrid support is broken, but has nothing to do with the problem
>> you describe. Just means you can't launch via Xgrid.
>>
>>
>>
>> On Aug 15, 2011, at 2:53 PM, Larry Baker wrote:
>>
>>> Matthew,
>>>
>>> I have the same type of error on a completely different software
>>> package on Mac OS X. The error occurs because of the way that Mac
>>> OS X searches for -lutil. If the libutil.a ORTE needs is theirs,
>>> i.e., not the system libutil.dylib, then you have exactly the same
>>> problem I did.
>>>
>>> Here are my notes for the fix using gcc. You will have to find
>>> out the equivalent method to pass the -search_paths_first linker
>>> option using pgcc.
>>>
>>>> # Mac OS X searches for shared libraries before static libraries.
>>>> # Thus, -L<ours> -lutil finds the system libutil.dylib before our
>>>> # libutil.a, which causes undefined references in the link step
>>>> # because it is using the wrong library. The ld -search_paths_first
>>>> # option forces ld to search each directory first for a matching
>>>> # library, instead of all directories first for a shared library.
>>>> # Note: this is the form to pass -search_paths_first to ld when
>>>> # $(CC) is the linker command in makefile.ux
>>>> export LDFLAGS=-Wl,-search_paths_first
>>>
>>> Larry Baker
>>> US Geological Survey
>>> 650-329-5608
>>> baker_at_[hidden]
>>>
>>> On 15 Aug 2011, at 1:01 PM, Matthew Russell wrote:
>>>
>>>>
>>>>
>>>> I hope this problem merits being posted here.
>>>>
>>>> On OS X (Snow Leopard, and Lion), I cannot seem to build Open MPI.
>>>>
>>>> After a lot of building, I get the error:
>>>>
>>>> /bin/sh ../../../libtool --tag=CC --mode=link
>>>> /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend -V
>>>> -export-dynamic -o orte-clean orte-clean.o ../../../orte/libopen-rte.la
>>>> -lutil
>>>> libtool: link: /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2
>>>> -Msignextend -V -o orte-clean orte-clean.o
>>>> ../../../orte/.libs/libopen-rte.a
>>>> /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
>>>> Undefined symbols for architecture x86_64:
>>>> "_orte_odls", referenced from:
>>>> _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
>>>> ld: symbol(s) not found for architecture x86_64
>>>>
>>>> This is with the PGI 10.9 compiler and OpenMPI 1.4.3; the platform
>>>> is x86_64.
>>>>
>>>> The README does not list PGI as a compiler that OpenMPI was
>>>> tested with, and there are notes about its support for XGrid
>>>> being broken (I'm not sure if this is related).
>>>>
>>>> I seem to get the error regardless of which configure flags I'm
>>>> using. For completeness, though, here are the flags I am using:
>>>> ./configure --prefix=/usr/local/openmpi_pg --enable-mpi-f77
>>>> --enable-mpi-f90 --with-memory-manager=none
>>>>
>>>> Has anyone else got or fixed this error?
>>>>
>>>> I looked at other postings in this list, such as http://www.open-mpi.org/community/lists/devel/2007/05/1590.php
>>>> , but they didn't help much.
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel