Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Building Error
From: Matthew Russell (mrussel2_at_[hidden])
Date: 2011-08-17 13:01:34


Hi, I'm really grateful for the detailed responses.

I'll try the different approaches Larry suggested. Right now MPICH
seems to be satisfying my needs, so I have less time to devote to getting
OpenMPI working, but I am interested in having it working as an alternative
to MPICH.

Thanks!

On Tue, Aug 16, 2011 at 10:35 PM, Ralph Castain <rhc_at_[hidden]> wrote:

> Just an FYI. Disabling ORTE support is intended solely for systems that
> require no RTE assistance - e.g., Crays. Configuring without RTE support
> will generate something that cannot run on a Mac, which is why the build
> fails in that environment - it is looking for external RTE support that does
> not exist on the Mac. That configure option works fine on the intended
> targets.
>
> The declspec macro does indeed have visibility attributes - in fact, that
> is its sole purpose. You are welcome to try disabling visibility to see if
> that helps.
>
> The module definitions are actually identical, minus the visibility flags.
>
>
> On Aug 16, 2011, at 8:08 PM, Larry Baker wrote:
>
> Matthew,
>
> The best I can come up with is that somehow the declaration of
> external orte_odls in orte/mca/odls/odls.h
>
> ORTE_DECLSPEC extern orte_odls_base_module_t orte_odls; /* holds selected
> module's function pointers */
>
>
> does not exactly match the definition of orte_odls in
> orte/mca/odls/base/odls_base_open.c
>
> orte_odls_base_module_t orte_odls;
>
>
> ORTE_DECLSPEC might include some decorations having to do with the
> visibility attribute. Try adding --disable-visibility to your configure.
>
> Otherwise, I see in orte/mca/odls/base/odls_base_open.c that orte_odls is
> not defined if ORTE_DISABLE_FULL_SUPPORT == 1. I tried to compile
> with --without-rte-support to force #define ORTE_DISABLE_FULL_SUPPORT 1, but
> the make failed before it reached the link that failed for you. When
> --without-rte-support is requested in 1.4.3, there are declarations that
> depend on typedefs that are skipped, causing the make to fail. You may be
> encountering something subtle like that when configure deduces some behavior
> for pgcc and the code doesn't quite have the conditional compilation tests
> in the right place.
>
> You might try a newer version of OpenMPI, which might have fixed problems
> like --without-rte-support failing.
>
> Larry Baker
> US Geological Survey
> 650-329-5608
> baker_at_[hidden]
>
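One way to follow up on the declaration/definition question above is to ask `nm` how the symbol appears in the static archive. The following is a minimal sketch, not part of the original thread: the helper name `symbol_status` is hypothetical, and it assumes `nm` prints the symbol type letter immediately before the symbol name (as the BSD and GNU versions do by default).

```shell
# symbol_status SYMBOL
# Reads `nm` output on stdin and reports how SYMBOL appears in it:
#   defined   - at least one object defines it (any type other than U or C)
#   common    - it only appears as a common (tentative) definition, type C
#   undefined - it is only referenced, type U
#   absent    - it does not appear at all
symbol_status() {
    awk -v sym="$1" '
        NF >= 2 && $NF == sym {
            type = $(NF - 1)          # type letter precedes the symbol name
            if (type == "U")      undef = 1
            else if (type == "C") common = 1
            else                  defined = 1
        }
        END {
            if (defined)     print "defined"
            else if (common) print "common"
            else if (undef)  print "undefined"
            else             print "absent"
        }'
}

# Example (path taken from the failing link line in this thread):
#   nm ../../../orte/.libs/libopen-rte.a | symbol_status _orte_odls
```

A result of "undefined" or "absent" would suggest the defining object was never compiled into the archive. A result of "common" would point at an uninitialized global, which some linkers do not pull out of a static archive to satisfy a reference; whether that applies to this build is an assumption that would need checking.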
> On 16 Aug 2011, at 11:53 AM, Matthew Russell wrote:
>
> Hi Larry,
>
> Thank you for your interest.
>
> I believe your solution is the right one; however, I think there are some
> other issues causing problems too.
>
> When I add the search_paths_first flag to my configure, the command that
> breaks in the Makefile is,
>
> libtool: link: /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend -V
> -search_paths_first -o orte-clean orte-clean.o
> ../../../orte/.libs/libopen-rte.a
> /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
> pgcc-Error-Unknown switch: -search_paths_first
>
> pgcc 10.9-0 64-bit target on Apple OS/X -tp nehalem-64
> Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
> Copyright 2000-2010, STMicroelectronics, Inc. All Rights Reserved.
> make: *** [orte-clean] Error 1
>
> The problem there is that libtool isn't passing the "-Wl," prefix along with
> the -search_paths_first flag, so it isn't getting to the linker. If I try
> to manually build it, I still have missing symbols:
>
> matt_at_pontus:orte-clean$ pgcc -DNDEBUG -O2 -Msignextend -V
> -Wl,-search_paths_first -o orte-clean orte-clean.o
> ../../../orte/.libs/libopen-rte.a
> /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
>
> pgcc 10.9-0 64-bit target on Apple OS/X -tp nehalem-64
> Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
> Copyright 2000-2010, STMicroelectronics, Inc. All Rights Reserved.
> Undefined symbols for architecture x86_64:
> "_orte_odls", referenced from:
> _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
> ld: symbol(s) not found for architecture x86_64
>
>
>
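The "-Wl," issue described above can be sketched as a small helper that rewrites raw linker options into the form a compiler driver forwards to ld. This is only an illustration: `wl_wrap` is a hypothetical name, and it assumes the driver accepts the `-Wl,<option>` convention, as the manual pgcc command in this message appears to.

```shell
# wl_wrap OPTION...
# Prefixes each raw linker option with "-Wl," so that a compiler driver
# passes it through to the linker instead of rejecting it as an unknown switch.
wl_wrap() {
    out=""
    for opt in "$@"; do
        # ${out:+ } adds a separating space only after the first option
        out="$out${out:+ }-Wl,$opt"
    done
    printf '%s\n' "$out"
}

# Example: build an LDFLAGS value for configure.
#   LDFLAGS=$(wl_wrap -search_paths_first)
#   export LDFLAGS
```

Setting the flag through LDFLAGS in this form, rather than as a bare option, is what lets it survive the trip through the compiler driver.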
> On Tue, Aug 16, 2011 at 2:46 PM, Larry Baker <baker_at_[hidden]> wrote:
>
>> Matthew,
>>
>> What configure options did you use?
>>
>> I can try to replicate your findings, as best I can, using the Intel
>> compiler on my desktop Mac (Leopard). One thing I want to investigate is
>> which libutil is supposed to be linked. There is no -L in the failing link
>> step. Is that possibly the error?
>>
>> I have PGI and about five other compilers on our cluster. I'll get to
>> OpenMPI 1.4.3 with all those as soon as I fetch the latest versions and
>> reinstall my cluster software (Rocks just came out with 5.4.3).
>>
>> Larry Baker
>> US Geological Survey
>> 650-329-5608
>> baker_at_[hidden]
>>
>> On 16 Aug 2011, at 9:44 AM, Matthew Russell wrote:
>>
>> Hmm, I tried the recommendation above, adding -Wl,-search_paths_first, and
>> I still ran into the same issue. I suspect it is an issue with PGI.
>>
>> Meanwhile, I've been able to get my applications (CMAQ) working with
>> MPICH2, so for now at least I am going to continue with that.
>>
>> Thanks for the responses!
>>
>> On Mon, Aug 15, 2011 at 8:43 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>>> FWIW: I build OMPI on Mac OS-X (Snow Leopard) every day, without adding
>>> any extra flags, without problem. The citation below relates to something
>>> from a long time ago, I believe - haven't seen that problem in quite some
>>> time.
>>>
>>> I do not, however, use PGI. We regularly have problems with PGI on a
>>> variety of systems, and I suspect you are hitting one here - but can't
>>> confirm it as we don't have PGI licenses to use for testing.
>>>
>>> The Xgrid support is broken, but has nothing to do with the problem you
>>> describe. Just means you can't launch via Xgrid.
>>>
>>>
>>>
>>> On Aug 15, 2011, at 2:53 PM, Larry Baker wrote:
>>>
>>> Matthew,
>>>
>>> I have the same type of error on a completely different software package
>>> on Mac OS X. The error occurs because of the way that Mac OS X searches for
>>> -lutil. If the libutil.a ORTE needs is theirs, i.e., not the system
>>> libutil.dylib, then you have exactly the same problem I did.
>>>
>>> Here are my notes for the fix using gcc. You will have to find out the
>>> equivalent method to pass the -search_paths_first linker option using pgcc.
>>>
>>> # Mac OS X searches for shared libraries before static libraries. Thus,
>>> # -L<ours> -lutil finds the system libutil.dylib before our libutil.a,
>>> # which causes undefined references in the link step because it is using
>>> # the wrong library. The ld -search_paths_first option forces ld to search
>>> # each directory first for a matching library, instead of all directories
>>> # first for a shared library.
>>> # Note: this is the form to pass -search_paths_first to ld when $(CC) is
>>> # the linker command in makefile.ux
>>> export LDFLAGS=-Wl,-search_paths_first
>>>
>>>
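The two search orders described in the notes above can be imitated with a short shell function. This is a simplified model for illustration only, not how ld is implemented: `resolve_lib` and its mode names are hypothetical, and it considers only `.dylib` and `.a` files in the directories given.

```shell
# resolve_lib MODE NAME DIR...
# Prints the first library file a linker would pick for -lNAME.
#   MODE=dylibs_first : look for libNAME.dylib in every directory, and only
#                       then for libNAME.a in every directory
#   MODE=paths_first  : in each directory look for libNAME.dylib, then
#                       libNAME.a, before moving on to the next directory
resolve_lib() {
    mode=$1
    name=$2
    shift 2
    if [ "$mode" = "paths_first" ]; then
        for dir in "$@"; do
            for ext in dylib a; do
                if [ -f "$dir/lib$name.$ext" ]; then
                    printf '%s\n' "$dir/lib$name.$ext"
                    return 0
                fi
            done
        done
    else
        for ext in dylib a; do
            for dir in "$@"; do
                if [ -f "$dir/lib$name.$ext" ]; then
                    printf '%s\n' "$dir/lib$name.$ext"
                    return 0
                fi
            done
        done
    fi
    return 1
}

# Example: with ours/libutil.a and sys/libutil.dylib present,
#   resolve_lib dylibs_first util ours sys   picks sys/libutil.dylib
#   resolve_lib paths_first  util ours sys   picks ours/libutil.a
```

In this model the static library in the first directory wins only in `paths_first` mode, which is the behavior the `-search_paths_first` option is meant to select.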
>>> Larry Baker
>>> US Geological Survey
>>> 650-329-5608
>>> baker_at_[hidden]
>>>
>>> On 15 Aug 2011, at 1:01 PM, Matthew Russell wrote:
>>>
>>>
>>>
>>> I hope this problem merits being posted here.
>>>
>>> On OS X (Snow Leopard, and Lion), I cannot seem to build Open MPI.
>>>
>>> After a lot of building, I get the error:
>>>
>>> /bin/sh ../../../libtool --tag=CC --mode=link
>>> /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend -V
>>> -export-dynamic -o orte-clean orte-clean.o
>>> ../../../orte/libopen-rte.la -lutil
>>> libtool: link: /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend
>>> -V -o orte-clean orte-clean.o ../../../orte/.libs/libopen-rte.a
>>> /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
>>> Undefined symbols for architecture x86_64:
>>> "_orte_odls", referenced from:
>>> _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
>>> ld: symbol(s) not found for architecture x86_64
>>>
>>> This is with the PGI 10.9 compiler and OpenMPI 1.4.3; the platform is x86_64.
>>>
>>> The README does not list PGI as a compiler that OpenMPI was tested with,
>>> and there are notes about its support for XGrid being broken (I'm not sure
>>> if this is related.)
>>>
>>> I seem to get the error regardless of which configure flags I'm using,
>>> just for completeness though, here are the flags I am using:
>>> ./configure --prefix=/usr/local/openmpi_pg --enable-mpi-f77
>>> --enable-mpi-f90 --with-memory-manager=none
>>>
>>> Has anyone else got or fixed this error?
>>>
>>> I looked at other postings in this list, such as
>>> http://www.open-mpi.org/community/lists/devel/2007/05/1590.php , but
>>> they didn't help much.
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel