Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] undefined references for rdma_get_peer_addr & rdma_get_local_addr
From: Pak Lui (Pak.Lui_at_[hidden])
Date: 2008-05-04 12:23:32


Hmm, so either setting up a totally new workspace or replacing the check
with OMPI_LINK_IFELSE gets me the right configure result. I think the
latter is the fix for my problem. I assume make all should work now;
I'll let you know if it doesn't...

   configure:123602: checking for rdma_get_peer_addr
   configure:123627: pgcc -o conftest -g -D_REENTRANT -I/opt/ofed/include -L/opt/ofed/lib64 conftest.c -lnsl -lutil -lpthread -libverbs >&5
   conftest.c:
   PGC-W-0155-Pointer value created from a nonlong integral type (conftest.c: 423)
   PGC/x86-64 Linux 7.1-2: compilation completed with warnings
   conftest.o: In function `main':
   /share/home/00951/paklui/ompi-trunk5/config-data2-debug/conftest.c:423: undefined reference to `rdma_get_peer_addr'
   configure:123633: $? = 2
   configure: failed program was:
   | /* confdefs.h. */
   | #define PACKAGE_NAME "Open MPI"
   ...
   | #define HAVE_STRUCT_IBV_DEVICE_TRANSPORT_TYPE 1
   | #define HAVE_RDMA_RDMA_CMA_H 1
   | /* end confdefs.h. */
   | #include "rdma/rdma_cma.h"
   |
   | int
   | main ()
   | {
   | void *ret = (void*) rdma_get_peer_addr((struct rdma_cm_id*)0);
   | ;
   | return 0;
   | }
   configure:123650: result: no
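
A likely reason the earlier compile-only check said "yes": in C89, calling
a function that has no visible declaration implicitly declares it as
returning int. That would explain the PGC-W-0155 warning about the cast to
void*, and it means nothing fails until link time. As a rough sketch of a
compile-only probe that avoids this, one can name the function without
calling it, because an undeclared identifier that is not followed by "("
is a compile error. The example uses strlen from <string.h> as a stand-in
for rdma_get_peer_addr (so it builds without OFED installed), and the names
strlen_fn, probe_strlen and probed_length are made up for illustration.
This is not what ompi_check_openib.m4 does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function-pointer type matching strlen's prototype. */
typedef size_t (*strlen_fn)(const char *);

/* Name the function without calling it. C89 only invents an implicit
 * declaration for an undeclared identifier that is directly called, so
 * this return statement fails to compile if no header declares strlen. */
static strlen_fn probe_strlen(void)
{
    return strlen;
}

/* Call through the probed pointer so the reference is actually used. */
size_t probed_length(const char *s)
{
    return probe_strlen()(s);
}
```

Replacing strlen with a name the header does not declare should make the
compile step itself fail. A link test is still the stronger check, since
it also confirms that the library exports the symbol.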

Pak Lui wrote:
> For sanity's sake I also checked LD_LIBRARY_PATH; there doesn't seem to
> be anything suspicious there either...
>
> login3% echo $LD_LIBRARY_PATH
> /opt/apps/pgi/7.1/linux86-64/7.1-2/libso:/opt/gsi-openssh-4.1/lib:/opt/gsi-openssh-4.1/lib:/opt/apps/binutils-amd/070220/lib64
>
> I am trying Jeff's suggestion to replace OMPI_COMPILE_IFELSE with
> OMPI_LINK_IFELSE. Will let you know.
>
> Pak Lui wrote:
>> Jeff Squyres wrote:
>>> Jon / Steve -- can you comment?
>>>
>>> I tested with OFED 1.2.5 (which is what I assume you meant) and got:
>>>
>>> checking for rdma_get_peer_addr... no
>>>
>>> Because that function is not defined in OFED 1.2.5. Running with OFED
>>> 1.3 (where the function does exist), I get:
>>>
>>> checking for rdma_get_peer_addr... yes
>> For me it seems to be running with 1.2.5.
>>
>> login3% /opt/ofed/bin/ofed_info | head -1
>> OFED-1.2.5.5
>>
>> There is no rdma_get_peer_addr or rdma_get_local_addr in these .so's,
>> which is presumably where they would come from.
>>
>> login3% ls librdmacm.so*
>> librdmacm.so librdmacm.so.1 librdmacm.so.1.0.0 librdmacm.so.1.0.2
>>
>> login3% nm librdmacm.so* | grep rdma_get_
>> 0000000000003470 T rdma_get_cm_event
>> 0000000000001a20 T rdma_get_devices
>> 0000000000003470 T rdma_get_cm_event
>> 0000000000001a20 T rdma_get_devices
>> 0000000000003470 T rdma_get_cm_event
>> 0000000000001a20 T rdma_get_devices
>> 0000000000003470 T rdma_get_cm_event
>> 0000000000001a20 T rdma_get_devices
>>
>> And I don't see rdma_get_peer_addr in
>> /opt/ofed/include/rdma/rdma_cma.h either, so I don't know how the
>> compiler would know about the interface (it's not an inline function
>> there either).
>>
>>> Outside of all the configure complexity, can you write a simple
>>> program that calls that function and have it compile and link properly?
>> These are the references to rdma_get_peer_addr in the config.log:
>> configure:120941: checking for rdma_get_peer_addr
>> configure:120966: pgcc -c -g -D_REENTRANT -I/opt/ofed/include conftest.c >&5
>> PGC-W-0155-Pointer value created from a nonlong integral type (conftest.c: 412)
>> PGC/x86-64 Linux 7.1-2: compilation completed with warnings
>> configure:120972: $? = 0
>> configure:120987: result: yes
>> ...
>> configure:123600: checking for rdma_get_peer_addr
>> configure:123625: pgcc -c -g -D_REENTRANT -I/opt/ofed/include conftest.c >&5
>> PGC-W-0155-Pointer value created from a nonlong integral type (conftest.c: 423)
>> PGC/x86-64 Linux 7.1-2: compilation completed with warnings
>> configure:123631: $? = 0
>> configure:123646: result: yes
>>
>> Here's my program; I'm not sure it's doing the check correctly. I am
>> no m4 expert, so how do I run ompi_check_openib.m4 independently and
>> see the conftest.c?
>>
>> login3% cat mytest.c
>> #include "rdma/rdma_cma.h"
>> int main (void) {
>> void *ret = (void*) rdma_get_peer_addr((struct rdma_cm_id*)0);
>> return 0;
>> }
>>
>> It gives me a warning if I just try to create an object, which is what I
>> see in the config.log.
>>
>> login3% pgcc -c -g -D_REENTRANT -I/opt/ofed/include mytest.c
>> PGC-W-0155-Pointer value created from a nonlong integral type (mytest.c: 3)
>> PGC/x86-64 Linux 7.1-2: compilation completed with warnings
>> login3% echo $?
>> 0
>>
>> But trying to create an executable would give me the error.
>>
>> login3% pgcc -g -D_REENTRANT -I/opt/ofed/include mytest.c -o mytest
>> PGC-W-0155-Pointer value created from a nonlong integral type (mytest.c: 3)
>> PGC/x86-64 Linux 7.1-2: compilation completed with warnings
>> /tmp/pgccjF6BryhFmWS.o: In function `main':
>> /share/home/00951/paklui/ompi-trunk5/config-data1-debug/mytest.c:3:
>> undefined reference to `rdma_get_peer_addr'
>>
>> Hmm, any clues, comments?
>>
>>> I suppose we could change the AC_COMPILE_IFELSE in
>>> config/ompi_check_openib.m4 to OMPI_LINK_IFELSE, but I'm a little
>>> confused as to why it would compile successfully if the symbol
>>> rdma_get_peer_addr is not declared anywhere (which it shouldn't be in
>>> OFED 1.2 or 1.2.5, AFAIK)...
>>>
>>>
>>>
>>> On May 3, 2008, at 10:56 AM, Pak Lui wrote:
>>>
>>>> Sure Jeff, see attached.
>>>>
>>>> Jeff Squyres wrote:
>>>>> (moving to devel so that others are aware)
>>>>> Crud. Can you send me your config.log? I don't know why it's able
>>>>> to find rdma_get_peer_addr() in configure, but then later not able
>>>>> to find it during the build - I'd like to see what happened
>>>>> during configure.
>>>>> On May 2, 2008, at 7:09 PM, Pak Lui wrote:
>>>>>> Hi Jeff,
>>>>>>
>>>>>> It seems that the cpc3 merge causes my Ranger build to break. I
>>>>>> believe it is using OFED 1.2 but I don't know how to check. It
>>>>>> passes the ompi_check_openib.m4 that you added in for the
>>>>>> rdma_get_peer_addr. Is there a missing #include for openib/ofed
>>>>>> related somewhere?
>>>>>>
>>>>>>
>>>>>> checking rdma/rdma_cma.h usability... yes
>>>>>> checking rdma/rdma_cma.h presence... yes
>>>>>> checking for rdma/rdma_cma.h... yes
>>>>>> checking for rdma_create_id in -lrdmacm... yes
>>>>>> checking for rdma_get_peer_addr... yes
>>>>>>
>>>>>>
>>>>>> pgCC -DHAVE_CONFIG_H -I. -I../../../../ompi/tools/ompi_info
>>>>>> -I../../../opal/include -I../../../orte/include
>>>>>> -I../../../ompi/include
>>>>>> -I../../../opal/mca/paffinity/linux/plpa/src/libplpa
>>>>>> -DOMPI_CONFIGURE_USER="\"paklui\""
>>>>>> -DOMPI_CONFIGURE_HOST="\"login4.ranger.tacc.utexas.edu\""
>>>>>> -DOMPI_CONFIGURE_DATE="\"Fri May 2 17:07:01 CDT 2008\""
>>>>>> -DOMPI_BUILD_USER="\"$USER\"" -DOMPI_BUILD_HOST="\"`hostname`\""
>>>>>> -DOMPI_BUILD_DATE="\"`date`\"" -DOMPI_BUILD_CFLAGS="\"-O -DNDEBUG \""
>>>>>> -DOMPI_BUILD_CPPFLAGS="\"-I../../../.. -I../../.. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -D_REENTRANT\""
>>>>>> -DOMPI_BUILD_CXXFLAGS="\"-O -DNDEBUG \""
>>>>>> -DOMPI_BUILD_CXXCPPFLAGS="\"-I../../../.. -I../../.. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -D_REENTRANT\""
>>>>>> -DOMPI_BUILD_FFLAGS="\"\"" -DOMPI_BUILD_FCFLAGS="\"\""
>>>>>> -DOMPI_BUILD_LDFLAGS="\" \""
>>>>>> -DOMPI_BUILD_LIBS="\"-lnsl -lutil -lpthread\""
>>>>>> -DOMPI_CC_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgcc\""
>>>>>> -DOMPI_CXX_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgCC\""
>>>>>> -DOMPI_F77_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgf77\""
>>>>>> -DOMPI_F90_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgf95\""
>>>>>> -DOMPI_F90_BUILD_SIZE="\"small\"" -I../../../.. -I../../..
>>>>>> -I../../../../opal/include -I../../../../orte/include
>>>>>> -I../../../../ompi/include -D_REENTRANT -O -DNDEBUG -c -o version.o
>>>>>> ../../../../ompi/tools/ompi_info/version.cc
>>>>>> /bin/sh ../../../libtool --tag=CXX --mode=link pgCC -O -DNDEBUG
>>>>>> -o ompi_info components.o ompi_info.o output.o param.o version.o
>>>>>> ../../../ompi/libmpi.la -lnsl -lutil -lpthread
>>>>>> libtool: link: pgCC -O -DNDEBUG -o .libs/ompi_info components.o
>>>>>> ompi_info.o output.o param.o version.o
>>>>>> ../../../ompi/.libs/libmpi.so -L/opt/ofed/lib64 -libcm -lrdmacm
>>>>>> -libverbs -lrt
>>>>>> /share/home/00951/paklui/ompi-trunk5/config-data1/orte/.libs/libopen-rte.so
>>>>>> /share/home/00951/paklui/ompi-trunk5/config-data1/opal/.libs/libopen-pal.so
>>>>>> -lnuma -ldl -lnsl -lutil -lpthread -Wl,--rpath
>>>>>> -Wl,/share/home/00951/paklui/ompi-trunk5/shared-install1/lib
>>>>>>
>>>>>> [1] Exit 2 make install >& make.install.log.0
>>>>>> ../../../ompi/.libs/libmpi.so: undefined reference to `rdma_get_peer_addr'
>>>>>> ../../../ompi/.libs/libmpi.so: undefined reference to `rdma_get_local_addr'
>>>>>> make[2]: *** [ompi_info] Error 2
>>>>>> make[2]: Leaving directory `/share/home/00951/paklui/ompi-trunk5/config-data1/ompi/tools/ompi_info'
>>>>>> make[1]: *** [install-recursive] Error 1
>>>>>> make[1]: Leaving directory `/share/home/00951/paklui/ompi-trunk5/config-data1/ompi'
>>>>>> make: *** [install-recursive] Error 1
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> - Pak Lui
>>>>>> pak.lui_at_[hidden]
>>>> --
>>>>
>>>>
>>>> - Pak Lui
>>>> pak.lui_at_[hidden]
>>>> <config.log.bz2><mime-attachment.txt>
>>
>
>

-- 
- Pak Lui
pak.lui_at_[hidden]