Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] undefined references for rdma_get_peer_addr & rdma_get_local_addr
From: Brian Barrett (brbarret_at_[hidden])
Date: 2008-05-04 14:13:50


I think I might see the issue. Jeff, I'm assuming you're using a
developer build of Open MPI with GNU, Intel, or Pathscale compilers,
right? At least someone below was using PGI. The first three
compilers on a developer build have the magic pixie dust arguments
added that make calling an undeclared function an error. PGI, Sun
Workshop, and non-developer builds don't have that pixie dust. So
it's not an error to call an undeclared function in those cases, and
AC_COMPILE_IFELSE won't error out. AC_LINK_IFELSE should always be
used to check for functions for precisely that reason.

Brian

On May 4, 2008, at 11:41 AM, Jeff Squyres (jsquyres) wrote:

> As Steve mentioned, it's inline. But I don't understand how that
> would even compile if it's not in rdma_cma.h. A link check will catch it,
> but I'm still a little uneasy not understanding why it passes the
> compile...
>
> -jms
> Sent from my PDA. No type good.
>
> -----Original Message-----
> From: Pak Lui [mailto:Pak.Lui_at_[hidden]]
> Sent: Sunday, May 04, 2008 11:44 AM Eastern Standard Time
> To: Open MPI Developers
> Subject: Re: [OMPI devel] undefined references
> for rdma_get_peer_addr & rdma_get_local_addr
>
> Jeff Squyres wrote:
> > Jon / Steve -- can you comment?
> >
> > I tested with OFED 1.2.5 (which is what I assume you meant) and got:
> >
> > checking for rdma_get_peer_addr... no
> >
> > Because that function is not defined in OFED 1.2.5. Running with OFED
> > 1.3 (where the function does exist), I get:
> >
> > checking for rdma_get_peer_addr... yes
>
> For me it seems to be running with 1.2.5.
>
> login3% /opt/ofed/bin/ofed_info | head -1
> OFED-1.2.5.5
>
> No rdma_get_peer_addr or rdma_get_local_addr in these .so's, which is
> presumably where they would come from.
>
> login3% ls librdmacm.so*
> librdmacm.so librdmacm.so.1 librdmacm.so.1.0.0 librdmacm.so.1.0.2
>
> login3% nm librdmacm.so* | grep rdma_get_
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
>
> And I don't see rdma_get_peer_addr in
> /opt/ofed/include/rdma/rdma_cma.h either, so I don't know how the
> compiler knows about the interface (and it's not inline there).
>
> >
> > Outside of all the configure complexity, can you write a simple
> > program that calls that function and have it compile and link properly?
>
> These are the references to rdma_get_peer_addr from the config.log:
> 47858 configure:120941: checking for rdma_get_peer_addr
> 47859 configure:120966: pgcc -c -g -D_REENTRANT
> -I/opt/ofed/include conftest.c >&5
> 47860 PGC-W-0155-Pointer value created from a nonlong integral type
> (conftest.c: 412)
> 47861 PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> 47862 configure:120972: $? = 0
> 47863 configure:120987: result: yes
> ...
> 48355 configure:123600: checking for rdma_get_peer_addr
> 48356 configure:123625: pgcc -c -g -D_REENTRANT
> -I/opt/ofed/include conftest.c >&5
> 48357 PGC-W-0155-Pointer value created from a nonlong integral type
> (conftest.c: 423)
> 48358 PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> 48359 configure:123631: $? = 0
> 48360 configure:123646: result: yes
>
> Here's my program, not sure if it's doing it correctly. I am no m4
> expert, so how do I run the ompi_check_openib.m4 independently and see
> the conftest.c??
>
> login3% cat mytest.c
> #include "rdma/rdma_cma.h"
> int main (void) {
> void *ret = (void*) rdma_get_peer_addr((struct rdma_cm_id*)0);
> return 0;
> }
>
> It gives me a warning if I just try to create an object, which is
> what I see in the config.log.
>
> login3% pgcc -c -g -D_REENTRANT -I/opt/ofed/include mytest.c
> PGC-W-0155-Pointer value created from a nonlong integral type
> (mytest.c: 3)
> PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> login3% echo $?
> 0
>
> But trying to create an executable would give me the error.
>
> login3% pgcc -g -D_REENTRANT -I/opt/ofed/include mytest.c -o mytest
> PGC-W-0155-Pointer value created from a nonlong integral type
> (mytest.c: 3)
> PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> /tmp/pgccjF6BryhFmWS.o: In function `main':
> /share/home/00951/paklui/ompi-trunk5/config-data1-debug/mytest.c:3:
> undefined reference to `rdma_get_peer_addr'
>
> Hmm, any clues, comments?
>
> >
> > I suppose we could change the AC_COMPILE_IFELSE in config/
> > ompi_check_openib.m4 to OMPI_LINK_IFELSE, but I'm a little confused as
> > to why it would compile successfully if the symbol rdma_get_peer_addr
> > is not declared anywhere (which it shouldn't be in OFED 1.2 or 1.2.5,
> > AFAIK)...
> >
> >
> >
> > On May 3, 2008, at 10:56 AM, Pak Lui wrote:
> >
> >> Sure Jeff, see attached.
> >>
> >> Jeff Squyres wrote:
> >>> (moving to devel so that others are aware)
> >>> Crud. Can you send me your config.log? I don't know why it's able
> >>> to find rdma_get_peer_addr() in configure, but then later not able
> >>> to find it during the build - I'd like to see what happened
> >>> during configure.
> >>> On May 2, 2008, at 7:09 PM, Pak Lui wrote:
> >>>> Hi Jeff,
> >>>>
> >>>> It seems that the cpc3 merge causes my Ranger build to break. I
> >>>> believe it is using OFED 1.2 but I don't know how to check. It
> >>>> passes the ompi_check_openib.m4 that you added in for the
> >>>> rdma_get_peer_addr. Is there a missing #include for openib/ofed
> >>>> related somewhere?
> >>>>
> >>>>
> >>>> 1236 checking rdma/rdma_cma.h usability... yes
> >>>> 1237 checking rdma/rdma_cma.h presence... yes
> >>>> 1238 checking for rdma/rdma_cma.h... yes
> >>>> 1239 checking for rdma_create_id in -lrdmacm... yes
> >>>> 1240 checking for rdma_get_peer_addr... yes
> >>>>
> >>>>
> >>>> pgCC -DHAVE_CONFIG_H -I. -I../../../../ompi/tools/ompi_info -
> >>>> I../../../opal/include -I../../../orte/include -I../../../ompi/
> >>>> include -I../../../opal/mca/paffinity/linux/plpa/src/libplpa -
> >>>> DOMPI_CONFIGURE_USER="\"paklui\"" -
> >>>> DOMPI_CONFIGURE_HOST="\"login4.ranger.tacc.utexas.edu\"" -
> >>>> DOMPI_CONFIGURE_DATE="\"Fri May 2 17:07:01 CDT 2008\"" -
> >>>> DOMPI_BUILD_USER="\"$USER\"" -DOMPI_BUILD_HOST="\"`hostname`\"" -
> >>>> DOMPI_BUILD_DATE="\"`date`\"" -DOMPI_BUILD_CFLAGS="\"-O -DNDEBUG
> >>>> \"" -DOMPI_BUILD_CPPFLAGS="\"-I../../../.. -I../../.. -
> >>>> I../../../../ opal/include -I../../../../orte/include -
> >>>> I../../../../ompi/include - D_REENTRANT\"" -
> >>>> DOMPI_BUILD_CXXFLAGS="\"-O -DNDEBUG \"" -
> >>>> DOMPI_BUILD_CXXCPPFLAGS="\"-I../../../.. -I../../.. -
> I../../../../
> >>>> opal/include -I../../../../orte/include -I../../../../ompi/
> >>>> include - D_REENTRANT\"" -DOMPI_BUILD_FFLAGS="\"\"" -
> >>>> DOMPI_BUILD_FCFLAGS="\"\"" -DOMPI_BUILD_LDFLAGS="\" \"" -
> >>>> DOMPI_BUILD_LIBS="\"-lnsl -lutil -lpthread\"" -
> >>>> DOMPI_CC_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgcc
> >>>> \"" - DOMPI_CXX_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/
> bin/
> >>>> pgCC\"" -DOMPI_F77_ABSOLUTE="\"/opt/apps/pgi/7.1/
> linux86-64/7.1-2/
> >>>> bin/ pgf77\"" -DOMPI_F90_ABSOLUTE="\"/opt/apps/pgi/7.1/
> >>>> linux86-64/7.1-2/ bin/pgf95\"" -DOMPI_F90_BUILD_SIZE="\"small
> \"" -
> >>>> I../../../.. - I../../.. -I../../../../opal/include -
> I../../../../
> >>>> orte/include - I../../../../ompi/include -D_REENTRANT -O -
> >>>> DNDEBUG -c -o version.o ../../../../ompi/tools/ompi_info/
> >>>> version.cc
> >>>> /bin/sh ../../../libtool --tag=CXX --mode=link pgCC -O -
> DNDEBUG
> >>>> - o ompi_info components.o ompi_info.o output.o param.o
> >>>> version.o ../../../ompi/libmpi.la -lnsl -lutil -lpthread
> >>>> libtool: link: pgCC -O -DNDEBUG -o .libs/ompi_info components.o
> >>>> ompi_info.o output.o param.o version.o ../../../ompi/.libs/
> >>>> libmpi.so -L/opt/ofed/lib64 -libcm -lrdmacm -libverbs -lrt /
> share/
> >>>> home/00951/paklui/ompi-trunk5/config-data1/orte/.libs/libopen-
> >>>> rte.so /share/home/00951/paklui/ompi-trunk5/config-data1/
> >>>> opal/.libs/ libopen-pal.so -lnuma -ldl -lnsl -lutil -lpthread -
> >>>> Wl,--rpath -Wl,/ share/home/00951/paklui/ompi-trunk5/shared-
> >>>> install1/lib
> >>>>
> >>>> [1] Exit 2 make install >&
> >>>> make.install.log.0
> >>>> ../../../ompi/.libs/libmpi.so: undefined reference to
> >>>> `rdma_get_peer_addr'
> >>>> ../../../ompi/.libs/libmpi.so: undefined reference to
> >>>> `rdma_get_local_addr'
> >>>> make[2]: *** [ompi_info] Error 2
> >>>> make[2]: Leaving directory `/share/home/00951/paklui/ompi-trunk5/
> >>>> config-data1/ompi/tools/ompi_info'
> >>>> make[1]: *** [install-recursive] Error 1
> >>>> make[1]: Leaving directory `/share/home/00951/paklui/ompi-trunk5/
> >>>> config-data1/ompi'
> >>>> make: *** [install-recursive] Error 1
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>>
> >>>> - Pak Lui
> >>>> pak.lui_at_[hidden]
> >>
> >> --
> >>
> >>
> >> - Pak Lui
> >> pak.lui_at_[hidden]
> >> <config.log.bz2><mime-attachment.txt>
> >
> >
>
>
> --
>
>
> - Pak Lui
> pak.lui_at_[hidden]
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
   Brian Barrett
   Open MPI developer
   http://www.open-mpi.org/