Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] undefined references for rdma_get_peer_addr & rdma_get_local_addr
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2008-05-04 17:23:26


Coolio. Pak - go ahead and commit if you haven't already done so.

-jms
Sent from my PDA. No type good.

 -----Original Message-----
From: Brian Barrett [mailto:brbarret_at_[hidden]]
Sent: Sunday, May 04, 2008 02:14 PM Eastern Standard Time
To: Open MPI Developers
Subject: Re: [OMPI devel] undefined references for rdma_get_peer_addr & rdma_get_local_addr

I think I might see the issue. Jeff, I'm assuming you're using a
developer build of Open MPI with GNU, Intel, or Pathscale compilers,
right? At least someone below was using PGI. The first three
compilers on a developer build have the magic pixie dust arguments
added that make calling an undeclared function an error. PGI, Sun
Workshop, and non-developer builds don't have that pixie dust. So
it's not an error to call an undeclared function in those cases, and
AC_COMPILE_IFELSE won't error out. AC_LINK_IFELSE should always be
used to check for functions for precisely that reason.

Brian
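
A minimal sketch of the link-based check described above, assuming
CPPFLAGS, LDFLAGS, and LIBS already point at the OFED install. The
shell variable and the preprocessor macro names are hypothetical and
are not taken from config/ompi_check_openib.m4; the test body mirrors
the mytest.c program quoted further down.

```m4
dnl Sketch only: example_have_rdma_get_peer_addr and
dnl EXAMPLE_HAVE_RDMA_GET_PEER_ADDR are hypothetical names.
dnl Assumes CPPFLAGS/LDFLAGS/LIBS were already set up for OFED.
AC_MSG_CHECKING([for rdma_get_peer_addr])
AC_LINK_IFELSE(
    [AC_LANG_PROGRAM(
        [[#include <rdma/rdma_cma.h>]],
        [[void *ret = (void *) rdma_get_peer_addr((struct rdma_cm_id *) 0);
          (void) ret;]])],
    [example_have_rdma_get_peer_addr=yes],
    [example_have_rdma_get_peer_addr=no])
AC_MSG_RESULT([$example_have_rdma_get_peer_addr])
AS_IF([test "$example_have_rdma_get_peer_addr" = "yes"],
    [AC_DEFINE([EXAMPLE_HAVE_RDMA_GET_PEER_ADDR], [1],
        [Define to 1 if rdma_get_peer_addr can be linked (hypothetical name)])])
```

Because the test program is linked rather than only compiled, a
compiler that silently accepts the undeclared call should still fail
at link time when neither the header nor the library provides the
function. Including the header matters: if the function is only an
inline in rdma_cma.h, as mentioned in this thread, a check that
supplies its own prototype would not find it. If only the declaration
is of interest, AC_CHECK_DECLS with the same header is another option.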

On May 4, 2008, at 11:41 AM, Jeff Squyres (jsquyres) wrote:

> As Steve mentioned, it's inline. But I don't understand how that
> would even compile if it's not in rdma_cma.h. A link test will catch
> it, but I'm still a little uneasy not understanding why it passes the
> compile...
>
> -jms
> Sent from my PDA. No type good.
>
> -----Original Message-----
> From: Pak Lui [mailto:Pak.Lui_at_[hidden]]
> Sent: Sunday, May 04, 2008 11:44 AM Eastern Standard Time
> To: Open MPI Developers
> Subject: Re: [OMPI devel] undefined references for
> rdma_get_peer_addr & rdma_get_local_addr
>
> Jeff Squyres wrote:
> > Jon / Steve -- can you comment?
> >
> > I tested with OFED 1.2.5 (which is what I assume you meant) and got:
> >
> > checking for rdma_get_peer_addr... no
> >
> > Because that function is not defined in OFED 1.2.5. Running with OFED
> > 1.3 (where the function does exist), I get:
> >
> > checking for rdma_get_peer_addr... yes
>
> For me it seems to be running with 1.2.5.
>
> login3% /opt/ofed/bin/ofed_info | head -1
> OFED-1.2.5.5
>
> There is no rdma_get_peer_addr or rdma_get_local_addr in these .so's,
> assuming that is where they would come from.
>
> login3% ls librdmacm.so*
> librdmacm.so librdmacm.so.1 librdmacm.so.1.0.0 librdmacm.so.1.0.2
>
> login3% nm librdmacm.so* | grep rdma_get_
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
>
> And I don't see rdma_get_peer_addr in
> /opt/ofed/include/rdma/rdma_cma.h either, so I don't know how the
> compiler would know about the interface (it's not inline there).
>
> >
> > Outside of all the configure complexity, can you write a simple
> > program that calls that function and have it compile and link properly?
>
> These are the references to rdma_get_peer_addr from the config.log:
> 47858 configure:120941: checking for rdma_get_peer_addr
> 47859 configure:120966: pgcc -c -g -D_REENTRANT -I/opt/ofed/include conftest.c >&5
> 47860 PGC-W-0155-Pointer value created from a nonlong integral type (conftest.c: 412)
> 47861 PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> 47862 configure:120972: $? = 0
> 47863 configure:120987: result: yes
> ...
> 48355 configure:123600: checking for rdma_get_peer_addr
> 48356 configure:123625: pgcc -c -g -D_REENTRANT -I/opt/ofed/include conftest.c >&5
> 48357 PGC-W-0155-Pointer value created from a nonlong integral type (conftest.c: 423)
> 48358 PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> 48359 configure:123631: $? = 0
> 48360 configure:123646: result: yes
>
> Here's my program, not sure if it's doing it correctly. I am no m4
> expert, so how do I run the ompi_check_openib.m4 independently and see
> the conftest.c??
>
> login3% cat mytest.c
> #include "rdma/rdma_cma.h"
> int main (void) {
> void *ret = (void*) rdma_get_peer_addr((struct rdma_cm_id*)0);
> return 0;
> }
>
> It gives me a warning if I just try to create an object, which is
> what I see in the config.log.
>
> login3% pgcc -c -g -D_REENTRANT -I/opt/ofed/include mytest.c
> PGC-W-0155-Pointer value created from a nonlong integral type
> (mytest.c: 3)
> PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> login3% echo $?
> 0
>
> But trying to create an executable would give me the error.
>
> login3% pgcc -g -D_REENTRANT -I/opt/ofed/include mytest.c -o mytest
> PGC-W-0155-Pointer value created from a nonlong integral type
> (mytest.c: 3)
> PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> /tmp/pgccjF6BryhFmWS.o: In function `main':
> /share/home/00951/paklui/ompi-trunk5/config-data1-debug/mytest.c:3:
> undefined reference to `rdma_get_peer_addr'
>
> Hmm, any clues, comments?
>
> >
> > I suppose we could change the AC_COMPILE_IFELSE in
> > config/ompi_check_openib.m4 to OMPI_LINK_IFELSE, but I'm a little
> > confused as to why it would compile successfully if the symbol
> > rdma_get_peer_addr is not declared anywhere (which it shouldn't be in
> > OFED 1.2 or 1.2.5, AFAIK)...
> >
> >
> >
> > On May 3, 2008, at 10:56 AM, Pak Lui wrote:
> >
> >> Sure Jeff, see attached.
> >>
> >> Jeff Squyres wrote:
> >>> (moving to devel so that others are aware)
> >>> Crud. Can you send me your config.log? I don't know why it's able
> >>> to find rdma_get_peer_addr() in configure, but then later not able
> >>> to find it during the build - I'd like to see what happened
> >>> during configure.
> >>> On May 2, 2008, at 7:09 PM, Pak Lui wrote:
> >>>> Hi Jeff,
> >>>>
> >>>> It seems that the cpc3 merge causes my Ranger build to break. I
> >>>> believe it is using OFED 1.2 but I don't know how to check. It
> >>>> passes the ompi_check_openib.m4 that you added in for the
> >>>> rdma_get_peer_addr. Is there a missing #include for openib/ofed
> >>>> related somewhere?
> >>>>
> >>>>
> >>>> 1236 checking rdma/rdma_cma.h usability... yes
> >>>> 1237 checking rdma/rdma_cma.h presence... yes
> >>>> 1238 checking for rdma/rdma_cma.h... yes
> >>>> 1239 checking for rdma_create_id in -lrdmacm... yes
> >>>> 1240 checking for rdma_get_peer_addr... yes
> >>>>
> >>>>
> >>>> pgCC -DHAVE_CONFIG_H -I. -I../../../../ompi/tools/ompi_info -
> >>>> I../../../opal/include -I../../../orte/include -I../../../ompi/
> >>>> include -I../../../opal/mca/paffinity/linux/plpa/src/libplpa -
> >>>> DOMPI_CONFIGURE_USER="\"paklui\"" -
> >>>> DOMPI_CONFIGURE_HOST="\"login4.ranger.tacc.utexas.edu\"" -
> >>>> DOMPI_CONFIGURE_DATE="\"Fri May 2 17:07:01 CDT 2008\"" -
> >>>> DOMPI_BUILD_USER="\"$USER\"" -DOMPI_BUILD_HOST="\"`hostname`\"" -
> >>>> DOMPI_BUILD_DATE="\"`date`\"" -DOMPI_BUILD_CFLAGS="\"-O -DNDEBUG
> >>>> \"" -DOMPI_BUILD_CPPFLAGS="\"-I../../../.. -I../../.. -
> >>>> I../../../../ opal/include -I../../../../orte/include -
> >>>> I../../../../ompi/include - D_REENTRANT\"" -
> >>>> DOMPI_BUILD_CXXFLAGS="\"-O -DNDEBUG \"" -
> >>>> DOMPI_BUILD_CXXCPPFLAGS="\"-I../../../.. -I../../.. -
> I../../../../
> >>>> opal/include -I../../../../orte/include -I../../../../ompi/
> >>>> include - D_REENTRANT\"" -DOMPI_BUILD_FFLAGS="\"\"" -
> >>>> DOMPI_BUILD_FCFLAGS="\"\"" -DOMPI_BUILD_LDFLAGS="\" \"" -
> >>>> DOMPI_BUILD_LIBS="\"-lnsl -lutil -lpthread\"" -
> >>>> DOMPI_CC_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgcc
> >>>> \"" - DOMPI_CXX_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/
> bin/
> >>>> pgCC\"" -DOMPI_F77_ABSOLUTE="\"/opt/apps/pgi/7.1/
> linux86-64/7.1-2/
> >>>> bin/ pgf77\"" -DOMPI_F90_ABSOLUTE="\"/opt/apps/pgi/7.1/
> >>>> linux86-64/7.1-2/ bin/pgf95\"" -DOMPI_F90_BUILD_SIZE="\"small
> \"" -
> >>>> I../../../.. - I../../.. -I../../../../opal/include -
> I../../../../
> >>>> orte/include - I../../../../ompi/include -D_REENTRANT -O -
> >>>> DNDEBUG -c -o version.o ../../../../ompi/tools/ompi_info/
> >>>> version.cc
> >>>> /bin/sh ../../../libtool --tag=CXX --mode=link pgCC -O -
> DNDEBUG
> >>>> - o ompi_info components.o ompi_info.o output.o param.o
> >>>> version.o ../../../ompi/libmpi.la -lnsl -lutil -lpthread
> >>>> libtool: link: pgCC -O -DNDEBUG -o .libs/ompi_info components.o
> >>>> ompi_info.o output.o param.o version.o ../../../ompi/.libs/
> >>>> libmpi.so -L/opt/ofed/lib64 -libcm -lrdmacm -libverbs -lrt /
> share/
> >>>> home/00951/paklui/ompi-trunk5/config-data1/orte/.libs/libopen-
> >>>> rte.so /share/home/00951/paklui/ompi-trunk5/config-data1/
> >>>> opal/.libs/ libopen-pal.so -lnuma -ldl -lnsl -lutil -lpthread -
> >>>> Wl,--rpath -Wl,/ share/home/00951/paklui/ompi-trunk5/shared-
> >>>> install1/lib
> >>>>
> >>>> [1] Exit 2 make install >&
> >>>> make.install.log.0
> >>>> ../../../ompi/.libs/libmpi.so: undefined reference to
> >>>> `rdma_get_peer_addr'
> >>>> ../../../ompi/.libs/libmpi.so: undefined reference to
> >>>> `rdma_get_local_addr'
> >>>> make[2]: *** [ompi_info] Error 2
> >>>> make[2]: Leaving directory `/share/home/00951/paklui/ompi-trunk5/
> >>>> config-data1/ompi/tools/ompi_info'
> >>>> make[1]: *** [install-recursive] Error 1
> >>>> make[1]: Leaving directory `/share/home/00951/paklui/ompi-trunk5/
> >>>> config-data1/ompi'
> >>>> make: *** [install-recursive] Error 1
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>>
> >>>> - Pak Lui
> >>>> pak.lui_at_[hidden]
> >>
> >> --
> >>
> >>
> >> - Pak Lui
> >> pak.lui_at_[hidden]
> >> <config.log.bz2><mime-attachment.txt>
> >
> >
>
>
> --
>
>
> - Pak Lui
> pak.lui_at_[hidden]
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>

-- 
   Brian Barrett
   Open MPI developer
   http://www.open-mpi.org/