Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] 1.5rc5 has been posted
From: Larry Baker (baker_at_[hidden])
Date: 2010-08-30 18:40:00


OpenMPI 1.5rc5 fails in opal/tools/wrappers for PGI 10.3 (see http://www.open-mpi.org/community/lists/devel/2010/08/8312.php):

> Making all in tools/wrappers
> make[2]: Entering directory `/usr/local/src/openmpi-1.5rc5/opal/tools/wrappers'
> CC opal_wrapper.o
> CCLD opal_wrapper
> ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_create'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `assert'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_mutex_trylock'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_atfork'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_join'
> make[2]: *** [opal_wrapper] Error 2
> make[2]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal/tools/wrappers'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal'
> make: *** [all-recursive] Error 1

OpenMPI 1.4.2 does not have this problem. After the make for OpenMPI
1.4.2, I rm'd opal_wrapper and compared the make commands that are
issued for 1.4.2:

> [root_at_hydra wrappers]# cd opal/tools/wrappers
> [root_at_hydra wrappers]# ls
> CMakeLists.txt generic_wrapper.1in Makefile Makefile.in opalcc-wrapper-data.txt.in opalc++-wrapper-data.txt.in opal_wrapper.1 opal_wrapper.c
> generic_wrapper.1 help-opal-wrapper.txt Makefile.am opalcc-wrapper-data.txt opalc++-wrapper-data.txt opal_wrapper opal_wrapper.1in opal_wrapper.o
> [root_at_hydra wrappers]# rm opal_wrapper
> rm: remove regular file `opal_wrapper'? y
> [root_at_hydra wrappers]# make -n
> rm -f opal_wrapper
> /bin/sh ../../../libtool --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil -lpthread

I see that -lpthread is missing in the 1.5rc5 build:

> [root_at_hydra wrappers]# cd opal/tools/wrappers
> [root_at_hydra wrappers]# ls
> CMakeLists.txt help-opal-wrapper.txt Makefile.am opalcc-wrapper-data.txt opalc++-wrapper-data.txt opal.pc opal_wrapper.1in opal_wrapper.o
> generic_wrapper.1in Makefile Makefile.in opalcc-wrapper-data.txt.in opalc++-wrapper-data.txt.in opal.pc.in opal_wrapper.c
> [root_at_hydra wrappers]# make -n
> rm -f opal_wrapper
> echo " CCLD " opal_wrapper;/bin/sh ../../../libtool --silent --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil
> echo Creating opal_wrapper.1 man page...
> sed -e 's/#PACKAGE_NAME#/Open MPI/g' \
> -e 's/#PACKAGE_VERSION#/1.5rc5/g' \
> -e 's/#OMPI_DATE#/Aug 17, 2010/g' \
> > opal_wrapper.1 < opal_wrapper.1in

That accounts for all the missing pthread_* references. However, when
I manually issue the link command and supply -lpthread, assert is
still undefined:

> [root_at_hydra wrappers]# /bin/sh ../../../libtool --silent --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil -lpthread
> ../../../opal/.libs/libopen-pal.so: undefined reference to `assert'

I get the same result when I cut-and-paste the 1.4.2 link command:

> [root_at_hydra wrappers]# /bin/sh ../../../libtool --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil -lpthread
> libtool: link: pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -o .libs/opal_wrapper opal_wrapper.o -Wl,--export-dynamic ../../../opal/.libs/libopen-pal.so -ldl -lnsl -lutil -lpthread -Wl,-rpath -Wl,/opt/pgi/linux86-64/10.3/openmpi/lib
> ../../../opal/.libs/libopen-pal.so: undefined reference to `assert'

I re-ran the make without my patches, and the assert() reference
disappeared:

> [root_at_hydra openmpi-1.5rc5]# tail make.log
> CCLD opal_wrapper
> ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_create'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_mutex_trylock'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_atfork'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_join'
> make[2]: *** [opal_wrapper] Error 2
> make[2]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal/tools/wrappers'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal'
> make: *** [all-recursive] Error 1

I don't know why -- -DNDEBUG should have eliminated any declarations
from <assert.h>.
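
One possible explanation, which I have not verified: if a source file calls
assert() without <assert.h> in scope, the macro is never defined, so -DNDEBUG
has nothing to remove and the compiler treats assert as an ordinary external
function. That would leave an undefined symbol named assert in
libopen-pal.so. A rough way to check is to list the undefined symbols with
nm. The helper name undefined_syms and the sample input below are made up for
illustration:

```shell
#!/bin/sh
# Hypothetical helper: read `nm -D` output on stdin and print the names of
# undefined symbols (type "U"), dropping any @VERSION suffix.
undefined_syms() {
    awk '$1 == "U" { sub(/@.*/, "", $2); print $2 }'
}

# Made-up sample input in the shape nm prints, for demonstration only.
printf '%s\n' \
    '                 U assert' \
    '                 U pthread_create@GLIBC_2.2.5' \
    '0000000000001000 T some_defined_symbol' | undefined_syms
```

Against the real library this would be run as
nm -D ../../../opal/.libs/libopen-pal.so | undefined_syms | grep -x assert
(path as in the link lines above). A match would mean the library itself
expects a function called assert at link time.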

Manually adding -lpthread makes the link error go away:

> [root_at_hydra openmpi-1.5rc5]# cd opal/tools/wrappers
> [root_at_hydra wrappers]# /bin/sh ../../../libtool --silent --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil -lpthread

It looks like the changes in the opal/tools/wrappers/Makefile
(configure/automake?) from 1.4.2 to 1.5rc5 are not supplying the
pthreads library correctly to the link step.
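
The difference between the two link lines can also be checked mechanically
rather than by eye. This is only a minimal sketch: lib_flags and missing_libs
are hypothetical helper names, not part of Open MPI or libtool, and the two
commands are the ones printed by "make -n" in the 1.4.2 and 1.5rc5 trees.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Open MPI or libtool).

# Print the -l flags found in a link command, one per line, de-duplicated.
lib_flags() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep '^-l' | sort -u
}

# Print every -l flag present in command $1 but absent from command $2.
missing_libs() {
    for flag in $(lib_flags "$1"); do
        case " $2 " in
            *" $flag "*) ;;                  # flag also present in $2
            *) printf '%s\n' "$flag" ;;      # flag missing from $2
        esac
    done
}

# Link commands as printed by "make -n" in the two source trees.
link_142='/bin/sh ../../../libtool --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil -lpthread'
link_15rc5='/bin/sh ../../../libtool --silent --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil'

# Prints: -lpthread
missing_libs "$link_142" "$link_15rc5"
```

The same comparison could be applied to other link steps if more libraries
turn out to be dropped in 1.5rc5.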

Larry Baker
US Geological Survey
650-329-5608
baker_at_[hidden]

On Aug 17, 2010, at 2:18 PM, Jeff Squyres wrote:

> We still have one known possible regression:
>
> https://svn.open-mpi.org/trac/ompi/ticket/2530
>
> But we posted rc5 anyway (there's a bunch of stuff that has been
> pending for a while that is now in). Please test!
>
> http://www.open-mpi.org/software/ompi/v1.5/
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/