FWIW: I can build it fine without setting any of the CC... flags on LANL's Cray XE6, and mpicc worked just fine for me once built that way.

So I'm not quite sure I understand the "mpicc is completely borked in the trunk". Can you elaborate?

On Jan 25, 2013, at 3:59 PM, Paul Hargrove <phhargrove@lbl.gov> wrote:

Nathan,

The 2nd and 3rd non-blank lines of my original post:
Given that it is INTENDED to be API-compatible with the XE series, I began configuring with
    CC=cc CXX=CC FC=ftn --with-platform=lanl/cray_xe6/optimized-nopanasas

So, I am surprised that nobody objected before now to my use of the Cray-provided wrapper compilers.
I mistakenly believed that if I don't use them then I wouldn't get through configure w/ ugni and alps support.
However, I've just now completed configure w/o setting CC, CXX, FC and see the expected components.
I'll report more from this build later ("make all" is running now).

I would appreciate (perhaps off-list) receiving any module or platform file or additional instructions that may be appropriate to building on a Cray XE, XK or XC system.

Getting OMPI running on our XC30 is of exactly ZERO importance beyond my own edification.
So, I am likely to stop fighting this battle soon.

-Paul


On Fri, Jan 25, 2013 at 3:21 PM, Nathan Hjelm <hjelmn@lanl.gov> wrote:
Hmm, I see mpicc in there. It will use the compiler directly instead of Cray's wrappers. We didn't want Open MPI's wrapper linking in MPT after all ;). mpicc is completely borked in the trunk.

If you want to use the Cray wrappers with Open MPI I can give you a module file that sets up the environment correctly (link against -lmpi not -lmpich, etc).

-Nathan

On Fri, Jan 25, 2013 at 03:10:37PM -0800, Paul Hargrove wrote:
> Nathan,
>
> Cray's "cc" wrapper is adding xpmem, ugni, pmi, alps and others already:
>
> $ cc -v hello.c 2>&1 | grep collect
> >  /opt/gcc/4.7.2/snos/libexec/gcc/x86_64-suse-linux/4.7.2/collect2
> > --sysroot= -m elf_x86_64 -static -u pthread_mutex_trylock -u
> > pthread_mutex_destroy -u pthread_create /usr/lib/../lib64/crt1.o
> > /usr/lib/../lib64/crti.o
> > /opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/crtbeginT.o
> > -L/opt/cray/udreg/2.3.2-1.0500.5931.3.1.ari/lib64
> > -L/opt/cray/ugni/4.0-1.0500.5836.7.58.ari/lib64
> > -L/opt/cray/pmi/4.0.0-1.0000.9282.69.4.ari/lib64
> > -L/opt/cray/dmapp/4.0.1-1.0500.5932.6.5.ari/lib64
> > -L/opt/cray/xpmem/0.1-2.0500.36799.3.6.ari/lib64
> > -L/opt/cray/alps/5.0.1-2.0500.7663.1.1.ari/lib64
> > -L/opt/cray/rca/1.0.0-2.0500.37705.3.12.ari/lib64
> > -L/opt/cray/mpt/5.6.0/gni/mpich2-gnu/47/lib
> > -L/opt/cray/mpt/5.6.0/gni/sma/lib64
> > -L/opt/cray/libsci/12.0.00/gnu/47/sandybridge/lib
> > -L/opt/cray/alps/5.0.1-2.0500.7663.1.1.ari/lib64
> > -L/opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2
> > -L/opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/../../../../lib64
> > -L/lib/../lib64 -L/usr/lib/../lib64
> > -L/opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/../../..
> > /scratch1/scratchdirs/hargrove/ccQ1f0sx.o -lrca -L/opt/cray/atp/1.6.0/lib/
> > --undefined=_ATP_Data_Globals --undefined=__atpHandlerInstall
> > -lAtpSigHCommData -lAtpSigHandler --start-group -lgfortran -lscicpp_gnu
> > -lsci_gnu_mp -lstdc++ -lgfortran -lmpich_gnu_47 -lmpl -lrt -lsma -lxpmem
> > -ldmapp -lugni -lpmi -lalpslli -lalpsutil -lalps -ludreg -lpthread -lm
> > --end-group -lgomp -lpthread --start-group -lgcc -lgcc_eh -lc --end-group
> > /opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/crtend.o
> > /usr/lib/../lib64/crtn.o
>
>
> -Paul
>
>
> On Fri, Jan 25, 2013 at 2:46 PM, Nathan Hjelm <hjelmn@lanl.gov> wrote:
>
> > Something is wrong with the wrappers. A number of libraries (-lxpmem,
> > -lugni, etc) are missing from libs_static. Might be a similar issue with the
> > missing -llustreapi. Going to create a critical bug to track this issue.
> >
> > Works in 1.7 :-/ ... If you add -lnuma to libs_static in
> > mpicc-wrapper-data.txt.
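The workaround above amounts to appending one flag to the libs_static line of the wrapper data file. A minimal shell sketch of that edit follows. The helper name add_static_lib is hypothetical, and the sketch assumes the file holds one key=value pair per line; check the actual mpicc-wrapper-data.txt in your install tree (commonly under share/openmpi in the install prefix, though that location is an assumption here) before changing it.

```shell
# Hypothetical helper: add_static_lib FILE FLAG
# Appends FLAG to every libs_static= line in FILE unless it is already listed.
add_static_lib() {
    file=$1
    flag=$2
    tmp=$(mktemp) || return 1
    awk -v flag="$flag" '
        /^libs_static=/ {
            # Split the existing value on whitespace and look for the flag.
            n = split(substr($0, length("libs_static=") + 1), have, " ")
            found = 0
            for (i = 1; i <= n; i++)
                if (have[i] == flag) found = 1
            if (!found) $0 = $0 " " flag
        }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

Calling it as add_static_lib path/to/mpicc-wrapper-data.txt -lnuma leaves all other lines untouched, and running it a second time does not add the flag again.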
> >
> > -Nathan
> > HPC-3, LANL
> >
> > On Fri, Jan 25, 2013 at 02:13:41PM -0800, Paul Hargrove wrote:
> > > Still having problems on the Cray XC30, but now they are when linking an
> > > MPI app:
> > >
> > > $ ./INSTALL/bin/mpicc -o ring_c examples/ring_c.c
> > > > fs_lustre_file_open.c:(.text+0x130): undefined reference to
> > > > `llapi_file_create'
> > > > fs_lustre_file_open.c:(.text+0x17e): undefined reference to
> > > > `llapi_file_get_stripe'
> > > > /usr/bin/ld: link errors found, deleting executable `ring_c'
> > > > collect2: error: ld returned 1 exit status
> > >
> > >
> > > It appears that lustre support was found at configure time using a test
> > > that used "-llustre -llustreapi":
> > >
> > > > configure:157666: checking if possible to link LUSTRE
> > > > configure:157680: cc -std=gnu99 -o conftest -O3 -DNDEBUG
> > > > -finline-functions -fno-strict-aliasing -fexceptions   -D_REENTRANT
> > > > -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/opal/mca/hwloc/hwloc151/hwloc/include
> > > > -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/BUILD-edison/opal/mca/hwloc/hwloc151/hwloc/include
> > > > -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/opal/mca/event/libevent2019/libevent
> > > > -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/opal/mca/event/libevent2019/libevent/include
> > > > -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/BUILD-edison/opal/mca/event/libevent2019/libevent/include
> > > > -I/opt/cray/pmi/default/include -I/opt/cray/pmi/default/include
> > > > -I/opt/cray/pmi/default/include -I/opt/cray/pmi/default/include
> > > > -I/usr//include/lustre/   -fexceptions  -L/usr//lib64 conftest.c  -lnsl
> > > >  -lutil  -lnsl  -lutil   -llustre -llustreapi
> > >
> > >
> > > However, those two libs are NOT included when linking an MPI application:
> > >
> > > > $ ./INSTALL/bin/mpicc -o ring_c examples/ring_c.c -v 2>&1 | grep collect
> > > >  /opt/gcc/4.7.2/snos/libexec/gcc/x86_64-suse-linux/4.7.2/collect2
> > > > --sysroot= -m elf_x86_64 -static -o ring_c -u pthread_mutex_trylock -u
> > > > pthread_mutex_destroy -u pthread_create /usr/lib/../lib64/crt1.o
> > > > /usr/lib/../lib64/crti.o
> > > > /opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/crtbeginT.o
> > > > -L/opt/cray/pmi/default/lib64 -L/opt/cray/alps/default/lib64
> > > > -L/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/INSTALL/lib
> > > > -L/opt/cray/udreg/2.3.2-1.0500.5931.3.1.ari/lib64
> > > > -L/opt/cray/ugni/4.0-1.0500.5836.7.58.ari/lib64
> > > > -L/opt/cray/pmi/4.0.0-1.0000.9282.69.4.ari/lib64
> > > > -L/opt/cray/dmapp/4.0.1-1.0500.5932.6.5.ari/lib64
> > > > -L/opt/cray/xpmem/0.1-2.0500.36799.3.6.ari/lib64
> > > > -L/opt/cray/alps/5.0.1-2.0500.7663.1.1.ari/lib64
> > > > -L/opt/cray/rca/1.0.0-2.0500.37705.3.12.ari/lib64
> > > > -L/opt/cray/mpt/5.6.0/gni/mpich2-gnu/47/lib
> > > > -L/opt/cray/mpt/5.6.0/gni/sma/lib64
> > > > -L/opt/cray/libsci/12.0.00/gnu/47/sandybridge/lib
> > > > -L/opt/cray/alps/5.0.1-2.0500.7663.1.1.ari/lib64
> > > > -L/opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2
> > > > -L/opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/../../../../lib64
> > > > -L/lib/../lib64 -L/usr/lib/../lib64
> > > > -L/opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/../../..
> > > > /scratch1/scratchdirs/hargrove/cceRJNtp.o -lmpi -lpmi -lalpslli -lalpsutil
> > > > -lnsl -lutil -lnsl -lutil -lopen-rte -lpmi -lalpslli -lalpsutil -lnsl
> > > > -lutil -lnsl -lutil -lopen-pal -lpmi -lalpslli -lalpsutil -lnsl -lutil
> > > > -lnsl -lutil -lrca -L/opt/cray/atp/1.6.0/lib/ --undefined=_ATP_Data_Globals
> > > > --undefined=__atpHandlerInstall -lAtpSigHCommData -lAtpSigHandler
> > > > --start-group -lgfortran -lscicpp_gnu -lsci_gnu_mp -lstdc++ -lgfortran
> > > > -lmpich_gnu_47 -lmpl -lrt -lsma -lxpmem -ldmapp -lugni -lpmi -lalpslli
> > > > -lalpsutil -lalps -ludreg -lpthread -lm --end-group -lgomp -lpthread
> > > > --start-group -lgcc -lgcc_eh -lc --end-group
> > > > /opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/crtend.o
> > > > /usr/lib/../lib64/crtn.o
> > > > collect2: error: ld returned 1 exit status
> > >
> > >
> > > Of course the obvious work-around to try is adding "-llustre -llustreapi"
> > > to my command line.  However, that doesn't work because mpicc places my
> > > "-l" args BEFORE its own "-lmpi".  Since "-static" is also among the
> > > arguments, no symbols are picked up from the lustre libs when they appear
> > > on the command line before "-lmpi", from which lustre symbols are
> > > referenced.
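The ordering problem described above can be checked mechanically from a collect2 line. What follows is only a sketch: flag_after is a hypothetical helper, and it assumes the link line is whitespace-separated with no quoted arguments containing spaces. It succeeds only when the given -l flag occurs after the last occurrence of the reference flag, which is where a static link needs the library to be in order to satisfy symbols referenced from the reference library.

```shell
# Hypothetical helper: flag_after LINE FLAG REF
# Exit status 0 if FLAG appears after the last REF in LINE, 1 otherwise.
flag_after() {
    printf '%s\n' "$1" | tr -s '[:space:]' '\n' | awk -v flag="$2" -v ref="$3" '
        $0 == ref  { last_ref = NR }    # position of the last reference flag
        $0 == flag { last_flag = NR }   # position of the last candidate flag
        END { exit !(last_ref && last_flag > last_ref) }
    '
}
```

For the case in this message one would pass the output of the "mpicc ... -v 2>&1 | grep collect" command as LINE, with -llustreapi as FLAG and -lmpi as REF.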
> > >
> > > Best guess(es):
> > > EITHER config/ompi_check_lustre.m4 is failing to add "-llustre -llustreapi"
> > > to some variable
> > > OR the variable set by config/ompi_check_lustre.m4 isn't making its way
> > > into the application link command for some reason
> > >
> > > Note that this is a --disable-shared/--enable-static build which may differ
> > > from other systems where LUSTRE support gets used/tested.
> > >
> > > -Paul
> > >
> > >
> > > On Fri, Jan 25, 2013 at 12:01 PM, Ralph Castain <rhc@open-mpi.org> wrote:
> > >
> > > > Thanks Paul
> > > >
> > > > I'm currently tracking down a problem on the Cray XE6 - it appears that
> > > > recent OS release changed the way alps stores allocation info :-(
> > > >
> > > > Will hopefully have it running soon.
> > > >
> > > > On Jan 25, 2013, at 10:50 AM, Paul Hargrove <phhargrove@lbl.gov> wrote:
> > > >
> > > > I was able to compile with openmpi-1.9a1r27905.tar.bz
> > > >
> > > > I'll report again when I've had an opportunity to run something like
> > > > ring_c.
> > > >
> > > > Thanks,
> > > > -Paul
> > > >
> > > >
> > > > On Tue, Jan 22, 2013 at 6:08 PM, Ralph Castain <rhc@open-mpi.org> wrote:
> > > >
> > > >> I went ahead and removed the duplicate code, so this should work now. The
> > > >> problem is that we re-factored the ompi_info/orte-info code, but didn't
> > > >> complete the job - specifically, the orte-info tool didn't get updated.
> > > >> It's about to get revamped yet again when the ompi-rte branch gets
> > > >> committed to the trunk, so I'd rather not do any more with it now.
> > > >>
> > > >> Hopefully, this will be the minimum required.
> > > >>
> > > >>
> > > >> On Jan 22, 2013, at 4:20 PM, Paul Hargrove <phhargrove@lbl.gov> wrote:
> > > >>
> > > >> I am using the openmpi-1.9a1r27886 tarball and I still see an error for
> > > >> one of the two duplicate symbols:
> > > >>
> > > >>   CCLD     orte-info
> > > >> ../../../orte/.libs/libopen-rte.a(orte_info_support.o): In function
> > > >> `orte_info_show_orte_version':
> > > >> ../../orte/runtime/orte_info_support.c:(.text+0xe10): multiple definition
> > > >> of `orte_info_show_orte_version'
> > > >> version.o:../../../../orte/tools/orte-info/version.c:(.text+0x2370):
> > > >> first defined here
> > > >>
> > > >> -Paul
> > > >>
> > > >>
> > > >> On Fri, Jan 18, 2013 at 3:52 AM, George Bosilca <bosilca@icl.utk.edu> wrote:
> > > >>
> > > >>> Luckily for us all the definitions contain the same constant (orte).
> > > >>> r27864 should fix this.
> > > >>>
> > > >>>   George.
> > > >>>
> > > >>>
> > > >>> On Jan 18, 2013, at 06:21 , Paul Hargrove <PHHargrove@lbl.gov> wrote:
> > > >>>
> > > >>> My employer has a nice new Cray XC30 (aka Cascade), and I thought I'd
> > > >>> give Open MPI a quick test.
> > > >>>
> > > >>> Given that it is INTENDED to be API-compatible with the XE series, I
> > > >>> began configuring with
> > > >>>     CC=cc CXX=CC FC=ftn --with-platform=lanl/cray_xe6/optimized-nopanasas
> > > >>> However, since this is Intel h/w, I commented-out the following 2 lines
> > > >>> in the platform file:
> > > >>>     with_wrapper_cflags="-march=amdfam10"
> > > >>>     CFLAGS=-march=amdfam10
> > > >>>
> > > >>> I am using PrgEnv-gnu/5.0.15, though PrgEnv-intel is the default on our
> > > >>> system
> > > >>>
> > > >>> As far as I know, use of 1.6.x is out - no ugni at all, right?
> > > >>> So, I didn't even try.
> > > >>>
> > > >>> I gave openmpi-1.7rc6 a try, but the ALPS headers and libs have moved
> > > >>> (as mentioned in ompi-trunk/config/orte_check_alps.m4).
> > > >>> Perhaps one should CMR the updated-for-CLE-5 configure logic to the 1.7
> > > >>> branch?
> > > >>>
> > > >>> Next, I tried a trunk nightly tarball: openmpi-1.9a1r27862.tar.bz2
> > > >>> As I mentioned above, the trunk has the right logic for locating ALPS.
> > > >>> However, it looks like there is some untested code, protected by "#if
> > > >>> WANT_CRAY_PMI2_EXT", that needs work:
> > > >>>
> > > >>> make[2]: Entering directory
> > > >>> `/global/scratch/sd/hargrove/OMPI/openmpi-1.9a1r27862/BUILD/orte/mca/db/pmi'
> > > >>>   CC       db_pmi_component.lo
> > > >>>   CC       db_pmi.lo
> > > >>> ../../../../../orte/mca/db/pmi/db_pmi.c: In function 'store':
> > > >>> ../../../../../orte/mca/db/pmi/db_pmi.c:202: error: 'ptr' undeclared
> > > >>> (first use in this function)
> > > >>> ../../../../../orte/mca/db/pmi/db_pmi.c:202: error: (Each undeclared
> > > >>> identifier is reported only once
> > > >>> ../../../../../orte/mca/db/pmi/db_pmi.c:202: error: for each function it
> > > >>> appears in.)
> > > >>> make[2]: *** [db_pmi.lo] Error 1
> > > >>> make[2]: Leaving directory
> > > >>> `/global/scratch/sd/hargrove/OMPI/openmpi-1.9a1r27862/BUILD/orte/mca/db/pmi'
> > > >>> make[1]: *** [all-recursive] Error 1
> > > >>> make[1]: Leaving directory
> > > >>> `/global/scratch/sd/hargrove/OMPI/openmpi-1.9a1r27862/BUILD/orte'
> > > >>> make: *** [all-recursive] Error 1
> > > >>>
> > > >>> I added the missing "char *ptr" declaration a few lines before its
> > > >>> first use, and resumed the build.
> > > >>> This time the build terminated at
> > > >>>
> > > >>> make[2]: Entering directory
> > > >>> `/global/scratch/sd/hargrove/OMPI/openmpi-1.9a1r27862/BUILD/opal/tools/wrappers'
> > > >>>   CC       opal_wrapper.o
> > > >>>   CCLD     opal_wrapper
> > > >>> /usr/bin/ld: attempted static link of dynamic object
> > > >>> `../../../opal/.libs/libopen-pal.so'
> > > >>> collect2: error: ld returned 1 exit status
> > > >>>
> > > >>> So I went back to the platform file and changed
> > > >>>    enable_shared=yes
> > > >>> to
> > > >>>    enable_shared=no
> > > >>> No big deal there - I had to make the same change for our XE6.
> > > >>>
> > > >>> And so I started back at configure (after a "make distclean", to be
> > > >>> safe), and here is the next error:
> > > >>>
> > > >>> Making all in tools/orte-info
> > > >>> make[2]: Entering directory
> > > >>> `/global/scratch/sd/hargrove/OMPI/openmpi-1.9a1r27862/BUILD/orte/tools/orte-info'
> > > >>>   CCLD     orte-info
> > > >>> ../../../orte/.libs/libopen-rte.a(orte_info_support.o): In function
> > > >>> `orte_info_show_orte_version':
> > > >>> orte_info_support.c:(.text+0xd70): multiple definition of
> > > >>> `orte_info_show_orte_version'
> > > >>> version.o:version.c:(.text+0x4b0): first defined here
> > > >>> ../../../orte/.libs/libopen-rte.a(orte_info_support.o):(.data+0x0):
> > > >>> multiple definition of `orte_info_type_orte'
> > > >>> orte-info.o:(.data+0x10): first defined here
> > > >>> /usr/bin/ld: link errors found, deleting executable `orte-info'
> > > >>> collect2: error: ld returned 1 exit status
> > > >>> make[2]: *** [orte-info] Error 1
> > > >>>
> > > >>> I am not sure how to fix this, but I would guess this is probably a
> > > >>> simple fix for somebody who knows OMPI's build infrastructure better than I.
> > > >>>
> > > >>> -Paul
> > > >>>
> > > >>> --
> > > >>> Paul H. Hargrove                          PHHargrove@lbl.gov
> > > >>> Future Technologies Group
> > > >>> Computer and Data Sciences Department     Tel: +1-510-495-2352
> > > >>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
> > > >>>  _______________________________________________
> > > >>> devel mailing list
> > > >>> devel@open-mpi.org
> > > >>> http://www.open-mpi.org/mailman/listinfo.cgi/devel