Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury
From: Nathan Hjelm (hjelmn_at_[hidden])
Date: 2013-01-29 10:48:09


Oops, that was my mistake. I wrote a fix for the CLE5 and --with-alps=<dir> code but never pushed it. r27962 should fix the issue.

-Nathan
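
For readers following the quoted thread below, here is a minimal plain-shell sketch of the failure mode Paul describes: a run-once guard that also swallows the per-caller `$1_CPPFLAGS` assignment. It is not the actual m4 macro and not the change made in r27962. The function names `check_alps_buggy` and `check_alps_fixed` are hypothetical, and the default directory is only the value quoted in the thread.

```shell
#!/bin/sh
# Hypothetical stand-ins for ORTE_CHECK_ALPS; only orte_check_alps_happy and
# orte_check_alps_dir are names taken from the thread.

# Buggy shape: the per-caller flags are set inside the run-once guard,
# so only the first caller ever gets its <prefix>_CPPFLAGS.
check_alps_buggy() {
    prefix=$1
    if test -z "$orte_check_alps_happy"; then
        orte_check_alps_dir="/opt/cray/alps/default"
        orte_check_alps_happy="yes"
        eval "${prefix}_CPPFLAGS=\"-I\$orte_check_alps_dir/include\""
    fi
}

# Fixed shape: only the expensive checks are guarded; the per-caller
# flags are set on every invocation once the check has succeeded.
check_alps_fixed() {
    prefix=$1
    if test -z "$orte_check_alps_happy"; then
        orte_check_alps_dir="/opt/cray/alps/default"
        orte_check_alps_happy="yes"
    fi
    if test "$orte_check_alps_happy" = "yes"; then
        eval "${prefix}_CPPFLAGS=\"-I\$orte_check_alps_dir/include\""
    fi
}

# Two components call the macro in turn, as ess:alps and ras:alps do.
orte_check_alps_happy=""
check_alps_buggy ess_alps
check_alps_buggy ras_alps
buggy_ess=$ess_alps_CPPFLAGS
buggy_ras=$ras_alps_CPPFLAGS

# Reset state and repeat with the fixed shape.
orte_check_alps_happy=""
unset ess_alps_CPPFLAGS ras_alps_CPPFLAGS
check_alps_fixed ess_alps
check_alps_fixed ras_alps
fixed_ess=$ess_alps_CPPFLAGS
fixed_ras=$ras_alps_CPPFLAGS

echo "buggy: ess='$buggy_ess' ras='$buggy_ras'"
echo "fixed: ess='$fixed_ess' ras='$fixed_ras'"
```

With the buggy shape the second caller ends up with an empty CPPFLAGS value, which matches the missing -I/opt/cray/alps/default/include in the ras:alps header check quoted below. With the fixed shape both callers get the include path.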

On Mon, Jan 28, 2013 at 09:05:32PM -0800, Ralph Castain wrote:
> Thanks Paul - appreciate the help! I chatted with Nathan this evening and now have a much better understanding of the problem driving the code. We are going to review it tomorrow. Hope to have a fix shortly.
>
>
> On Jan 28, 2013, at 9:01 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>
> > It now looks like the very first line of ORTE_CHECK_ALPS is what prevents $1_CPPFLAGS from being set for any caller other than the first:
> > if test -z "$orte_check_alps_happy"; then
> >
> > So, my previous patch (tested by editing configure directly) didn't do the job.
> >
> > Again, this probably slipped past Nathan because under CLE4 the alps headers are under /usr/include and therefore the missing CPPFLAGS were not actually required.
> >
> > -Paul
> >
> >
> >
> > On Mon, Jan 28, 2013 at 7:05 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
> > Ralph and Nathan,
> >
> > As I said, the results I see fail to match the actual ALPS header locations on both CLE4 and CLE5 systems at NERSC.
> > However, the CLE4 system "just works" because the actual location (/usr/include) gets searched no matter what value configure picks for $orte_check_alps_dir. I suspect that this is why you didn't see any errors on LANL's system.
> >
> > Regardless of the defaults, there is still an additional issue with orte_check_alps.m4 that occurs when I give an explicit with-alps=/opt/cray/alps/default in the platform file, which the following bit of config.log confirms:
> > configure:99227: checking --with-alps value
> > configure:99247: result: sanity check ok (/opt/cray/alps/default)
> > configure:99329: checking for alps libraries in "/opt/cray/alps/default/lib64"
> > configure:99334: result: found
> >
> >
> > However, when trying to configure the ras:alps component, the value of ras_alps_CPPFLAGS does not contain "-I/opt/cray/alps/default/include" as I would have expected from reading the relevant .m4 files and the generated configure script:
> > configure:113697: checking for MCA component ras:alps compile mode
> > configure:113703: result: static
> > configure:113871: checking alps/apInfo.h usability
> > configure:113871: gcc -std=gnu99 -c -O3 -DNDEBUG -march=amdfam10 -finline-functions -fno-strict-aliasing -fexceptions -pthread -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/opal/mca/hwloc/hwloc151/hwloc/include -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/BUILD-edison-gcc/opal/mca/hwloc/hwloc151/hwloc/include -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/opal/mca/event/libevent2019/libevent -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/opal/mca/event/libevent2019/libevent/include -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/BUILD-edison-gcc/opal/mca/event/libevent2019/libevent/include -I/opt/cray/pmi/default/include -I/opt/cray/pmi/default/include -I/opt/cray/pmi/default/include -I/opt/cray/pmi/default/include conftest.c >&5
> > conftest.c:640:25: fatal error: alps/apInfo.h: No such file or directory
> > compilation terminated.
> > configure:113871: $? = 1
> >
> > While only 95% certain, I think that this logic in config/orte_check_alps.m4 is to blame:
> > if test "$with_alps" = "no" -o -z "$with_alps" ; then
> > orte_check_alps_happy="no"
> > else
> > # Only need to do these tests once (this macro is invoked
> > # from multiple different components' configure.m4 scripts
> >
> > Specifically, the setting of "$1_CPPFLAGS" appears to be ERRONEOUSLY placed within the else-clause of the logic above. So, when orte/mca/ess/alps/configure.m4 is run BEFORE orte/mca/ras/alps/configure.m4, the variable "with_alps" gets set and the "$1_CPPFLAGS=..." is then unreachable when the ORTE_CHECK_ALPS macro is run later from config/orte_check_alps.m4.
> >
> > Though it leaves the indentation sloppy, I believe the following might fix the problem, but I lack the autotools versions to test this myself:
> >
> > --- config/orte_check_alps.m4 (revision 27954)
> > +++ config/orte_check_alps.m4 (working copy)
> > @@ -80,6 +80,7 @@
> > [orte_check_alps_dir="/opt/cray/alps/default"],
> > [orte_check_alps_dir="$with_alps"])
> > fi
> > + fi
> >
> > $1_CPPFLAGS="-I$orte_check_alps_dir/include"
> > $1_LDFLAGS="-L$orte_check_alps_libdir"
> > @@ -106,7 +107,6 @@
> > AC_MSG_ERROR([Cannot continue])])
> > fi
> > fi
> > - fi
> > fi
> >
> > AS_IF([test "$orte_check_alps_happy" = "yes"],
> >
> >
> > -Paul
> >
> >
> >
> >
> > On Mon, Jan 28, 2013 at 6:30 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> > Like I said, I didn't write this code - all I can say for certain is that it gets the right answer on the LANL Crays. I'll talk to Nathan (the author) about it tomorrow.
> >
> > On Jan 28, 2013, at 6:23 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
> >
> >> Ralph writes
> >> ?? It looks correct to me - if with_alps is "yes", then no path was given and we have to look at a default location. If it isn't yes, then a path was given and we use it.
> >> Am I missing something?
> >>
> >> Maybe *I* am the one missing something, but the way I read it the following defaults are applied
> >>
> >> CLE4:
> >> orte_check_alps_libdir="/usr/lib/alps"
> >> orte_check_alps_dir="/opt/cray/alps/default"
> >> CLE5:
> >> orte_check_alps_libdir="/opt/cray/alps/default/lib64"
> >> orte_check_alps_dir="/usr"
> >>
> >> Unless I am mistaken, the defaults for orte_check_alps_dir should be exchanged to yield:
> >>
> >> CLE4:
> >> orte_check_alps_libdir="/usr/lib/alps"
> >> orte_check_alps_dir="/usr"
> >> CLE5:
> >> orte_check_alps_libdir="/opt/cray/alps/default/lib64"
> >> orte_check_alps_dir="/opt/cray/alps/default"
> >>
> >> -Paul
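
The exchanged defaults proposed above can be sketched in plain shell as follows. This only illustrates the selection logic under Paul's reading of the NERSC header and library locations; it is not the macro as committed. The helper name `alps_defaults` and the example path `/some/explicit/path` are hypothetical.

```shell
#!/bin/sh
# alps_defaults <using_cle5_install> <with_alps>
# Hypothetical helper: picks the library and install directories the way
# the corrected defaults in the message above would.
alps_defaults() {
    using_cle5_install=$1
    with_alps=$2

    if test "$using_cle5_install" = "yes"; then
        orte_check_alps_libdir="/opt/cray/alps/default/lib64"
        default_dir="/opt/cray/alps/default"
    else
        orte_check_alps_libdir="/usr/lib/alps"
        default_dir="/usr"
    fi

    # "--with-alps" with no argument yields with_alps=yes: use the default.
    # Any other value is treated as an explicit path and used as given.
    if test "$with_alps" = "yes"; then
        orte_check_alps_dir=$default_dir
    else
        orte_check_alps_dir=$with_alps
    fi
}

alps_defaults yes yes
cle5_dir=$orte_check_alps_dir
cle5_libdir=$orte_check_alps_libdir

alps_defaults no yes
cle4_dir=$orte_check_alps_dir
cle4_libdir=$orte_check_alps_libdir

alps_defaults yes /some/explicit/path
explicit_dir=$orte_check_alps_dir

echo "CLE5: dir=$cle5_dir libdir=$cle5_libdir"
echo "CLE4: dir=$cle4_dir libdir=$cle4_libdir"
echo "explicit: dir=$explicit_dir"
```

Keeping both defaults in the same branch makes it harder for the library directory and the include directory to drift apart between CLE4 and CLE5.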
> >>
> >>
> >> On Mon, Jan 28, 2013 at 6:14 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> >>
> >> On Jan 28, 2013, at 6:10 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
> >>
> >>> The following two fragments from config/orte_check_alps.m4 appear to be contradictory.
> >>> By that I mean the first appears to say that "--with-alps" with no argument means /opt/cray/alps/default/... for CLE5 and /usr/... for CLE4, while the second fragment appears to do the opposite:
> >>>
> >>> if test "$using_cle5_install" = "yes"; then
> >>> orte_check_alps_libdir="/opt/cray/alps/default/lib64"
> >>> else
> >>> orte_check_alps_libdir="/usr/lib/alps"
> >>> fi
> >>>
> >>>
> >>> if test "$using_cle5_install" = "yes" ; then
> >>> AS_IF([test "$with_alps" = "yes"],
> >>> [orte_check_alps_dir="/usr"],
> >>> [orte_check_alps_dir="$with_alps"])
> >>> else
> >>> AS_IF([test "$with_alps" = "yes"],
> >>> [orte_check_alps_dir="/opt/cray/alps/default"],
> >>> [orte_check_alps_dir="$with_alps"])
> >>> fi
> >>>
> >>> At least based on header and lib locations on NERSC's XC30 (CLE 5.0.15) and XE6 (CLE 4.1.40), the first fragment is correct while the second fragment is "backwards" (the two calls to AS_IF should be exchanged, or the initial "test" should be inverted).
> >>
> >> ?? It looks correct to me - if with_alps is "yes", then no path was given and we have to look at a default location. If it isn't yes, then a path was given and we use it.
> >>
> >> Am I missing something?
> >>
> >>>
> >>> Note that this same logic is present in both trunk and v1.7 (in SVN - I am not looking at tarballs this time).
> >>>
> >>> -Paul
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Paul H. Hargrove PHHargrove_at_[hidden]
> >>> Future Technologies Group
> >>> Computer and Data Sciences Department Tel: +1-510-495-2352
> >>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> >>> _______________________________________________
> >>> devel mailing list
> >>> devel_at_[hidden]
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/devel