
MTT Devel Mailing List Archives


From: Ethan Mallove (ethan.mallove_at_[hidden])
Date: 2007-08-28 09:17:25


On Mon, Aug/27/2007 07:36:35PM, Jeff Squyres wrote:
> On Aug 27, 2007, at 5:32 PM, Ethan Mallove wrote:
>
> >> Well that's fun -- why are there no mpi_install values in the .txt
> >> file?
> >
> > Because Functions/MPI/OMPI::get_version() does not know
> > what my $bindir is. Should an &installdir() funclet be
> > created to get around this?
>
> There's a chicken-n-egg problem here that I didn't solve and
> therefore ended up hard-coding for the pre-installed MPIs (HP,
> Intel, ...).
>
> The MPI Get phase is the one responsible for getting the MPI
> version. But AlreadyInstalled applies to a bunch of different MPIs
> -- each one has a different way of obtaining the version number
> (e.g., in OMPI, we call ompi_info). So it makes sense to have a
> funclet to get the version number for each different MPI (which I did).
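A minimal sketch of what such a per-MPI funclet might do for OMPI. This is an illustration, not the actual MTT client code: the function name is hypothetical, the `ompi_info --parsable` field layout (`ompi:version:full:<version>`) is assumed, and the output is simulated rather than obtained by running `ompi_info` from a real $bindir.

```python
# Hypothetical sketch of an OMPI version funclet. A real funclet would run
# <bindir>/ompi_info --parsable via a subprocess; here the parsable output
# is simulated so the parsing logic is self-contained.
def ompi_get_version(parsable_output):
    for line in parsable_output.splitlines():
        fields = line.split(":")
        # Assumed field layout: ompi:version:full:<version>
        if fields[:3] == ["ompi", "version", "full"]:
            return fields[3]
    return None  # version field not found

sample = "ompi:version:full:7.0\nompi:version:svn:r16276"
print(ompi_get_version(sample))  # → 7.0
```

Each MPI would get its own such funclet, since each install reports its version differently.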
>
> But the problem is that you need to know the $bindir in order to call
> the MPI's utility to get the version number. You *could* do
> something like:
>
> alreadyinstalled_version = &MPI::OMPI::get_version(&mpi_install_bindir())
>
> But there are two problems with this:
>
> 1. Then you have to ensure that &mpi_install_bindir() is valid from
> anywhere, meaning that we need to set some global variable that
> corresponds to the MPI install in use throughout the code base (e.g.,
> even if you call it from within test get, test build, test
> run, ...). This is a PITA, but it's solvable; it's just annoying/menial
> work to go track down everywhere in the code that needs to
> have this variable set.
>
> 2. We're trying to use an attribute from the MPI install phase (the
> bindir) in the MPI get phase (to get the version). This is a huge
> break in abstraction. All these funclets take a single param (the
> $bindir), but we won't know that until the MPI Install phase. How
> can we pass it during the MPI Get phase? I hadn't figured out how to
> do that without mega-ick/abstraction breaks, so I gave up and
> hard-coded for HP/Intel MPI, etc.
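Point 1 above amounts to keeping one piece of global state: the MPI Install phase records the bindir in use, and funclets called from any later phase read it back. A minimal sketch, with hypothetical names that are not the actual MTT client code:

```python
# Hypothetical global registry for the MPI install currently in scope.
# The MPI Install phase would call set_current_bindir(); funclets invoked
# from test get/build/run would call current_bindir(). The "annoying/menial"
# part is making sure every code path that runs funclets sets this first.
_current_bindir = None

def set_current_bindir(bindir):
    global _current_bindir
    _current_bindir = bindir

def current_bindir():
    if _current_bindir is None:
        raise RuntimeError("no MPI install in scope for this phase")
    return _current_bindir

set_current_bindir("/opt/SUNWhpc/HPC7.0/bin")
print(current_bindir())  # → /opt/SUNWhpc/HPC7.0/bin
```

Even with this in place, point 2 remains: the value only exists after MPI Install, so an MPI Get-phase funclet still cannot use it.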
>
> -----
>
> So were you using AlreadyInstalled for OMPI to report back to the
> DB?
>

I was using AlreadyInstalled for ClusterTools. I guess I can
live with this:

alreadyinstalled_dir = /opt/SUNWhpc/HPC7.0
alreadyinstalled_version = &MPI::OMPI::get_version(@alreadyinstalled_dir@)

-Ethan

>
> Because I guess I thought we all understood that this was
> currently broken (per prior discussions on the phone).
>
> > -Ethan
> >
> >
> >> I assume you had successful MPI installs before this?
> >>
> >>
> >> On Aug 27, 2007, at 4:03 PM, Ethan Mallove wrote:
> >>
> >>> I'm running into the below error running with the jms-new-parser
> >>> branch
> >>> (see attached MTTDatabase error file).
> >>>
> >>> *** WARNING: MTTDatabase server notice: mpi_install_section_name is
> >>> not in mtt database.
> >>> MTTDatabase server notice: number_of_results is not in mtt
> >>> database.
> >>> MTTDatabase server notice: phase is not in mtt database.
> >>> MTTDatabase server notice: fields is not in mtt database.
> >>> MTTDatabase server notice: mpi_get_section_name is not in mtt
> >>> database.
> >>>
> >>> MTT submission for test build
> >>> MTTDatabase server error:
> >>> SQL QUERY: SELECT mpi_install_id
> >>> FROM mpi_install NATURAL JOIN
> >>> mpi_get NATURAL JOIN
> >>> compiler NATURAL JOIN
> >>> compute_cluster NATURAL JOIN
> >>> submit
> >>> WHERE
> >>> mpi_version = DEFAULT AND
> >>> mpi_name = 'clustertools-7-iso-sdn-0907' AND
> >>> compiler_version = '5.9 2007/05/03;' AND
> >>> compiler_name = 'sun' AND
> >>> hostname = 'burl-ct-v440-2' AND
> >>> mtt_client_version = '2.1devel' AND
> >>> local_username = 'emallove' AND
> >>> platform_name = 'burl-ct-v440-2'
> >>> ORDER BY mpi_install_id DESC limit 1
> >>> SQL ERROR: ERROR: syntax error at or near "DEFAULT"
> >>> LINE 8: mpi_version = DEFAULT AND
> >>> ^
> >>> SQL ERROR:
> >>> MTTDatabase server notice: Invalid mpi_install_id (9790) given.
> >>> Guessing that it should be -1
> >>> MTTDatabase server error: ERROR: Unable to find a mpi_install
> >>> to associate with this test_build.
> >>>
> >>> MTTDatabase abort: (Tried to send HTTP error) 400
> >>> MTTDatabase abort:
> >>> No mpi_install associated with this test_build
> >>> MTTDatabase got response: MTTDatabase server notice:
> >>> mpi_install_section_name is not in mtt database.
> >>> MTTDatabase server notice: number_of_results is not in mtt
> >>> database.
> >>> MTTDatabase server notice: phase is not in mtt database.
> >>> MTTDatabase server notice: fields is not in mtt database.
> >>> MTTDatabase server notice: mpi_get_section_name is not in mtt
> >>> database.
> >>>
> >>> MTT submission for test build
> >>> MTTDatabase server error:
> >>> SQL QUERY: SELECT mpi_install_id
> >>> FROM mpi_install NATURAL JOIN
> >>> mpi_get NATURAL JOIN
> >>> compiler NATURAL JOIN
> >>> compute_cluster NATURAL JOIN
> >>> submit
> >>> WHERE
> >>> mpi_version = DEFAULT AND
> >>> mpi_name = 'clustertools-7-iso-sdn-0907' AND
> >>> compiler_version = '5.9 2007/05/03;' AND
> >>> compiler_name = 'sun' AND
> >>> hostname = 'burl-ct-v440-2' AND
> >>> mtt_client_version = '2.1devel' AND
> >>> local_username = 'emallove' AND
> >>> platform_name = 'burl-ct-v440-2'
> >>> ORDER BY mpi_install_id DESC limit 1
> >>> SQL ERROR: ERROR: syntax error at or near "DEFAULT"
> >>> LINE 8: mpi_version = DEFAULT AND
> >>> ^
> >>> SQL ERROR:
> >>> MTTDatabase server notice: Invalid mpi_install_id (9790) given.
> >>> Guessing that it should be -1
> >>> MTTDatabase server error: ERROR: Unable to find a mpi_install to
> >>> associate with this test_build.
> >>>
> >>> MTTDatabase abort: (Tried to send HTTP error) 400
> >>> MTTDatabase abort:
> >>> No mpi_install associated with this test_build
> >>> *** WARNING: MTTDatabase did not get a serial; phases will be
> >>> isolated from each other in the reports
> >>> MTTDatabase submit complete
> >>> Writing to MTTDatabase debug file:
> >>> /home/em162155/mtt-utils/logs/debug/mttdatabase.burl-ct-v440-2.20070827.153417.1.1188243271-error.txt
> >>> Debug MTTDatabase file write complete
> >>>>> Reported to MTTDatabase: 1 successful submit, 0 failed submits (total of 1 result)
> >>>
> >>> ############################################################################
> >>> # *** WARNING:
> >>> # 2 MTTDatabase server errors
> >>> # The data that failed to submit is in
> >>> # /home/em162155/mtt-utils/logs/debug/mttdatabase.burl-ct-v440-2.20070827.153417.*.txt.
> >>> # See the above output for more info.
> >>> ############################################################################
> >>> <mttdatabase-error.txt>
> >>> _______________________________________________
> >>> mtt-devel mailing list
> >>> mtt-devel_at_[hidden]
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel
> >>
> >>
> >> --
> >> Jeff Squyres
> >> Cisco Systems
> >>
>
>
>