Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] autodetect broken
From: Iain Bason (Iain.Bason_at_[hidden])
Date: 2009-07-22 13:46:24


On Jul 22, 2009, at 10:55 AM, Brian W. Barrett wrote:

> The current autodetect implementation seems like the wrong approach
> to me. I'm rather unhappy the base functionality was hacked up like
> it was without any advanced notice or questions about original
> design intent. We seem to have a set of base functions which are now
> more unreadable than before, overly complex, and which leak memory.

First, I should apologize for my procedural misstep. I had asked my
group here at Sun whether I should do an RFC or something, and I guess
I didn't make my question clear enough. I was under the impression
that I should check something in and let people comment on it.

That being said, I would argue that there are good reasons for adding
some complexity to the base component:

1. The pre-existing implementation of expansion is broken (although
the cases for which it is broken are somewhat obscure).

2. The autodetect component cannot set more than one directory without
some knowledge of the relationships amongst the various directories.
Giving it that knowledge would violate the independence of the
components.

You can see #1 by doing "OPAL_PREFIX='${datarootdir}/..'
OPAL_DATAROOTDIR='/opt/SUNWhpc/HPC8.2/share' mpirun hostname" (if you
have installed in /opt/SUNWhpc/HPC8.2). Yes, I know, "Why would
anyone do that?" Nonetheless, I find that a poor excuse for having a
bug in the code.

To expand on #2 a little, consider the case where OpenMPI is
configured with "--prefix=/path/one --libdir=/path/two". We can tell
that libopen-pal is in /path/two, but it is not correct to assume that
the prefix is therefore /path. (Unfortunately, there is code in
OpenMPI that does make that sort of assumption -- see orterun.c.) We
need information from the config component to avoid making incorrect
assumptions.

There are, of course, alternate ways of getting to the same point.
But it is not feasible simply to leave the design of the base
component unchanged. (More on that below.)

As for readability, I am always open to constructive suggestions as to
how to make code more readable. I didn't fix the memory leak because
I hadn't yet found a way to do that without decreasing readability.

> The intent of the installdirs framework was to allow this type of
> behavior, but without rehacking all this infer crap into base. The
> autodetect component should just set $prefix in the set of functions
> it returns (and possibly libdir and bindir if you really want, which
> might actually make sense if you guess wrong), and let the expansion
> code take over from there. The general thought on how this would
> work went something like:
>
> - Run after config
> - If determine you have a special $prefix, set the
> opal_instal_dirs.prefix to NULL (yes, it's a bit of a hack) and
> set your special prefix.
> - Same with bindir and libdir if needed
> - Let expansion (which runs after all components have had the
> chance to fill in their fields) expand out with your special
> data

If we run the autodetect component after config, and allow it to
override values that are already in opal_install_dirs, then there will
be no way for users to have environment variables take precedence.
(Let's say someone runs with OPAL_LIBDIR=/foo. The autodetect
component will not know whether opal_install_dirs.libdir has been set
by the env component or by the config component.)

Moreover, if the user has used an environment variable to override one
of the paths in the config component, then the autodetect component
may make the wrong inference. For example, let's say someone runs
with OPAL_LIBDIR=/foo. The autodetect component finds libopen-pal in /
usr/renamed/lib, and sets opal_install_dirs.libdir to /usr/renamed/
lib. However, it has to use the config component's idea of libdir
(e.g., ${exec_prefix}/lib) to correctly infer that prefix should be /
usr/renamed. Since it only has /foo from the environment variable, it
will have to decide that it cannot infer the prefix.

All of this will lead to behavior that users will have trouble
diagnosing. While I appreciate simple code, I think that a simple
user interface is more important.

We could add some infrastructure so that the autodetect component can
figure out the provenance of each field in opal_install_dirs, but that
would make the boundary between the base component and the autodetect
component unclear.

> And the base stays simple, the components do all the heavy lifting,
> and life is happy.

Except in the cases where it doesn't work.

> I would not be opposed to putting in a "find expaneded part" type
> function that takes two strings like "${prefix}/lib" and "/usr/local/
> lib" and returns "/usr/local" being added to the base so that other
> autodetect-style components don't need to handle such a case, but
> that's about the extent of the base changes I think are appropriate.
>
> Finally, a first quick code review reveals a couple of problems:
>
> - We don't AC_SUBST variables adding .lo files to build sources in
> OMPI. Instead, we use AM_CONDITIONALS to add sources as needed.
> - Obviously, there's a problem with the _happy variable name
> consistency in configure.m4
> - There's a naming convention issue - files should all start with
> opal_installdirs_autodetect_, and a number of the added files
> do not.

Thanks. I will fix those.

> - From a finding code standpoint, I'd rather walkcontext.c and
> backtrace.c be one file with #ifs - for such short functions,
> it makes it more obvious that it's just portability implementations
> of the same function.

I'm afraid we'll just have to disagree on the aesthetics of this. I
find that #ifs almost always make code harder to follow.

However, I'm willing to hold my nose if necessary.

> I'd be happy to discuss the issues further or review any code before
> it gets committed. But I think the changes as they exist today
> (even with bugs fixed) aren't consistent with what the installdirs
> framework was trying to accomplish and should be reworked.

Let's discuss this off-alias. I have set the reply-to field
accordingly.

Iain