Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] parallel make install
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-06-03 17:59:07


Very interesting. I don't know if it's the same problem, but quite a while
ago I noticed that make -jN all/install would fail when traversing opal. My
workaround was a small script that runs make all in opal, then goes back to
make -jN for orte/ompi.

Perhaps this would fix that problem too....

Thanks Ralf!

On 6/3/08 3:53 PM, "Ralf Wildenhues" <Ralf.Wildenhues_at_[hidden]> wrote:

> Hi Jeff,
>
> * Jeff Squyres wrote on Tue, Jun 03, 2008 at 11:11:32PM CEST:
>> ERROR: Command returned a non-zero exist status
>> make -j 4 distcheck
> [...]
>> Making install in etc
>> make[3]: Entering directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
>> make[4]: Entering directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
>> test -z "/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc" || /bin/mkdir -p "/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc"
>> /usr/bin/install -c -m 644 ../../../opal/etc/openmpi-mca-params.conf /home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc/openmpi-mca-params.conf
>> /usr/bin/install: cannot create regular file `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc/openmpi-mca-params.conf': No such file or directory
>> make[4]: *** [install-data-local] Error 1
>> make[4]: *** Waiting for unfinished jobs....
>> make[4]: *** Waiting for unfinished jobs....
>> make[4]: Leaving directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
>> make[3]: *** [install-am] Error 2
>> make[3]: Leaving directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
>> make[2]: *** [install-recursive] Error 1
>
> Nice clue, thanks. This is a bug in opal/etc/Makefile.am:
>
> --- quote opal/etc/Makefile.am ---
> # This has to be here, even though it's empty, so that AM thinks that
> # "something" will happen here (details fuzzy, but we remember that this
> # *needs* to be here -- you have been warned).
>
> sysconf_DATA =
>
> # Steal a little trickery from a generated Makefile to only install
> # files if they do not already exist at the target.
>
> install-data-local:
> 	@ p="$(opal_config_files)"; \
> 	for file in $$p; do \
> 	  if test -f $(DESTDIR)$(sysconfdir)/$$file; then \
> 	    echo "******************************* WARNING ************************************"; \
> 	    echo "*** Not installing new $$file over existing file in:"; \
> 	    echo "*** $(DESTDIR)$(sysconfdir)/$$file"; \
> 	    echo "******************************* WARNING ************************************"; \
> 	  else \
> 	    if test -f "$$file"; then d=; else d="$(srcdir)/"; fi; \
> 	    f="`echo $$file | sed -e 's|^.*/||'`"; \
> 	    echo " $(INSTALL_DATA) $$d$$file $(DESTDIR)$(sysconfdir)/$$f"; \
> 	    $(INSTALL_DATA) $$d$$file $(DESTDIR)$(sysconfdir)/$$f; \
> 	  fi; \
> 	done
> --- snip ---
>
> To clarify the mysterious comment above, the "sysconf_DATA =" line
> causes automake to emit an undocumented target install-sysconfDATA which
> effectively runs something like
> mkdir -p $(DESTDIR)$(sysconfdir)
>
> and then installs zero files there. The install-data-local rule is also
> updated as a prerequisite of 'install', just like install-sysconfDATA,
> but there is no dependency relation between the two. This means that
> with parallel make they can run concurrently, which I assume is what
> happened in your case: although the log shows them in the right order,
> mkdir may not have finished its work before install-data-local
> accessed the directory.
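
To make the race described above concrete, here is a minimal standalone
sketch in plain GNU make. It is not taken from the Open MPI tree: DEST,
mkdir-step and copy-step are made-up names that stand in for
$(DESTDIR)$(sysconfdir), install-sysconfDATA and install-data-local.

```make
# Hypothetical stand-in for the two automake rules discussed above.
# DEST, mkdir-step and copy-step are illustrative names only.
DEST = ./_inst/etc

.PHONY: install mkdir-step copy-step

# Both steps are prerequisites of 'install'; nothing says that
# mkdir-step has to finish before copy-step starts.
install: mkdir-step copy-step

# Plays the role of install-sysconfDATA: creates the directory.
mkdir-step:
	mkdir -p $(DEST)

# Plays the role of install-data-local: assumes the directory exists.
copy-step:
	cp Makefile $(DEST)/example.conf
```

A serial "make install" handles the prerequisites in the listed order, so
the copy finds its directory. With "make -j2 install" both recipes may
start at the same time, and the copy can fail with "No such file or
directory" if it gets there first. Whether it actually fails on a given
run depends on timing.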
>
> An easy fix is to use install-data-hook instead, which is documented to
> run after the normal install rules; or to generate the directory in the
> install-data-local rule itself, and drop the sysconf_DATA line.
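
For comparison with the patch further down, the first option (switching
to install-data-hook) might look roughly like the following. This is an
untested sketch, not something from the original thread. It keeps the
"sysconf_DATA =" line and relies on the behavior described above, namely
that automake creates $(DESTDIR)$(sysconfdir) even for an empty list;
that may not hold for every automake version, in which case an explicit
mkdir is still needed. The warning banner is shortened here.

```make
# Sketch only: same logic as install-data-local, moved to the hook.
# install-data-hook runs after the regular install-data rules of this
# directory, so install-sysconfDATA has already finished by then.
sysconf_DATA =

install-data-hook:
	@ p="$(opal_config_files)"; \
	for file in $$p; do \
	  if test -f $(DESTDIR)$(sysconfdir)/$$file; then \
	    echo "*** Not installing new $$file over existing file in:"; \
	    echo "*** $(DESTDIR)$(sysconfdir)/$$file"; \
	  else \
	    if test -f "$$file"; then d=; else d="$(srcdir)/"; fi; \
	    f="`echo $$file | sed -e 's|^.*/||'`"; \
	    echo " $(INSTALL_DATA) $$d$$file $(DESTDIR)$(sysconfdir)/$$f"; \
	    $(INSTALL_DATA) $$d$$file $(DESTDIR)$(sysconfdir)/$$f; \
	  fi; \
	done
```

The patch below takes the second option instead, which does not depend
on any automake-generated rule having created the directory first.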
>
> Proposed, untested patch below.
>
> I have not checked whether there are more instances of this in OMPI.
>
> Cheers,
> Ralf
>
> Fix race condition in 'make install': let install-data-local
> create $(sysconfdir), rather than an automake-generated rule
> which may be run in parallel (with make -j).
>
> Index: opal/etc/Makefile.am
> ===================================================================
> --- opal/etc/Makefile.am (Revision 17766)
> +++ opal/etc/Makefile.am (Arbeitskopie)
> @@ -23,16 +23,11 @@
>
> EXTRA_DIST = $(opal_config_files)
>
> -# This has to be here, even though it's empty, so that AM thinks that
> -# "something" will happen here (details fuzzy, but we remember that this
> -# *needs* to be here -- you have been warned).
> -
> -sysconf_DATA =
> -
> # Steal a little trickery from a generated Makefile to only install
> # files if they do not already exist at the target.
>
> install-data-local:
> + $(mkdir_p) $(DESTDIR)$(sysconfdir)
> @ p="$(opal_config_files)"; \
> for file in $$p; do \
> if test -f $(DESTDIR)$(sysconfdir)/$$file; then \