Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] parallel make install
From: Ralf Wildenhues (Ralf.Wildenhues_at_[hidden])
Date: 2008-06-03 17:53:31


Hi Jeff,

* Jeff Squyres wrote on Tue, Jun 03, 2008 at 11:11:32PM CEST:
> ERROR: Command returned a non-zero exist status
> make -j 4 distcheck
[...]
> Making install in etc
> make[3]: Entering directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
> make[4]: Entering directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
> test -z "/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc" || /bin/mkdir -p "/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc"
> /usr/bin/install -c -m 644 ../../../opal/etc/openmpi-mca-params.conf /home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc/openmpi-mca-params.conf
> /usr/bin/install: cannot create regular file `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_inst/etc/openmpi-mca-params.conf': No such file or directory
> make[4]: *** [install-data-local] Error 1
> make[4]: *** Waiting for unfinished jobs....
> make[4]: *** Waiting for unfinished jobs....
> make[4]: Leaving directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
> make[3]: *** [install-am] Error 2
> make[3]: Leaving directory `/home/mpiteam/openmpi/nightly-tarball-build-root/trunk/create-r18551/ompi/openmpi-1.3a1r18551/_build/opal/etc'
> make[2]: *** [install-recursive] Error 1

Nice clue, thanks. This is a bug in opal/etc/Makefile.am:

--- quote opal/etc/Makefile.am ---
# This has to be here, even though it's empty, so that AM thinks that
# "something" will happen here (details fuzzy, but we remember that this
# *needs* to be here -- you have been warned).

sysconf_DATA =

# Steal a little trickery from a generated Makefile to only install
# files if they do not already exist at the target.

install-data-local:
        @ p="$(opal_config_files)"; \
        for file in $$p; do \
          if test -f $(DESTDIR)$(sysconfdir)/$$file; then \
            echo "******************************* WARNING ************************************"; \
            echo "*** Not installing new $$file over existing file in:"; \
            echo "*** $(DESTDIR)$(sysconfdir)/$$file"; \
            echo "******************************* WARNING ************************************"; \
          else \
            if test -f "$$file"; then d=; else d="$(srcdir)/"; fi; \
            f="`echo $$file | sed -e 's|^.*/||'`"; \
            echo " $(INSTALL_DATA) $$d$$file $(DESTDIR)$(sysconfdir)/$$f"; \
            $(INSTALL_DATA) $$d$$file $(DESTDIR)$(sysconfdir)/$$f; \
          fi; \
        done
--- snip ---

To clarify the mysterious comment above, the "sysconf_DATA =" line
causes automake to emit an undocumented target install-sysconfDATA which
effectively runs something like
  mkdir -p $(DESTDIR)$(sysconfdir)

and then installs zero files there. The install-data-local rule is also
run as a prerequisite of 'install', just like install-sysconfDATA, but
there is no dependency relation between the two. With parallel make,
they can therefore run concurrently, which I assume is what happened in
your case: although the log shows them in the right order, mkdir may not
have finished its work before install-data-local accessed the directory.
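
To illustrate the race outside of automake, here is a toy GNU Makefile.
It is only a sketch and not part of Open MPI: the target names make-dir
and copy-file, the DEST directory, and the file conf.txt are all made up
for the example, and the sleep merely widens the window.

```make
# Toy example only: all names below are hypothetical.
DEST = _inst/etc

.PHONY: install make-dir copy-file

# Both prerequisites belong to 'install', but nothing orders them
# relative to each other.
install: make-dir copy-file

# Stands in for the automake-generated install-sysconfDATA rule.
make-dir:
	sleep 1 && mkdir -p $(DEST)

# Stands in for install-data-local; expects $(DEST) to exist already.
copy-file:
	cp conf.txt $(DEST)/conf.txt
```

With a serial make, GNU make works through the prerequisites in the
order listed, so the copy finds the directory. With 'make -j2 install',
copy-file may start while make-dir is still sleeping, and cp can then
fail because the directory is missing. Declaring 'copy-file: make-dir',
or creating the directory inside copy-file, removes the race.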

An easy fix is to use install-data-hook instead, which is documented to
run after the normal install rules; or to generate the directory in the
install-data-local rule itself, and drop the sysconf_DATA line.
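
For the first alternative, the change to opal/etc/Makefile.am could look
roughly like this. It is an untested sketch: it only renames the target
and reuses the loop quoted above (with the warning banner shortened),
and it assumes the automake version in use still creates the directory
for an empty sysconf_DATA, as described earlier.

```make
# Kept so that automake still emits install-sysconfDATA, which creates
# $(DESTDIR)$(sysconfdir).
sysconf_DATA =

# A hook instead of a -local rule: automake runs install-data-hook only
# after the regular install-data rules have completed.
install-data-hook:
	@ p="$(opal_config_files)"; \
	for file in $$p; do \
	  if test -f $(DESTDIR)$(sysconfdir)/$$file; then \
	    echo "*** Not installing new $$file over existing file in:"; \
	    echo "*** $(DESTDIR)$(sysconfdir)/$$file"; \
	  else \
	    if test -f "$$file"; then d=; else d="$(srcdir)/"; fi; \
	    f="`echo $$file | sed -e 's|^.*/||'`"; \
	    echo " $(INSTALL_DATA) $$d$$file $(DESTDIR)$(sysconfdir)/$$f"; \
	    $(INSTALL_DATA) $$d$$file $(DESTDIR)$(sysconfdir)/$$f; \
	  fi; \
	done
```

The second alternative does not rely on any automake-generated rule for
the directory, which is why the patch goes that way.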

Proposed, untested patch below.

I have not checked whether there are more instances of this in OMPI.

Cheers,
Ralf

Fix race condition in 'make install': let install-data-local
create $(sysconfdir), rather than an automake-generated rule
which may be run in parallel (with make -j).

Index: opal/etc/Makefile.am
===================================================================
--- opal/etc/Makefile.am (Revision 17766)
+++ opal/etc/Makefile.am (Arbeitskopie)
@@ -23,16 +23,11 @@
 
 EXTRA_DIST = $(opal_config_files)
 
-# This has to be here, even though it's empty, so that AM thinks that
-# "something" will happen here (details fuzzy, but we remember that this
-# *needs* to be here -- you have been warned).
-
-sysconf_DATA =
-
 # Steal a little trickery from a generated Makefile to only install
 # files if they do not already exist at the target.
 
 install-data-local:
+        $(mkdir_p) $(DESTDIR)$(sysconfdir)
         @ p="$(opal_config_files)"; \
         for file in $$p; do \
           if test -f $(DESTDIR)$(sysconfdir)/$$file; then \