automake

RFC: doc for `Handling Tools that Produce Many Outputs'


From: Alexandre Duret-Lutz
Subject: RFC: doc for `Handling Tools that Produce Many Outputs'
Date: Sat, 31 Jan 2004 23:28:29 +0100
User-agent: Gnus/5.1003 (Gnus v5.10.3) Emacs/21.3.50 (gnu/linux)

This is a new section I'd like to add to the FAQ.  It has been
discussed two or three times on the list.

I'm posting it here for comment.  (In fact I'm mainly hoping
that some kind fellow will point out English mistakes...)


Handling Tools that Produce Many Outputs
========================================

This section describes a `make' idiom that can be used when a tool
produces multiple outputs.  It is not specific to Automake and can be
used in ordinary `Makefile's.

   Suppose we have a program called `foo' that will read one file
called `data.foo' and produce two files called `data.c' and `data.h'.
We want to write a `Makefile' rule that captures this one-to-two
dependency.

   The naive rule is incorrect:

     # This is incorrect.
     data.c data.h: data.foo
             foo data.foo

What the above rule really says is that `data.c' and `data.h' each
depend on `data.foo', and can each be built by running `foo data.foo'.
In other words, it is equivalent to:

     # We do not want this.
     data.c: data.foo
             foo data.foo
     data.h: data.foo
             foo data.foo

which means that `foo' can be run twice.  It will not _necessarily_ run
twice, because many `make' implementations will check for the second
file after the first one has been built and will therefore detect that
it already exists.  However, it can run twice, and we should avoid that.
An easy way to trigger the problem is to run a parallel make; if
`data.c' and `data.h' are built in parallel, two `foo data.foo'
commands will run concurrently.
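
   As a minimal sketch of how this can happen, assume a hypothetical
`all' target (not part of the example above) that requests both files:

     # Hypothetical top-level target, for illustration only.
     all: data.c data.h
     # This is still the incorrect rule.
     data.c data.h: data.foo
             foo data.foo

With a `make' that supports parallel builds, `make -j2 all' is free to
update `data.c' and `data.h' at the same time, each with its own
`foo data.foo' command.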

   Another idea is to write the following:

     # There is still a problem with this one.
     data.c: data.foo
             foo data.foo
     data.h: data.c

The idea is that `foo data.foo' is run only when `data.c' needs to be
updated, but we further state that `data.h' depends upon `data.c'.
That way, if `data.h' is required and `data.foo' is out of date, the
dependency on `data.c' will trigger the build.

   This is almost perfect, but suppose we have built `data.h' and
`data.c', and then we erase `data.h'.  Then, running `make data.h' will
not rebuild `data.h': its rule has no commands, and the rules above
only state that `data.c' must be up-to-date with respect to
`data.foo', which is already the case.
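
   The failure can be replayed with a small sketch.  The `demo' target
below is hypothetical and only serves to repeat the steps just
described:

     data.c: data.foo
             foo data.foo
     data.h: data.c

     # Hypothetical target replaying the scenario.
     demo:
             $(MAKE) data.h
             rm -f data.h
             $(MAKE) data.h
             @test -f data.h || echo "data.h is still missing"

Assuming `foo' behaves as described, the second `$(MAKE) data.h' should
find nothing to do, and the last line should report that `data.h' is
still missing.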

   What we need is a rule that forces a rebuild when `data.h' is missing:

     data.c: data.foo
             foo data.foo
     data.h: data.c
             @if test -f $@; then :; else \
               rm -f data.c; \
               $(MAKE) $(AM_MAKEFLAGS) data.c; \
             fi
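
   Read step by step, the rule works as annotated in the sketch below.
`$(AM_MAKEFLAGS)' is a variable used in Automake-generated `Makefile's;
if it is not defined it expands to nothing, so this variant, meant for
an ordinary `Makefile', simply leaves it out:

     data.c: data.foo
             foo data.foo
     # data.h is produced as a side effect of building data.c.
     # If data.h exists, do nothing (`:').  If it is missing, delete
     # the witness data.c and rebuild it with a recursive make, which
     # reruns foo and therefore recreates data.h.
     data.h: data.c
             @if test -f $@; then :; else \
               rm -f data.c; \
               $(MAKE) data.c; \
             fi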

   The above scales easily to more outputs and more inputs.  For
instance, if `foo' should also read `data.bar' and will also produce
`data.w' and `data.x', we would write:

     data.c: data.foo data.bar
             foo data.foo data.bar
     data.h data.w data.x: data.c
             @if test -f $@; then :; else \
               rm -f data.c; \
               $(MAKE) $(AM_MAKEFLAGS) data.c; \
             fi

   One of the outputs (here `data.c') is used as a witness of the run
of `foo'.  The other files depend on that witness.  Ideally the witness
should have the oldest timestamp among the output files, so that the
second rule (`data.h data.w data.x: data.c') is not triggered
needlessly.  For this reason, it is often better to use another file
(not one of the outputs) as the witness:

     data.stamp: data.foo data.bar
             @rm -f data.tmp
             @touch data.tmp
             foo data.foo data.bar
             @mv -f data.tmp $@
     data.h data.w data.x data.c: data.stamp
             @if test -f $@; then :; else \
               rm -f data.stamp; \
               $(MAKE) $(AM_MAKEFLAGS) data.stamp; \
             fi

   `data.tmp' is created before `foo' is run, so that it has a
timestamp older than the files output by `foo'.  It is then renamed
to `data.stamp' after `foo' has run, because we do not want to update
`data.stamp' if `foo' fails.
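
   When the list of outputs grows, it can be kept in one place.  The
following sketch assumes a hypothetical variable named `foo_outputs';
the rules themselves are unchanged:

     # Hypothetical variable listing every file written by foo.
     foo_outputs = data.c data.h data.w data.x

     data.stamp: data.foo data.bar
             @rm -f data.tmp
             @touch data.tmp
             foo data.foo data.bar
             @mv -f data.tmp $@
     $(foo_outputs): data.stamp
             @if test -f $@; then :; else \
               rm -f data.stamp; \
               $(MAKE) $(AM_MAKEFLAGS) data.stamp; \
             fi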

-- 
Alexandre Duret-Lutz




