Why do we describe build procedures with Makefiles instead of shell scripts?

Remark This is a variation on the question “What is the purpose of linking object files separately in a Makefile?” by user4076675 taking a slightly different point of view. See also the corresponding META discussion.

Let us consider the classical case of a C project. The gcc compiler is able to compile and link programs in one step. We can then easily describe the build routine with a shell script:

case $1 in
    build)  gcc -o test *.c;;
    clean)  rm -f test;;
esac
# This script is intentionally very brittle, to keep
# the example simple.

However, it appears to be idiomatic to describe the build procedure with a Makefile, involving extra steps to compile each compilation unit to an object file and ultimately linking these files. The corresponding GNU Makefile would be:

.PHONY: all default clean

SOURCES=$(wildcard *.c)
OBJECTS=$(SOURCES:.c=.o)

# Recipe lines must start with a tab character.
%.o: %.c
	gcc -c -o $@ $<

all: default
default: $(OBJECTS)
	gcc -o test $^

clean:
	rm -f test *.o

This second solution is arguably more involved than the simple shell script we wrote before. It also has a drawback, as it clutters the source directory with object files. So, why do we describe build procedures with Makefiles instead of shell scripts? Judging by the previous example, it seems to be a useless complication.

In the simple case where we compile and link three moderately sized files, any approach is likely to be equally satisfying. I will therefore consider the general case, although many benefits of using Makefiles only matter on larger projects. Once we have learned the best tool, one that lets us master complicated cases, we want to use it in simple cases as well.

Let me highlight the benefits of using make instead of a simple shell script for compilation jobs. But first, I would like to make an innocuous observation.

The procedural paradigm of shell scripts is wrong for compilation-like jobs

Writing a Makefile is similar to writing a shell script with a slight change of perspective. In a shell script, we describe a procedural solution to a problem: we can start by describing the whole procedure in very abstract terms using undefined functions, and we refine this description until we reach the most elementary level of description, where a procedure is just a plain shell command. In a Makefile, we do not introduce any similar abstraction, but we focus on the files we want to produce and how we can produce them. This works well because in UNIX, everything is a file, therefore each treatment is accomplished by a program which reads its input data from input files, does some computation and writes the results to some output files.

If we want to compute something complicated, we have to use a lot of input files which are treated by programs whose outputs are used as inputs to other programs, and so on until we have produced our final files containing our result. If we translate the plan to prepare our final file into a bunch of procedures in a shell script, then the current state of the processing is made implicit: the plan executor knows “where it is at” because it is executing a given procedure, which implicitly guarantees that such and such computations were already done, that is, that such and such intermediary files were already prepared. Now, which data describes “where the plan executor is at”?

Innocuous observation The data which describes “where the plan executor is at” is precisely the set of intermediary files which were already prepared, and this is exactly the data which is made explicit when we write Makefiles.

This innocuous observation is actually the conceptual difference between shell scripts and Makefiles which explains all the advantages of Makefiles over shell scripts in compilation jobs and similar jobs. Of course, to fully appreciate these advantages, we have to write correct Makefiles, which might be hard for beginners.

Make makes it easy to continue an interrupted task where it was at

When we describe a compilation job with a Makefile, we can easily interrupt it and resume it later. This is a consequence of the innocuous observation: the files already produced record how far the job got, so make only rebuilds what is missing or out of date. A similar effect can only be achieved with considerable effort in a shell script, while it is built into make.
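
To make the contrast concrete, here is a minimal sketch of what a shell script has to do by hand to become resumable. The helper names is_stale and build_step are hypothetical, and the -nt file test is assumed to be available (it is in bash, dash and ksh). Each step is skipped when its output already exists and is newer than its input, which is roughly the check make performs for every rule without being asked.

```shell
#!/bin/sh
# is_stale TARGET SOURCE: succeed when TARGET must be (re)built, that is,
# when it does not exist yet or when SOURCE is newer than it.
is_stale() {
    [ ! -e "$1" ] || [ "$2" -nt "$1" ]
}

# build_step TARGET SOURCE COMMAND...: run COMMAND only when TARGET is stale.
build_step() {
    target=$1
    source=$2
    shift 2
    if is_stale "$target" "$source"; then
        "$@"
    fi
}

# Hypothetical usage, one call per object file:
#   build_step main.o main.c gcc -c -o main.o main.c
```

This sketch handles a single input per target. Tracking several inputs, headers and generated files would require more bookkeeping, which is the effort the answer refers to.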

Make makes it easy to work with several builds of a project

You observed that Makefiles will clutter the source tree with object files. But Makefiles can actually be parametrised to store these object files in a dedicated directory. I work with the BSD Owl macros for BSD make and use

MAKEOBJDIR='/usr/home/michael/obj${.CURDIR:S@^/usr/home/michael@@}'

so that all object files end under ~/obj and do not pollute my sources. See this answer for more details.
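
The :S@old@new@ modifier in that setting is BSD make's substitution: it strips the /usr/home/michael prefix from the current source directory, and the remainder is appended to /usr/home/michael/obj. As a rough illustration only, the same mapping can be written as a shell function. The name objdir_for is hypothetical and is not part of BSD Owl or make.

```shell
#!/bin/sh
# objdir_for HOME_DIR SOURCE_DIR: print the object directory that mirrors
# SOURCE_DIR under HOME_DIR/obj.
objdir_for() {
    home=$1
    srcdir=$2
    # Remove the leading HOME_DIR prefix, if present.
    rel=${srcdir#"$home"}
    printf '%s\n' "$home/obj$rel"
}

objdir_for /usr/home/michael /usr/home/michael/src/project
# prints /usr/home/michael/obj/src/project
```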

Advanced Makefiles allow us to keep several directories side by side, each containing a build of the project with distinct compilation options: for instance, builds with distinct features enabled, or debug versions. This is also a consequence of the innocuous observation that Makefiles are articulated around the set of intermediary files. This technique is illustrated in the testsuite of BSD Owl.

Make makes it easy to parallelise builds

We can easily build a program in parallel since this is a standard function of many versions of make. This is also a consequence of the innocuous observation: because “where the plan executor is at” is explicit data in a Makefile, make can reason about it and run independent steps at the same time. Achieving a similar effect in a shell script would require a great effort.
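
With make, a parallel build is typically requested with a single option such as make -j4. For comparison, here is a minimal sketch of hand-rolled parallelism in a shell script. The function name run_in_parallel is hypothetical, and the sketch assumes that the commands it launches are independent of each other.

```shell
#!/bin/sh
# run_in_parallel COMMAND FILE...: start "COMMAND FILE" for every FILE in
# the background, then wait for all of them.
# Returns non-zero if any of the background commands failed.
run_in_parallel() {
    cmd=$1
    shift
    pids=
    for file in "$@"; do
        "$cmd" "$file" &
        pids="$pids $!"
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}

# Hypothetical usage: compile every C file at once, then link.
#   compile() { gcc -c -o "${1%.c}.o" "$1"; }
#   run_in_parallel compile *.c && gcc -o test *.o
```

Even this version starts all jobs at once, without any limit on their number, and it knows nothing about steps that depend on each other. Those are the parts make derives from the Makefile.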

The parallel mode of any version of make only works correctly if the dependencies are correctly specified. This can be quite complicated to achieve, but bmake has a feature which literally annihilates the problem. It is called the META mode. It uses a first, non-parallel pass of a compilation job to compute the actual dependencies by monitoring file access, and uses this information in later parallel builds.

Makefiles are easily extensible

Because of the special perspective used to write Makefiles, which is another consequence of the innocuous observation, we can easily extend them by hooking into all aspects of our build system.

For instance, if we decide that all our database I/O boilerplate code should be written by an automatic tool, we just have to state in the Makefile which files the automatic tool should use as inputs to write the boilerplate code. Nothing less, nothing more. And we can add this description pretty much where we like, make will get it anyway. Doing such an extension in a shell script build would be harder than necessary.

This ease of extension is a great incentive for Makefile code reuse.