Turning Source Code into a Program
Before getting straight into Makefiles, let's briefly cover how source code gets turned into an actual program that can run on a computer. Source code consists of a set of files and folders that contain code. This source code usually needs to be converted into a form that the computer can understand. This process is called compilation or compiling. A program that performs this conversion is called a compiler.
Sometimes the compiler needs to be given certain pieces of information so it can properly do its job. This information may include:
- The names and locations of the source code (input) files to compile
- The set of compiled (output) programs to create
- The names and locations to put the compiled (output) programs
- Whether or not to apply any special options in the compilation process
The process of choosing a compiler, identifying the set of source code files to be included, performing preparation steps, and compiling the code into its final form is called building, or the build process.
What is Make?
Make is a build automation tool. It would be very tedious for a developer to manually run all of the build steps in sequence each time they want to build their program. Build automation tools like Make allow developers to describe the build steps and execute them all at once.
What is a Makefile?
Makefiles are text files that developers use to describe the build process for their programs. The make command can then be used to conveniently run the instructions in the Makefile.
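As a minimal sketch of what such a file looks like, assume a hypothetical one-file C program called hello.c (it is not part of Git). A Makefile rule names the file to create, the files it is built from, and the command that builds it. The command line must begin with a tab character.

```makefile
# Hypothetical example: build the program "hello" from hello.c.
# Target: hello. Prerequisite: hello.c. Recipe: the tab-indented line.
hello: hello.c
	gcc -g -o hello hello.c
```

Running make in that directory would run the gcc command only if hello is missing or older than hello.c.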
Baby Git Makefile
Below is the original Makefile for Git. It is used to invoke the gcc C compiler to build binary executable files for each of the original 7 git commands:
- init-db
- update-cache
- cat-file
- show-diff
- write-tree
- read-tree
- commit-tree
This Makefile can be invoked in 3 variations (referred to as 3 targets), by running the 3 following commands from the command line inside the same directory as the Makefile:
- make clean: This removes all previously built executables and build files from the working directory.
- make backup: This first runs make clean and then backs up the current directory into a tar archive.
- make: This builds the codebase and creates the 7 git executables.
Enough talk - here is the code from Git's first Makefile:
CFLAGS=-g # The `-g` compiler flag tells gcc to add debug symbols to the executable for use with a debugger.
CC=gcc # Use the `gcc` C compiler.
# Specify the names of all executables to make.
PROG=update-cache show-diff init-db write-tree read-tree commit-tree cat-file
all: $(PROG)
install: $(PROG)
	install $(PROG) $(HOME)/bin/
# Link the following external libraries into the executables.
LIBS= -lssl
# Specify which compiled output (.o files) to use for each executable.
init-db: init-db.o
update-cache: update-cache.o read-cache.o
	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
show-diff: show-diff.o read-cache.o
	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
write-tree: write-tree.o read-cache.o
	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
read-tree: read-tree.o read-cache.o
	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
commit-tree: commit-tree.o read-cache.o
	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
cat-file: cat-file.o read-cache.o
	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
# Rebuild these object files whenever the cache.h header changes.
read-cache.o: cache.h
show-diff.o: cache.h
# Define the steps to run during the `make clean` command.
clean:
	rm -f *.o $(PROG) temp_git_file_* # Remove these files from the current directory.
# Define the steps to run during the `make backup` command.
backup: clean
	cd .. ; tar czvf babygit.tar.gz baby-git # Backup the current directory into a tar archive.
Build Variables
Build variables are variables that can be defined in the Makefile to hold specific values. In the Makefile above, names such as CFLAGS and CC are ordinary variables that store the values that come after the equals sign. The names are not arbitrary, though: CC and CFLAGS are the conventional names that Make's built-in rules use for the C compiler and its flags. A variable reference like $(CFLAGS) can be used later in the Makefile to substitute in the variable's value where needed. This is convenient since we can use a variable in multiple places, while only updating it in one place if the value changes.
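Variable substitution can be seen in isolation with a small sketch. The variable names GREETING and NAME and the greet target are hypothetical and do not appear in Git's Makefile.

```makefile
# Hypothetical example, not taken from Git's Makefile.
GREETING = hello
NAME = world

# $(GREETING) and $(NAME) are replaced by their values before the command runs.
# The leading @ stops Make from printing the command itself.
greet:
	@echo $(GREETING), $(NAME)
```

With this file, make greet would print hello, world. A value given on the command line takes precedence over the one in the file, so make greet NAME=git would print hello, git.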
Specifying the Compiler
Git is written in C, so this Makefile is tailored to a C build process.
The first line CFLAGS=-g specifies the compiler flags - special compiler options - to use during compilation. In this case, the -g flag tells the compiler to include debug symbols in its output, so that the executables can be inspected with a debugger such as gdb.
The second line CC=gcc identifies the actual compiler to use. GCC is the GNU Compiler Collection. It supports compilation of code in several programming languages including C, C++, Fortran, and more.
Specifying the Executables
The third line defines a build variable called PROG which contains the names of the executables we'll be creating.
Linking External Libraries
We'll quickly skip ahead to the line which defines the LIBS variable. This stores the external libraries that we want to link into the executables. In this case, -lssl links in the OpenSSL library, which gives Git access to cryptographic functions like SHA-1 hashing.
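The -l convention works the same way for any library: -lNAME asks the linker to search for a library file named libNAME. A minimal sketch, assuming a hypothetical program calc whose code calls functions from the C math library:

```makefile
# Hypothetical example: calc.o calls functions from the C math library (libm).
CC = gcc
CFLAGS = -g
LIBS = -lm

# -lm asks the linker to search for libm. It is placed after the object
# files so that symbols they reference can be resolved from the library.
calc: calc.o
	$(CC) $(CFLAGS) -o calc calc.o $(LIBS)
```

Git's Makefile follows the same pattern by putting $(LIBS) at the end of each link command.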
Make Targets and Commands
Throughout the Makefile, there are multiple lines that start with a name followed by a colon such as all:, install:, init-db:, etc. Each of these is called a target. Each target essentially maps to a command that you can specify when running Make, in the form make target.
For example, if you open a terminal window and browse to this Makefile's directory, you could run the make all command to run Make on the all target. Similarly you could run make install to run Make on the install target. If no target is specified, Make uses the first target in the Makefile, which here is all.
When Make runs a target, it first brings the target's prerequisites (the names listed after the colon) up to date and then executes the tab-indented commands listed beneath that target in the Makefile.
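The choice of default target can be seen in a small sketch. The target names first and second are hypothetical, and the example assumes no files with those names exist in the directory.

```makefile
# Hypothetical example with two targets that only print a message.
first:
	@echo running first

second:
	@echo running second
```

Running make with no arguments would print running first, because first is the first target in the file. Running make second would print running second.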
The All Target
Back to the Makefile, the all: $(PROG) line states that the all target depends on every name listed in $(PROG). When Make is run without specifying a target, it therefore builds each of those names in turn. Since $(PROG) lists all 7 of the Baby Git executables, each of them will be built.
The Install Target
The next target in the Makefile is install. It is run at the command line using make install. This starts the same way as the all target, by specifying the executables to build using $(PROG). But then it uses the install command to copy those built executables into the bin directory inside the user's home directory.
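The recipe assumes that $(HOME)/bin already exists. As a hedged variation (this is not what Git's Makefile does), the install command's -d option can create the directory first and -m can set the file permissions explicitly:

```makefile
# Sketch of a more defensive install target; a variation, not Git's original.
install: $(PROG)
	install -d $(HOME)/bin
	install -m 755 $(PROG) $(HOME)/bin/
```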
Baby Git Program Targets
Now for the targets corresponding to the executable names:
init-db:
update-cache:
show-diff:
write-tree:
read-tree:
commit-tree:
cat-file:
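Each one of these targets specifies which compiled C object (.o) files we want in each of our executables. Below that, each one except init-db specifies the compiler command to run based on the build variables specified earlier in the file. The .o files themselves are not described anywhere in the Makefile; Make's built-in rules compile each one from the .c file of the same name.
The first executable init-db is very simple since it only includes 1 source file: init-db: init-db.o. It has no command line of its own, so Make falls back on its built-in rules to link init-db from init-db.o.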
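The init-db rule has no command beneath it, and nothing in the Makefile says how to produce a .o file. GNU Make fills both gaps with built-in rules. Written out by hand, they look roughly like the sketch below. This is an approximation, and CPPFLAGS, LDFLAGS and LDLIBS are standard Make variables that are empty unless set.

```makefile
# Approximation of GNU Make's built-in rules, written out by hand.
# $@ is the target, $< is the first prerequisite.

# Compile any .c file into the matching .o file.
%.o: %.c
	$(CC) $(CFLAGS) $(CPPFLAGS) -c -o $@ $<

# Link init-db from its single object file.
init-db: init-db.o
	$(CC) $(LDFLAGS) init-db.o $(LDLIBS) -o init-db
```

Note that the built-in link rule uses LDLIBS rather than this Makefile's LIBS variable, so init-db is linked without -lssl.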
The other executables (we'll take update-cache as an example) link together multiple C object (.o) files:
update-cache: update-cache.o read-cache.o
	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
The second line above gets converted to the following after variable substitution:
gcc -g -o update-cache update-cache.o read-cache.o -lssl
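Each program rule repeats its target name and object files inside the command. GNU Make's automatic variables can remove that repetition. The following is a sketch of an equivalent rule, not the form used in Git's original Makefile:

```makefile
# Equivalent rule using automatic variables (a sketch, not Git's original).
# $@ expands to the target name (update-cache).
# $^ expands to all prerequisites (update-cache.o read-cache.o).
update-cache: update-cache.o read-cache.o
	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
```

After substitution this runs the same gcc command as the original rule.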
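Header File Dependencies
After the program targets, there are two lines that declare a C header (.h) file as a dependency of an object (.o) file. Header files are not linked; they are pulled into the source code by #include at compile time. Listing cache.h after the colon tells Make to recompile read-cache.o and show-diff.o whenever cache.h changes. The only header file in the Baby Git codebase is cache.h.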
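To illustrate what a header listed after the colon does, here is a minimal sketch with hypothetical files main.c and util.h. The header is a prerequisite, so editing it causes the object file to be recompiled on the next run of make.

```makefile
# Hypothetical example: main.c contains #include "util.h".
CC = gcc
CFLAGS = -g

# main.o is recompiled when main.c or util.h is newer than it.
main.o: main.c util.h
	$(CC) $(CFLAGS) -c -o main.o main.c
```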
The clean Target
This target is invoked using make clean and simply deletes all object files, executables, and files matching temp_git_file_* from the working directory. It leaves the source files alone so that the program can be built again.
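One caveat with targets like clean: if a file named clean ever existed in the directory, Make would consider the target up to date and skip its commands. GNU Make's .PHONY special target avoids this. Git's first Makefile does not use it, so the following is only a sketch of an optional hardening:

```makefile
# Sketch: mark targets that do not produce a file of the same name.
.PHONY: all install clean backup

clean:
	rm -f *.o $(PROG) temp_git_file_*
```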
The backup Target
This target is invoked using make backup. First it invokes the clean target. Then it moves up to the parent directory and archives the baby-git directory into babygit.tar.gz there, so it assumes the working directory is named baby-git.
Conclusion
In this article we described how Git's first Makefile works line by line. We hope it helped you understand how Makefiles work and how they are implemented in practice.
Note: The original posting for this article can be found on the Initial Commit Blog