Table Of Contents
- Introduction
- Project Overview
- Working Features
- What Went Wrong
- Still Not Working
- What I’d Like to Add Next
- What I Learned
- Closing
- References
Introduction
This shell is actually my first project after learning C programming. I’ve always been obsessed with how system calls work. I wanted to dive deep into kernel development, but I realized it would be a real pain if I didn’t understand system calls and signals first. So my next option was to build a mini-shell—and let me tell you, implementing some parts was much harder than I thought (╥‸╥). In the end, though, I had a great time making this messy shell (˶˃ ᵕ ˂˶).
This post covers the features I implemented, the failures I encountered, and how the architecture of the shell works. It also explains some parts that were really hard for me to grasp at first. Overall, this is what I’ve learned so far. (You can check out my GitHub to see the full code.)
Project Overview
- What it does now:
  - Runs external commands like ls, echo, grep, …
  - Runs built-in commands (which don’t require fork), such as cd, pwd, …
  - Supports job control (so you can use fg, bg, jobs)
  - Handles pipelines (e.g., ls | grep foo)
- Technologies used:
  - Linux system calls: fork, exec, pipe, signal
  - Debugging: gdb, valgrind
  - Build automation: Makefile
Below are a few GIFs/screenshots illustrating key features:
Working Features
These features work as intended:
- Tokenizer
  - Handles both single quotes (') and double quotes (")
  - Double quotes honor escape characters (e.g., \", \')
- Expander
  - Expands $VARIABLE, $?, and $$ in both external and built-in commands
- External commands
  - Executes commands like ls, echo, grep, …
- Built-in commands
  - cd
  - help
  - exit
  - export
  - unset
  - pwd
  - bg
  - fg
  - jobs
- Error handling for all phases
- Simple redirections with >, <, >>
- Job control
  - Supports background and foreground jobs
- Pipelines
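To make the tokenizer rules above concrete, here is a minimal sketch of quote-aware splitting. It is not the code from my shell: the function name tokenize, the fixed MAX_TOKEN_LEN limit, and the choice to honor only \" and \\ inside double quotes are all assumptions made for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN_LEN 64  /* hypothetical limit, chosen only for this sketch */

/* Split `line` into at most `max` tokens, honoring quotes.
 * Returns the token count, or -1 on an unterminated quote or overflow. */
int tokenize(const char *line, char tokens[][MAX_TOKEN_LEN], int max)
{
    int count = 0;
    const char *p = line;

    while (*p) {
        while (isspace((unsigned char)*p))
            p++;                          /* skip blanks between tokens */
        if (!*p)
            break;
        if (count == max)
            return -1;                    /* more tokens than the caller allows */

        size_t len = 0;
        char quote = 0;                   /* 0, '\'' or '"' */
        while (*p && (quote || !isspace((unsigned char)*p))) {
            char c = *p++;
            if (!quote && (c == '\'' || c == '"')) {
                quote = c;                /* opening quote: not part of the token */
                continue;
            }
            if (quote && c == quote) {
                quote = 0;                /* closing quote: not part of the token */
                continue;
            }
            if (quote == '"' && c == '\\' && (*p == '"' || *p == '\\'))
                c = *p++;                 /* \" and \\ inside double quotes */
            if (len + 1 >= MAX_TOKEN_LEN)
                return -1;                /* token too long for this sketch */
            tokens[count][len++] = c;
        }
        if (quote)
            return -1;                    /* unterminated quote */
        tokens[count][len] = '\0';
        count++;
    }
    return count;
}
```

With this sketch, the line echo 'a b' "c \"d\"" splits into three tokens: echo, then a b, then c "d". Single quotes keep everything literal, which is why the backslash rule is only checked inside double quotes.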
What Went Wrong
To be honest? So many things went wrong. For example, I spent two days on pipes just to learn that you need to close pipe ends in the parent as well as in the children. Below are the concepts that were hardest for me, and where I wasted the most time.
Race Condition in setpgid
I had the idea (from blog posts and books) that the first child in a pipeline should become the process-group leader, and that its PID would be the PGID for the rest of the pipeline. The problem was that the other children don’t automatically know the leader’s PID after forking. One dumb idea I had was to use pipes to synchronize parent and children, but that extra complexity wasn’t necessary.
In fact, you can just call setpgid in the parent immediately after forking each child, and again in the child itself before exec. (It honestly took me a while to find that out.) The parent stores the leader’s PID in a variable before it forks the later children, so they inherit that value and join the correct group. Calling setpgid on both sides means the group is set no matter which process runs first. I eventually ended up with code like this:
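    /* Assumes pgid starts at 0, so the first child's setpgid(0, 0) makes it the group leader. */
    for (proc = job->first_process, proc_num = 0; proc; proc = proc->next, proc_num++) {
        cmd = proc->cmd;
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork failed");
            // handle error…
            return -1;
        }
        if (pid == 0) {
            // In child: join the job's process group (pgid is inherited from the parent).
            if (setpgid(0, pgid) < 0) {
                perror("child: setpgid failed");
                _exit(EXIT_FAILURE);  // _exit: don't flush the parent's stdio buffers twice
            }
            exec_command(cmd);
            // Only reached if exec failed.
            perror("execve failed");
            _exit(EXIT_FAILURE);
        }
        // In parent: the first child's PID becomes the job's PGID.
        if (proc_num == 0) {
            pgid = pid;
            job->pgid = pgid;
        }
        proc->pid = pid;
        // Repeat setpgid here so the group is set whichever side runs first.
        // EACCES means the child has already called exec, which is fine.
        if (setpgid(pid, pgid) < 0 && errno != EACCES && errno != EINVAL) {
            perror("parent: setpgid failed");
        }
        // ...
    }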
So I understood that the parent should call setpgid for the first child and every subsequent child immediately after each fork (and each child should call it too), so every process lands in the correct group without extra synchronization.
Using tcsetpgrp to Pass Control of the Terminal
I was confused because the GNU C Library manual suggests calling tcsetpgrp in the child too (maybe I misunderstood it?). After digging around (and asking ChatGPT), I settled on calling tcsetpgrp only in the parent to give the terminal to the child’s process group. As far as I can tell, the manual does it on both sides for the same reason as setpgid, so that it doesn’t matter which process runs first. The catch is that a child calling tcsetpgrp from a background process group is sent SIGTTOU unless it blocks or ignores that signal, so doing it in every child brings its own pitfalls.
Deadlock in Pipes
It took me way too long to realize that closing unused pipe ends is not optional. If any process keeps a write end open after the writing is done, the reading side never sees EOF and hangs forever. Likewise, if the writer keeps its own copy of the read end open, it never gets SIGPIPE after the real reader goes away, so it keeps writing until the pipe buffer fills up and then blocks indefinitely.
I was also surprised that children inherit open file descriptors from the parent. That means you have to close unused pipe ends both in the parent and in each child.
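The EOF rule is easy to see in a tiny experiment. The sketch below is an illustration written for this post (the function name read_until_eof is hypothetical): a child writes a short message into a pipe, and the parent reads until read() returns 0.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes `len` bytes of `msg` into a pipe, then read
 * until EOF in the parent. Returns the bytes read, or -1 on error. */
long read_until_eof(const char *msg, size_t len)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);                 /* the child only writes */
        ssize_t w = write(fds[1], msg, len);
        close(fds[1]);                 /* drops the child's write end */
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    /* Without this close, read() below would never return 0, because
     * the parent itself would still hold an open write end. */
    close(fds[1]);

    long total = 0;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0 || n < 0)
        return -1;
    return total;
}
```

Commenting out the parent's close(fds[1]) is a quick way to reproduce the hang: the loop reads the message and then blocks forever waiting for more data.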
These are the steps I took to implement the pipeline feature (based on the references in the References section):
- First, work out how many pipes are needed. N processes need N - 1 pipes to connect them.
- The processes cannot all share one pipe. Each adjacent pair gets its own, otherwise several processes would race to read and write the same data.
- The pipes should be created in the parent before forking. A pipe created in a child after fork is invisible to its siblings and disappears when that child exits. Processes that communicate over a pipe need a common ancestor that created it, which in our case is the parent.
- Every child inherits the open file descriptors from its parent. Once the pipes exist, their descriptors are open in every process, so the parent and the children MUST close the ones they don’t use.
- In each child, dup2 the relevant pipe ends onto the STDIN and STDOUT file descriptors.
- Then the child closes all the original pipe file descriptors.
Example outline in C:
    /* IN CHILD */
    // FOR STDIN: read from the previous pipe (every process except the first)
    if (proc_num > 0) {
        dup2(pipes[proc_num - 1][0], STDIN_FILENO);
    }
    // FOR STDOUT: write to the next pipe (every process except the last)
    if (proc_num < num_procs - 1) {
        dup2(pipes[proc_num][1], STDOUT_FILENO);
    }
    // CLOSE ALL PIPE ENDS: the dup2'ed copies on STDIN/STDOUT stay open
    for (int i = 0; i < num_procs - 1; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }

    /* IN PARENT, AFTER FINISHING ALL FORKS */
    // CLOSE EVERY PIPE END: the parent uses none of them.
    // (Closing each descriptor exactly once avoids a double close.)
    for (int j = 0; j < num_procs - 1; j++) {
        close(pipes[j][0]);
        close(pipes[j][1]);
    }
This is from The Linux Programming Interface:
Also, it was surprising to me that child processes inherit open pipe ends from the parent, so the parent also needs to close the unused pipe ends.
Why Concurrency Is Key in Pipes
At first, I thought: “Why can’t the parent just wait() for each child in sequence?” I considered two scenarios:
A. When the data fits in the pipe buffer
- If you’re lucky, the pipe never fills, so the child writing to the pipe finishes quickly, then the next child reads, etc. It “works,” but it’s slow: the processes aren’t running concurrently, so performance sucks.
B. When the data is larger than the pipe buffer
- The writer writes until the pipe buffer is full. Since no reader is consuming yet (because the parent is still waiting on the writer before it starts the next child), the writer blocks permanently. Deadlock. Separately, if the reader ever closes the read end while the writer is still writing, the writer gets SIGPIPE and is killed by default.
The limit that matters here is the pipe’s capacity (64 KiB by default on Linux), not PIPE_BUF, which is only the largest write that is guaranteed to be atomic.
So the whole point of pipes is that data flows between processes. As soon as data is available, the next process should be reading it, and the writer should never stay blocked for long.
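Scenario B can be tried out with a small two-stage pipeline. This is a sketch written for this post, not code from my shell, and the names writer, reader, and run_concurrent_pipeline are hypothetical. The key point is the ordering: fork every stage first, close the parent's pipe ends, and only then wait.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Writer child: push `total` bytes into `fd` in 4 KiB chunks, then exit. */
static void writer(int fd, size_t total)
{
    char chunk[4096];
    memset(chunk, 'x', sizeof chunk);
    while (total > 0) {
        size_t want = total < sizeof chunk ? total : sizeof chunk;
        ssize_t w = write(fd, chunk, want);
        if (w <= 0)
            _exit(1);
        total -= (size_t)w;
    }
    _exit(0);
}

/* Reader child: drain `fd` until EOF, check the byte count, then exit. */
static void reader(int fd, size_t expected)
{
    char buf[4096];
    size_t seen = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        seen += (size_t)n;
    _exit(n == 0 && seen == expected ? 0 : 1);
}

/* Returns 0 if both children exit successfully, -1 otherwise. */
int run_concurrent_pipeline(size_t total)
{
    int fds[2];
    pid_t pids[2];
    int ok = 1;

    if (pipe(fds) < 0)
        return -1;

    if ((pids[0] = fork()) == 0) {       /* stage 1: writer */
        close(fds[0]);
        writer(fds[1], total);
    }
    if ((pids[1] = fork()) == 0) {       /* stage 2: reader */
        close(fds[1]);
        reader(fds[0], total);
    }

    /* The parent must hold neither end, or the reader never sees EOF. */
    close(fds[0]);
    close(fds[1]);

    /* Only now wait: both stages are already running side by side. */
    for (int i = 0; i < 2; i++) {
        int status;
        if (pids[i] < 0 || waitpid(pids[i], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
    }
    return ok ? 0 : -1;
}
```

Calling run_concurrent_pipeline(1 << 20) moves 1 MiB through the pipe, far more than the buffer holds. It finishes because the reader is draining while the writer is still writing. Waiting on the writer before forking the reader would leave the writer stuck on a full pipe.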
Signal Handling
In my opinion, signal handling is the hardest part of building a shell, especially if you’re new to how signals work.
- Blocking signals in the parent while setting up the process group: When you fork children, you don’t want terminal-generated signals (like SIGINT from Ctrl+C) to hit the shell (parent) before it gives control of the terminal to the child’s process group. So you must block signals until each child’s PGID is set and terminal control is passed.
- SIGCHLD: You need to block SIGCHLD in the parent until you finish setting each child’s process group. Imagine a child runs and exits so fast that the parent hasn’t yet called setpgid(child_pid, child_pid). If a SIGCHLD handler reaps children, the finished child gets reaped first, and the shell then tries to call setpgid(pid, pid). That call fails with ESRCH (no such process) because the child no longer exists. So block SIGCHLD until after setpgid.
Still Not Working
- Background job notifications sometimes fail
  Occasionally, background jobs finish but don’t send a notification. The shell only updates job status when a prompt is printed or when you run jobs. Yet sometimes jobs finish and never show up in jobs. I still haven’t tracked down the exact cause. (Any tips here are greatly appreciated.)
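One thing worth checking (this is a guess, not a confirmed diagnosis): several children exiting close together can produce a single SIGCHLD, so code that calls waitpid once per signal can miss some of them. A minimal sketch of a reaping loop follows. The names reap_finished_children and demo_reap are hypothetical, and the job-table update is only described in a comment.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Reap every child that has already exited, without blocking.
 * Returns how many children were reaped in this call. */
int reap_finished_children(void)
{
    int reaped = 0;
    int status;
    pid_t pid;

    /* Several exits can collapse into one SIGCHLD, so keep calling
     * waitpid() until it reports that nothing else is ready. */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        /* A real shell would look up `pid` in its job table here and
         * mark the job as done (hypothetical step, not shown). */
        reaped++;
    }
    return reaped;
}

/* Demo helper: start `n` short-lived children and poll until all are reaped.
 * Returns the number of children reaped. */
int demo_reap(int n)
{
    int started = 0;
    int total = 0;
    struct timespec pause_10ms = { 0, 10 * 1000 * 1000 };

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);                  /* child: finish immediately */
        if (pid > 0)
            started++;
    }
    while (total < started) {
        total += reap_finished_children();
        nanosleep(&pause_10ms, NULL);  /* give the children time to exit */
    }
    return total;
}
```

The same loop works whether it runs inside a SIGCHLD handler or right before the prompt is printed. Adding WUNTRACED to the flags would also report stopped jobs, which this sketch leaves out.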
What I’d Like to Add Next
These are features I plan to implement:
- Command history (so users can press ↑/↓ to cycle through past commands)
- Tab completion
- Heredoc support (e.g., cat <<EOF … EOF)
I’m also working on unit and integration tests. Additionally, I’d like to research optimized data structures for job control instead of my current linked lists + hash tables.
What I Learned
Building a mini-shell feels like a “basic” exercise, but for me it was a real challenge, especially since I was new to system calls and still learning C (unfortunately, I’m still nowhere near an expert (─ ‿ ─) ).
- Tokenization is much harder than I thought. There are so many edge cases: quoted strings, escaped characters, variable expansion. Halfway through, you realize there are dozens of cases you didn’t consider initially.
- Signals are monsters. Getting signal blocking/unblocking, SIGCHLD, and SIGINT/SIGTSTP behavior right is a nightmare to debug without experience.
- Debugging in C without GDB and Valgrind is a total pain. Whenever I tried to “printf” my way through pointers and memory errors, I wasted hours.
- I didn’t know built-in commands shouldn’t create job entries. I only discovered that while implementing jobs, and yes, I used ChatGPT to figure it out. (I know I shouldn’t rely on AI, but I was completely stuck.)
Closing
If you’ve written a shell or worked with system-level C, I’d really appreciate your feedback on where my understanding is flawed or where I could’ve implemented things differently. Feel free to suggest resources or projects I should tackle next because I’m still learning ( ◡̀_◡́)ᕤ, and every bit of advice helps.
(I update this post when I implement something new.)
References
I used a lot of resources (blogs, Reddit posts, and Stack Overflow threads, some of which I lost track of), but these were my main references:
- The Linux Programming Interface by Michael Kerrisk
- Advanced Programming in the UNIX Environment (2nd ed.) by W. Richard Stevens & Stephen A. Rago
- Operating Systems: Three Easy Pieces by Remzi H. Arpaci-Dusseau & Andrea C. Arpaci-Dusseau
- The GNU C Library Reference Manual
Some Reddit/Stack Overflow threads I saved: