1.0 Introduction: Beyond Manual Filenaming
For most developers, the command line is a second home, and the asterisk (*) is a familiar tool for wrangling files. But to see * as the beginning and end of shell pattern matching is like knowing only one chord on a guitar. The world of shell pattern matching, formally known as "globbing," is far richer, more nuanced, and dramatically more powerful. Mastering its patterns is a crucial step in leveling up your command-line efficiency, transforming tedious, multi-step file operations into concise, single-line commands. The term "globbing" itself is a relic from early Unix, where an external /etc/glob program (short for "global") expanded wildcards on the shell's behalf. The word "wildcard," for its part, comes from card games, where a wild card can stand in for any other card.
It is critical, however, to draw a clear distinction at the outset. This article is about shell globbing, a process handled by the shell (like Bash or Zsh) to match and expand filenames based on wildcard patterns. This is fundamentally different from regular expressions (regex), which are used by utilities like grep or sed to match patterns inside the text content of files. While they share some symbols, their purpose and behavior are entirely distinct. This article focuses exclusively on globbing.
We will begin with the foundational wildcards supported by nearly every shell, explore the subtle but important mechanism of brace expansion, and then unlock the advanced logic of extended and recursive globbing. Finally, we will cover critical best practices to ensure you wield these powerful tools safely and effectively. Let's move beyond the basics and unlock true command-line fluency.
2.0 The Foundations: Standard POSIX Wildcards
Standard wildcards are the universal language of file matching, supported by virtually every Unix-like shell, including Bash. They form the bedrock of command-line interaction, providing a simple yet effective syntax for selecting groups of files. Understanding these three core operators is the first and most important step.
The Asterisk (*): Matching Zero or More Characters. The asterisk is the most widely used wildcard. It matches any sequence of characters, including no characters at all. This makes it incredibly versatile for matching files based on prefixes, suffixes, or substrings.
Example: Find all log files
# Lists all files ending with the .log extension
ls *.log
Example: Remove text files whose names start with 'a'
# Removes all files that start with 'a' and end with '.txt'
rm a*.txt
The Question Mark (?): Matching Exactly One Character. Where the asterisk is flexible, the question mark is precise. It matches exactly one of any character. This is invaluable when you need to select files with fixed-length variations in their names.
Example: Match single-digit report files
# This will match part_1.log, part_2.log, etc.
# but it will NOT match part_10.log because '10' is two characters.
ls part_?.log
Example: Match files with a specific naming pattern
# This would match both 'list.sh' and 'lost.sh', but not 'least.sh'
ls l?st.sh
Square Brackets ([...]): The Character Class. The character class, or bracket expression, matches a single character from a specifically defined set. This allows for more granular control than the question mark by specifying which characters are valid for that single position.
Specific Characters: You can list individual characters to match.
Character Ranges: A hyphen can define a range of characters, such as numbers or letters. (See Section 7.0 for an important warning about locale-related traps with letter ranges).
Negation: Placing a ! (the POSIX standard) or ^ (common in Bash) as the first character inside the brackets inverts the match, selecting any single character except those in the set. For portability, ! is preferred.
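The three bracket forms above can be tried out in a scratch directory. This is a minimal sketch: the file names and the demo_dir variable are invented for illustration, and everything happens inside a temporary directory so no real files are touched.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch report1.txt report2.txt report9.txt reportA.txt reportb.txt

# Specific characters: the position must be 1 or 2.
listed=$(echo report[12].txt)

# Range: the position must be a digit from 0 to 9.
ranged=$(echo report[0-9].txt)

# Negation: the position must be anything except a digit.
negated=$(echo report[!0-9].txt)

echo "$listed"    # report1.txt report2.txt
echo "$ranged"    # report1.txt report2.txt report9.txt
echo "$negated"   # reportA.txt reportb.txt

# Clean up the scratch directory.
cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```

Each bracket expression still consumes exactly one character, so none of these patterns would match a name like report10.txt.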
These foundational wildcards are powerful but struggle with OR-style logic. How would you list all JPEG and PNG files in one command? While you could run two separate commands, this is inefficient. This limitation leads us to mechanisms for generating multiple patterns, starting with brace expansion.
3.0 String Generation vs. File Matching: Understanding Brace Expansion
A common source of confusion for developers is the difference between brace expansion ({...}) and globbing (*, ?, []). It's a critical distinction: brace expansion generates strings, while globbing matches existing files. The shell performs brace expansion first, creating a list of strings before any other interpretation, and it does so without checking if corresponding files actually exist. Understanding this order of operations—generation before matching—is one of the key distinctions between a novice and an expert shell user.
Let's look at a clear, comparative example to illustrate this sequence.
Generation with Brace Expansion:
# Creates file1.txt, file2.txt and file3.txt
touch file{1,2,3}.txt
Before the touch command is executed, the shell sees file{1,2,3}.txt and expands it into three separate arguments: touch file1.txt file2.txt file3.txt. The touch command then runs with this expanded list, creating three new files.
Matching with Globbing:
# Lists the existing files that match the pattern
ls file*.txt
Here, the shell sees the * globbing character. It scans the current directory for any files that match the pattern file*.txt. It finds the three files we just created and passes that list (file1.txt file2.txt file3.txt) to the ls command.
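The "generation versus matching" distinction is easiest to see in an empty directory. The following is a small sketch for Bash (brace expansion is not part of POSIX sh); the draft names and the demo_dir variable are hypothetical.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# The directory is empty, yet brace expansion still yields three names.
generated=$(echo draft{1,2,3}.txt)

# A glob with nothing to match is passed through unchanged (Bash default).
unmatched=$(echo draft*.txt)

touch draft2.txt

# Now the glob reports only the file that actually exists.
matched=$(echo draft*.txt)

echo "$generated"   # draft1.txt draft2.txt draft3.txt
echo "$unmatched"   # draft*.txt
echo "$matched"     # draft2.txt

cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```

Brace expansion never consults the filesystem, while the glob's result depends entirely on what is on disk at the moment the command runs.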
Because brace expansion generates a list of patterns, it can be used to create an OR-like effect when combined with other wildcards.
# List files containing either the substring 'alpha' or 'sigma'
ls *{alpha,sigma}*
In this case, the shell first expands *{alpha,sigma}* into two separate patterns: *alpha* and *sigma*. It then performs globbing on each pattern independently, effectively listing files that match either condition. Note that a file whose name matches both patterns is listed twice. While this is a useful technique for generating multiple patterns, shells like Bash offer a more powerful, integrated logic for matching complex patterns directly.
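The same technique answers the earlier question about listing JPEG and PNG files in one command. This is a sketch for Bash with made-up file names; it also shows one consequence of each alternative being globbed separately.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch cat.jpg dog.png notes.txt

# Brace expansion rewrites *.{jpg,png} as two globs: *.jpg *.png
images=$(echo *.{jpg,png})

# Each glob is matched on its own, so an alternative with no match
# is left behind as a literal pattern (Bash default behaviour).
with_gap=$(echo *.{jpg,gif})

echo "$images"     # cat.jpg dog.png
echo "$with_gap"   # cat.jpg *.gif

cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```

The leftover *.gif is why a command such as ls may print a "No such file" error for one alternative while still listing the files matched by the other.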
4.0 Unleashing Advanced Logic: Extended Globbing (extglob)
For more complex pattern matching that approaches the logic of regular expressions, Bash provides a feature called "extended globbing" or extglob. This powerful option is disabled by default but can be easily enabled to unlock a new set of operators for sophisticated file selection.
To enable extended globbing for your current shell session, use the shopt (shell options) command:
shopt -s extglob
Once enabled, you can use several new pattern operators. To disable it, you can run:
shopt -u extglob
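The five primary extglob operators provide powerful logical constructs for pattern matching. In each of them, pattern-list is one or more patterns separated by |:
?(pattern-list): matches zero or one occurrence of the given patterns.
*(pattern-list): matches zero or more occurrences of the given patterns.
+(pattern-list): matches one or more occurrences of the given patterns.
@(pattern-list): matches exactly one of the given patterns.
!(pattern-list): matches anything except one of the given patterns.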
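To get a feel for the extglob operators without risking real files, they can be exercised with echo in a scratch directory. This is a minimal sketch for Bash; the file names are hypothetical, and only four of the operators are shown.

```shell
#!/usr/bin/env bash
# extglob must be on before the shell parses any line that uses it.
shopt -s extglob

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch app.py README.md build.log photo.jpg photo.jpeg v1.txt v22.txt vx.txt

only_images=$(echo *.@(jpg|jpeg))    # exactly one of the alternatives
optional_e=$(echo photo.jp?(e)g)     # zero or one 'e'
versioned=$(echo v+([0-9]).txt)      # one or more digits after 'v'
keep_source=$(echo !(*.py|*.md))     # everything except .py and .md

echo "$only_images"   # photo.jpeg photo.jpg
echo "$optional_e"    # photo.jpeg photo.jpg
echo "$versioned"     # v1.txt v22.txt
echo "$keep_source"   # build.log photo.jpeg photo.jpg v1.txt v22.txt vx.txt

cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```

Because the option affects how Bash parses the pattern, shopt -s extglob needs to sit on its own earlier line rather than on the same line as the first extended pattern.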
The negation operator, !(...), is particularly useful for system administration and cleanup tasks. Imagine you want to clean a project directory but preserve your source code and documentation. Extended globbing makes this a trivial one-liner:
# Make sure extglob is enabled first!
# shopt -s extglob
# First, preview what will be deleted
echo !(*.py|*.md)
# If the list is correct, run the rm command
rm !(*.py|*.md)
Extended globbing gives us regex-like power over the files in a single directory. But modern development is rarely confined to one folder. The next logical step is to project that power recursively through an entire project tree, a task that traditionally required leaving the shell's native syntax for the find command.
5.0 Traversing Directories: Recursive Globbing with globstar
Historically, finding files across a complex directory tree was the exclusive domain of the find command. While powerful, find has a distinct syntax that can feel cumbersome for simple recursive searches. The modern, shell-native solution is the ** pattern, widely known as globstar.
The journey of ** is a great example of shell evolution. The Z shell (Zsh) pioneered recursive globbing in the early 1990s. Independently, KornShell (ksh93) developed its own version around 2003 and coined the term "globstar." Finally, the widely used Bash shell adopted the feature in version 4.0 (released in 2009), borrowing the name from ksh.
To use it in Bash, you must first enable it with shopt:
shopt -s globstar
Once enabled, ** acts as a wildcard for "zero or more directories." It allows you to search the current directory and all subdirectories, no matter how deeply nested, with a single, elegant pattern.
Example: Find all text files recursively
# Lists every file with a .txt extension in the current directory and all subdirectories.
ls **/*.txt
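The effect of the option is clearest when the same pattern is expanded with globstar off and then on. The sketch below assumes Bash 4.0 or later and uses an invented directory tree.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir -p docs/guides/deep
touch top.txt docs/intro.txt docs/guides/deep/setup.txt docs/image.png

# Without globstar, ** behaves like a single *: one directory level only.
shopt -u globstar
shallow=$(printf '%s\n' **/*.txt)

# With globstar, ** matches zero or more nested directories.
shopt -s globstar
deep=$(printf '%s\n' **/*.txt)

echo "$shallow"   # docs/intro.txt
echo "---"
echo "$deep"      # three paths, including top.txt and the deeply nested file

cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```

Note that top.txt only appears in the second result: "zero or more directories" includes zero, so files in the current directory are matched as well.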
Example: Combining recursive globbing with glob qualifiers in Zsh
This Zsh command demonstrates how ** can be combined with other advanced features, like glob qualifiers (covered next), to perform incredibly powerful operations.
# Recursively deletes all files greater than 1 Gigabyte.
# This is a Zsh-specific syntax.
rm **/*(LG+1)
As an expert aside, it's worth noting that early versions of Bash's globstar would follow symbolic links by default, which could lead to infinite loops in certain directory structures. This behavior has been refined in newer versions to be safer and more predictable, generally not following symlinks unless explicitly targeted.
With globstar, the shell provides a clean and powerful alternative to find for many common tasks. But for the ultimate in file filtering, some modern shells go even further.
6.0 Pro-Level Filtering: A Glimpse at Zsh Glob Qualifiers
While Bash's extglob and globstar features are immensely powerful, the Z shell (Zsh) takes pattern matching to an entirely new level with glob qualifiers. These are special modifiers, appended to a glob pattern in parentheses, that allow you to filter files based on their metadata—such as type, size, modification time, and permissions—not just their names. This transforms globbing from a name-matching tool into a full-fledged file query language.
Here is a look at just a few of the powerful filtering capabilities Zsh's glob qualifiers provide:
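Filter by file type: for example, *(/) matches only directories, *(.) only regular files, and *(@) only symbolic links.
Filter by file size: for example, *(Lm+10) matches files larger than 10 megabytes, and *(Lk-50) matches files smaller than 50 kilobytes.
Filter by modification time: for example, *(m-7) matches files modified within the last seven days, and *(mh-1) matches files modified within the last hour.
Sort results and limit the output: for example, *(om[1,5]) selects the five most recently modified entries, and *(OL[1,3]) selects the three largest.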
This isn't just about Zsh; it's about a mindset. The shell isn't just a tool to run commands—it's a programmable environment. Adopting more powerful tools like Zsh can fundamentally change your relationship with the command line. Now, let's return to the practices that apply to all shells.
7.0 Critical Nuances and Best Practices for Safe Globbing
The power of globbing brings with it the risk of unintended and often destructive consequences. A single misplaced asterisk in an rm command can wipe out a project. Therefore, mastering wildcards requires not just knowing the syntax, but also adhering to a strict set of practices to ensure you are using them safely and effectively.
1. Always Preview Destructive Commands
This is the golden rule of safe globbing. Before you run a command that modifies or deletes files (rm, mv, cp), first use a harmless command like echo or ls -d with the exact same pattern. This allows you to see a list of what the glob will expand to, ensuring it matches only the files you intend to target.
# Step 1: Preview which files will be matched
echo *.tmp
# Step 2: If the list is correct, use the up-arrow to recall the command
# and replace 'echo' with 'rm'.
rm *.tmp
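One refinement worth considering: echo prints all matches on a single space-separated line, which becomes ambiguous as soon as a file name contains a space. The sketch below previews with printf instead; the file names are made up for illustration.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch cache.tmp "old report.tmp" keep.txt

# echo joins matches with spaces, so "old report.tmp" looks like two files.
# printf prints one match per line, which makes the preview unambiguous.
preview=$(printf '%s\n' *.tmp)
echo "$preview"

# Only after checking the preview, run the destructive command.
# The -- stops rm from treating a name that starts with '-' as an option.
rm -- *.tmp
remaining=$(echo *)
echo "$remaining"   # keep.txt

cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```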
2. Quoting is Not Optional
A common and subtle bug occurs when using wildcards as arguments to commands like find or grep. If the wildcard is not quoted, the shell will try to expand it before the command ever sees it. This leads to incorrect behavior or errors.
# BAD: The shell expands *go* to matching filenames in the current directory
# before find runs. This is not what you want.
find . -name *go*
# GOOD: The quotes protect the pattern, passing the literal string '*go*'
# to the find command, which then uses it for its own matching logic.
find . -name '*go*'
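What the command actually receives can be inspected by loading the same words into the positional parameters with set --. This is a rough sketch with hypothetical file names; it counts arguments rather than running the broken find invocation.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir sub
touch gopher.txt logo.png sub/mango.md

# Unquoted: the shell replaces *go* with the two matching names here,
# so the command would receive "-name gopher.txt logo.png".
set -- -name *go*
unquoted_count=$#

# Quoted: the pattern survives as a single literal argument.
set -- -name '*go*'
quoted_count=$#
quoted_arg=$2

# find applies the quoted pattern in every directory it visits.
found=$(find . -name '*go*' | sort)

echo "unquoted: $unquoted_count args, quoted: $quoted_count args"
echo "$found"

cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```

The unquoted form is especially treacherous because it appears to work whenever the current directory happens to contain no matching names, since the pattern is then passed through literally.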
3. Handling Hidden "Dotfiles"
By default, standard glob patterns like * ignore files and directories that start with a period (.), such as .bashrc or .git. To include these "dotfiles" in your glob expansion, you can enable the dotglob shell option.
# Enable matching of dotfiles
shopt -s dotglob
A more refined alternative is to set the GLOBIGNORE variable. A key piece of expert knowledge is that setting GLOBIGNORE to any non-null value automatically enables the dotglob shell option. For example, GLOBIGNORE=".:.." enables dotfile matching but ensures that the special directories . and .. are always excluded, which is generally safer.
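Both approaches can be compared side by side in a scratch directory. This is a minimal Bash sketch with invented file names.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch .env .gitignore main.c

# Default: names starting with a dot are skipped.
shopt -u dotglob
default_match=$(echo *)

# dotglob on: dotfiles are included; . and .. are still not matched by *.
shopt -s dotglob
dotglob_match=$(echo *)
shopt -u dotglob

# A non-null GLOBIGNORE has the same effect on dotfiles.
GLOBIGNORE='.:..'
globignore_match=$(echo *)
unset GLOBIGNORE

echo "$default_match"      # main.c
echo "$dotglob_match"      # .env .gitignore main.c
echo "$globignore_match"   # .env .gitignore main.c

cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```

Keep in mind that unsetting GLOBIGNORE also turns dotglob back off, so the two settings are not fully independent.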
4. The nullglob and failglob Options
By default, if a glob pattern matches no files, Bash passes the literal pattern string as an argument to the command. The command (e.g., ls) then fails, complaining that it can't find a file with the literal name *.nonexistent. This can cause scripts to behave unexpectedly. Two shell options provide better control:
shopt -s nullglob
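If a pattern matches nothing, it is removed from the command line entirely rather than being passed along as a literal string. Pro-Tip: Using nullglob is a best practice inside shell script loops (for f in *.zip) to prevent the loop from running a single time with the literal string *.zip if no files are found. Be careful with commands that act on the current directory when given no arguments: with nullglob set, ls *.zip in a directory without zip files becomes plain ls.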
shopt -s failglob
If a pattern matches nothing, the shell reports an error and the command does not execute. This is useful for script validation, because a failed match surfaces immediately instead of being passed along silently as a literal string.
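The loop problem described for nullglob can be reproduced in an empty directory. The sketch below is for Bash and covers nullglob only; the counter variable names are hypothetical.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1   # empty directory: *.zip matches nothing

# Default: the loop body runs once with the literal pattern.
shopt -u nullglob
default_runs=0
for f in *.zip; do
  default_runs=$((default_runs + 1))
done

# nullglob: the unmatched pattern vanishes, so the body never runs.
shopt -s nullglob
nullglob_runs=0
for f in *.zip; do
  nullglob_runs=$((nullglob_runs + 1))
done
shopt -u nullglob

echo "default: $default_runs, nullglob: $nullglob_runs"   # default: 1, nullglob: 0

cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```

Turning the option off again after the loop keeps its effect local, which avoids surprising later commands that rely on the default behavior.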
5. The Locale "Collation" Trap
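A dangerous and often overlooked issue is that character ranges can behave unpredictably depending on system language settings (locale). In many modern locales, the character sorting order (collation) is not strictly alphabetical but rather dictionary-style (e.g., a, A, b, B, ...). This means a range like [a-z] might unexpectedly match uppercase letters. For portable and predictable scripts, prefer POSIX character classes over letter ranges. A class name is written inside a bracket expression, hence the double brackets. Commonly used classes include:
[[:lower:]]: lowercase letters
[[:upper:]]: uppercase letters
[[:alpha:]]: letters of either case
[[:digit:]]: decimal digits
[[:alnum:]]: letters and digits
[[:space:]]: whitespace characters
Using these classes ensures your patterns behave consistently across different systems and environments.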
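A quick way to see POSIX character classes at work is to select files by the kind of character their names start with. This is a small sketch with made-up file names; it deliberately avoids [a-z] and [A-Z], whose results depend on the locale.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch alpha.txt Beta.txt gamma.txt 9lives.txt

# Each class is written inside a bracket expression: [[:class:]]
lower_first=$(echo [[:lower:]]*)
upper_first=$(echo [[:upper:]]*)
digit_first=$(echo [[:digit:]]*)

echo "$lower_first"   # alpha.txt gamma.txt
echo "$upper_first"   # Beta.txt
echo "$digit_first"   # 9lives.txt

cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```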
8.0 Summary: Wildcards vs. Regular Expressions
Throughout this guide, we've emphasized the distinction between globbing and regular expressions. It is a fundamental concept that, once understood, clarifies the role of pattern matching across the entire Linux command-line ecosystem. They are different tools for different jobs.
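The comparison below summarizes their core differences:
Aspect | Globbing | Regular expressions
Primary use | Matching filenames | Matching text content
Interpreted by | The shell, before the command runs | The utility itself (grep, sed, and similar tools)
* | Any string of characters, including none¹ | Zero or more of the preceding element
? | Exactly one character¹ | Zero or one of the preceding element (extended regex)
. | A literal dot | Any single character
[abc] | One character from the set | One character from the set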
¹By default, a . at the start of a filename is not matched by * or ? unless the dotglob shell option is enabled.
The simplest way to remember the difference is with this rule of thumb: Use globbing to find your files, use regex to search inside your files.
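The difference in how the two systems read the same symbols can be checked directly. The sketch below defines two hypothetical helpers, glob_match and regex_match, which are not standard commands: the first uses a case statement (which applies glob rules), the second uses grep -E anchored to the whole string.

```shell
# Hypothetical helper: does a name match a glob pattern?
glob_match() {
  # $2 is left unquoted on purpose so that case treats it as a pattern.
  case $1 in
    $2) echo yes ;;
    *)  echo no ;;
  esac
}

# Hypothetical helper: does a name match an extended regex (whole string)?
regex_match() {
  if printf '%s\n' "$1" | grep -Eq "^($2)\$"; then echo yes; else echo no; fi
}

# Same pattern text, different meaning:
#   glob  ab*  = "ab" followed by anything
#   regex ab*  = "a" followed by zero or more "b"
glob_abc=$(glob_match abc 'ab*')     # yes
regex_abc=$(regex_match abc 'ab*')   # no
glob_a=$(glob_match a 'ab*')         # no
regex_a=$(regex_match a 'ab*')       # yes

echo "abc: glob=$glob_abc regex=$regex_abc"
echo "a:   glob=$glob_a regex=$regex_a"
```

The pattern text is identical in all four calls; only the interpreter changes, which is exactly why an unquoted regex on the command line is at risk of being rewritten by the shell first.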
9.0 Conclusion
We have journeyed from the simple asterisk to the sophisticated, metadata-aware queries of Zsh. You now have a comprehensive understanding of shell globbing, from the universal POSIX wildcards (*, ?, []) to the powerful logic of extended globbing (extglob) and the directory-traversing convenience of recursive globbing (globstar). Most importantly, you are equipped with the critical best practices needed to use these tools safely, preventing catastrophic errors while maximizing your efficiency.
The command line is an environment that rewards expertise. By moving beyond a superficial use of * and deliberately integrating these more advanced patterns into your daily work, you will write cleaner scripts, perform complex file operations with ease, and ultimately become a more proficient and productive developer. The power is there for the taking; it's time to start using it.