I use grep every day but had no idea how it works. So I built a basic version in Go.
Not to replace grep. Just to stop being the guy who pipes to grep without understanding what's actually happening.
What I Built
A stripped-down grep that does pattern matching with these flags:
- -i: Case-insensitive
- -n: Line numbers
- -c: Count matches
- -v: Invert (show non-matches)
- -r: Recursive search
- -l: Just list filenames
Plus binary file detection and color output. That's it. No context lines, no fancy regex modes, no performance optimizations.
The Interesting Parts
Binary Files Don't Print Garbage
When grep finds a match in a binary file (executable, image, whatever), it prints a short "binary file matches" notice instead of filling your terminal with nonsense.
How does it know?
func IsBinary(filePath string) (bool, error) {
	f, err := os.Open(filePath)
	if err != nil {
		return false, err
	}
	defer f.Close()
	buffer := make([]byte, 1024)
	n, err := f.Read(buffer)
	if err != nil && err != io.EOF {
		return false, err
	}
	if bytes.IndexByte(buffer[:n], 0) != -1 {
		return true, nil
	}
	return false, nil
}
Read the first 1KB. If it contains a null byte (\0), treat the file as binary. ASCII and UTF-8 text files normally don't have null bytes.
Simple check. Works.
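To see where the heuristic holds and where it slips, here is a minimal sketch that runs the same null-byte check on in-memory samples. hasNullByte is a hypothetical helper written for this example, not a function from the repo. The last sample shows a known blind spot: UTF-16 text is full of null bytes, so it gets classified as binary.

```go
package main

import (
	"bytes"
	"fmt"
)

// hasNullByte applies the same heuristic as IsBinary to an in-memory
// sample: look at no more than the first 1024 bytes and search for 0x00.
// (Hypothetical helper for illustration; not part of the original code.)
func hasNullByte(sample []byte) bool {
	if len(sample) > 1024 {
		sample = sample[:1024]
	}
	return bytes.IndexByte(sample, 0) != -1
}

func main() {
	utf8Text := []byte("plain UTF-8 text\n")
	elfHeader := []byte{0x7f, 'E', 'L', 'F', 0x02, 0x01, 0x01, 0x00}
	// "hi" encoded as UTF-16LE: each ASCII character is followed by 0x00.
	utf16Text := []byte{'h', 0x00, 'i', 0x00}

	fmt.Println(hasNullByte(utf8Text))  // false
	fmt.Println(hasNullByte(elfHeader)) // true
	fmt.Println(hasNullByte(utf16Text)) // true, although it is text
}
```

For a learning project that trade-off seems fine: a false "binary" verdict on a UTF-16 file is annoying, but it never dumps garbage into the terminal.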
Recursive Search Without Exploding
The -r flag searches directories. This means reading entries, checking if they're files or directories, and recursing when needed.
The trick: Don't stop on errors. One locked file shouldn't kill your entire search.
if info.IsDir() {
	if !opts.Recursive {
		return 0, fmt.Errorf("%s is a directory", fileName)
	}
	entries, err := os.ReadDir(fileName)
	if err != nil {
		return 0, fmt.Errorf("error reading directory: %v", err)
	}
	total := 0
	for _, entry := range entries {
		path := filepath.Join(fileName, entry.Name())
		subCount, err := grepFile(pattern, path, opts)
		if err != nil {
			fmt.Fprintf(os.Stderr, "warning: %v\n", err)
			continue // Keep going
		}
		total += subCount
	}
	return total, nil
}
Log it, move on. Real grep does this. So should you.
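The same "log it, keep going" idea can also be written with the standard library's filepath.WalkDir instead of hand-rolled recursion. This is a different technique from the snippet above, shown only as a sketch; collectFiles is a made-up name, and it just gathers paths rather than searching them.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// collectFiles walks root and returns every regular file it can reach.
// Any entry that cannot be read is reported on stderr and skipped, so a
// single unreadable directory does not abort the walk.
// (Hypothetical helper for illustration; not part of the original code.)
func collectFiles(root string) []string {
	var files []string
	_ = filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			// Returning nil tells WalkDir to carry on with the next entry.
			fmt.Fprintf(os.Stderr, "warning: %v\n", err)
			return nil
		}
		if d.Type().IsRegular() {
			files = append(files, path)
		}
		return nil
	})
	return files
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	for _, path := range collectFiles(root) {
		fmt.Println(path)
	}
}
```

The callback receives the error for each failing entry, which makes the tolerance explicit: returning nil continues, returning the error would stop the whole walk.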
Flags Interact in Weird Ways
Case-insensitive search: Modify the pattern before compiling the regex.
if opts.CaseInsensitive {
	pattern = "(?i)" + pattern
}
re, err := regexp.Compile(pattern)
List files only: Stop after the first match.
if matched {
	count++
	if opts.ListFilesOnly {
		fmt.Printf("%s:\n", fileName)
		return count, nil // Done
	}
}
Invert match: Flip the boolean.
matched := re.MatchString(line)
if opts.Invert {
	matched = !matched
}
Getting these combinations right took a few tries. -r -l should recurse and list filenames. -v -i should invert case-insensitive matches. Test all the combinations.
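One way to test the combinations without eyeballing terminal output is a small table of cases. The sketch below is an assumption about how that could look: the options struct and lineMatches are invented for the example and only cover -i and -v, with field names borrowed from the snippets above.

```go
package main

import (
	"fmt"
	"regexp"
)

// options mirrors only the two flags exercised here.
// (Hypothetical type; the real one in the repo has more fields.)
type options struct {
	CaseInsensitive bool
	Invert          bool
}

// lineMatches reports whether a line would be printed for the given flags.
func lineMatches(pattern, line string, opts options) (bool, error) {
	if opts.CaseInsensitive {
		pattern = "(?i)" + pattern
	}
	// Compiled per call only to keep the sketch short.
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	matched := re.MatchString(line)
	if opts.Invert {
		matched = !matched
	}
	return matched, nil
}

func main() {
	line := "Error: disk full"
	cases := []struct {
		name string
		opts options
		want bool
	}{
		{"plain", options{}, false},
		{"-i", options{CaseInsensitive: true}, true},
		{"-v", options{Invert: true}, true},
		{"-v -i", options{CaseInsensitive: true, Invert: true}, false},
	}
	for _, c := range cases {
		got, err := lineMatches("error", line, c.opts)
		if err != nil {
			fmt.Println("bad pattern:", err)
			return
		}
		fmt.Printf("%-6s got=%v want=%v\n", c.name, got, c.want)
	}
}
```

Adding a flag then means adding rows to the table, which is much cheaper than re-running the binary by hand for every pairing.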
Color Output
Grep highlights matches in red. To do this:
- Find all regex matches in the line
- Replace each match with a colored version
- Print the result
matchColor := color.New(color.FgRed).SprintFunc()
coloredLine := re.ReplaceAllStringFunc(line, func(m string) string {
	return matchColor(m)
})
ReplaceAllStringFunc walks through matches and applies your function. Easy.
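If you'd rather not pull in a color package, the same three steps work with raw ANSI escape codes. This is a sketch of that alternative, not what the repo does; highlight and the two constants are names made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

const (
	ansiRed   = "\x1b[31m" // switch foreground to red
	ansiReset = "\x1b[0m"  // restore default attributes
)

// highlight wraps every match of pattern in line with red escape codes.
// (Hypothetical helper for illustration; not part of the original code.)
func highlight(pattern, line string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", err
	}
	colored := re.ReplaceAllStringFunc(line, func(m string) string {
		return ansiRed + m + ansiReset
	})
	return colored, nil
}

func main() {
	out, err := highlight(`(?i)todo`, "TODO: fix this, todo: and that")
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(out)
}
```

One thing this sketch skips: the escape codes are written unconditionally, so they will show up as junk if the output is piped into a file or another program.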
What I Actually Learned
Null bytes are the binary file signal. One check, problem solved.
Compile regex once, use it everywhere. Compiling per-line is stupid slow. Compile once at the start.
Scanner vs ReadFile depends on the file type. Text files get bufio.Scanner for line-by-line reading. Binary files get ReadFile to check the whole thing. Different tools for different jobs.
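Here is a rough sketch of the text-file path that combines the last two points: compile once, then stream lines through bufio.Scanner. The function names are invented for the example. One detail worth knowing is that Scanner gives up on lines longer than 64KB by default, so the sketch raises that cap.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"regexp"
	"strings"
)

// countMatchingLines streams r line by line and counts lines matching re.
// The regex is compiled by the caller, once, and reused for every line.
// (Hypothetical helper for illustration; not part of the original code.)
func countMatchingLines(re *regexp.Regexp, r io.Reader) (int, error) {
	scanner := bufio.NewScanner(r)
	// Default maximum line length is 64 KiB; allow lines up to 1 MiB.
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	count := 0
	for scanner.Scan() {
		if re.Match(scanner.Bytes()) {
			count++
		}
	}
	// Err reports read failures and too-long lines, but not plain EOF.
	return count, scanner.Err()
}

// countInString is a convenience wrapper for in-memory text.
func countInString(pattern, text string) (int, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return 0, err
	}
	return countMatchingLines(re, strings.NewReader(text))
}

func main() {
	n, err := countInString(`func`, "package main\nfunc a() {}\nfunc b() {}\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n) // 2
}
```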
Recursive operations need error tolerance. One bad file can't crash everything. Log it, continue.
Manual flag parsing sucks. Next time I'm using a library. Checking string prefixes gets old fast.
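For reference, this is roughly what the same flags could look like with Go's standard flag package. It's a sketch under assumptions: parseArgs is a made-up function, and the LineNumbers and Count field names are guesses (the other field names appear in the snippets above).

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the parsed flags. LineNumbers and Count are assumed names.
type options struct {
	CaseInsensitive bool
	LineNumbers     bool
	Count           bool
	Invert          bool
	Recursive       bool
	ListFilesOnly   bool
}

// parseArgs parses flags from args and returns the remaining positional
// arguments (the pattern followed by file names).
func parseArgs(args []string) (options, []string, error) {
	var opts options
	fset := flag.NewFlagSet("grep", flag.ContinueOnError)
	fset.BoolVar(&opts.CaseInsensitive, "i", false, "case-insensitive match")
	fset.BoolVar(&opts.LineNumbers, "n", false, "print line numbers")
	fset.BoolVar(&opts.Count, "c", false, "print only a count of matches")
	fset.BoolVar(&opts.Invert, "v", false, "select non-matching lines")
	fset.BoolVar(&opts.Recursive, "r", false, "search directories recursively")
	fset.BoolVar(&opts.ListFilesOnly, "l", false, "print only file names")
	if err := fset.Parse(args); err != nil {
		return options{}, nil, err
	}
	return opts, fset.Args(), nil
}

func main() {
	opts, rest, err := parseArgs(os.Args[1:])
	if err != nil {
		os.Exit(2) // the flag package has already printed the error
	}
	fmt.Printf("%+v %q\n", opts, rest)
}
```

Two limits to keep in mind: the flag package does not understand bundled short flags like -in, and it stops parsing at the first non-flag argument, so flags have to come before the pattern.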
What's Missing
Real grep has a lot more:
- Context lines (-A, -B, -C)
- Extended regex (-E)
- Fixed string search (-F)
- Parallel search
- Memory-mapped files for huge files
- Proper CLI argument parsing
Mine doesn't. It does basic pattern matching. That's the point - understand the core, skip the extras.
Try It
git clone https://github.com/codetesla51/go-coreutils.git
cd go-coreutils/grep
go build -o grep grep.go
./grep "pattern" file.txt
Basic usage:
./grep "error" logs.txt
./grep -i -n "warning" logs.txt
./grep -r "TODO" ./src
./grep -c "func" main.go
Why This Matters
Grep is everywhere. Understanding how it works changes how you use it. You stop thinking "magic search command" and start thinking "regex matcher with file handling."
Plus, now when someone asks "how does grep detect binary files?" you actually know.
Source: github.com/codetesla51/go-coreutils
More at devuthman.vercel.app