Learn how to use grep, awk, sed, and jq for efficient Linux text processing. This practical guide covers syntax, real-world examples, and best practices for sysadmins, developers, and data engineers.
Source: Dev Resource Hub
If you’ve spent any time working in Linux, you know text processing is non-negotiable. Whether you’re parsing gigabytes of server logs, extracting insights from CSV files, automating config edits, or wrangling JSON from APIs—you need tools that work fast and flexibly.
The good news? Linux comes with four built-in powerhouses: grep, awk, sed, and jq. Each has a unique superpower, but together they handle 90% of text-related tasks. In this guide, we’re skipping the dry theory and focusing on what you can actually use today. Let’s dive in.
Introduction to Linux Text Processing Tools
Text processing in Linux boils down to four core tasks: searching, extracting, editing, and parsing structured data. These tools are lightweight, pre-installed on most distributions, and designed for command-line efficiency. Here’s a quick breakdown of their roles:
- grep: The “search master” for finding patterns in text
- awk: The “data wizard” for extracting and calculating from structured text
- sed: The “stream editor” for batch text modifications
- jq: The “JSON hero” for filtering and transforming JSON data
1. grep: Find Text Like a Pro
grep (short for Global Regular Expression Print) is your first stop for locating lines that match a pattern. It’s lightning-fast, even on large files, and supports regular expressions for granular searches.
Key Features
- Works with basic and extended regular expressions
- Searches recursively through directories
- Offers case-insensitive, line numbering, and inverse match options
Practical Examples
Basic Search: Find all “ERROR” entries in a log file:
grep "ERROR" server.log
Case-Insensitive + Line Numbers: Catch “error” or “Error” with line numbers:
grep -i -n "error" server.log
Recursive Search: Find “TODO” comments in all Python files under the current directory (the --include filter limits the recursive search to .py files):
grep -r --include="*.py" "TODO" .
Invert Match: Show lines that don’t contain “DEBUG” (great for filtering noise):
grep -v "DEBUG" server.log
2. awk: Extract & Analyze Structured Data
awk isn’t just a tool—it’s a mini-programming language for text. It excels at processing line-by-line structured data (like CSVs or logs with fixed fields) by splitting lines into columns and applying logic.
Key Features
- Splits lines into customizable fields (default: whitespace)
- Supports conditionals, loops, and arithmetic
- Uses BEGIN/END blocks for setup/teardown tasks
Practical Examples
Extract CSV Fields: Print names and cities from users.csv (columns: name,age,city):
awk -F',' '{print $1", "$3}' users.csv
Output:
Alice, New York
Bob, London
Charlie, Paris
Conditional Filtering: List users older than 30:
awk -F',' '$2 > 30 {print $1}' users.csv
Calculate Totals: Sum all ages in the CSV:
awk -F',' '{sum += $2} END {print sum}' users.csv
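BEGIN/END in One Pass: The BEGIN/END blocks from the feature list are handy for headers and summaries. A minimal sketch, assuming the same users.csv with no header row, that prints a title line up front and the average age at the end:
awk -F',' 'BEGIN {print "Age report"} {sum += $2; count++} END {print "Average:", sum / count}' users.csv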
3. sed: Batch Edit Text Streams
sed (Stream Editor) is built for modifying text without opening files. It’s perfect for find-and-replace, deleting lines, or inserting content—especially in scripts.
Key Features
- Performs in-place edits or outputs to the terminal
- Uses regex for pattern matching
- Non-interactive (ideal for automation)
Practical Examples
Find-and-Replace: Replace “ERROR” with “WARNING” in server.log (preview first):
sed 's/ERROR/WARNING/g' server.log
In-Place Edit: Modify the file directly (use -i.bak instead of -i to keep a .bak backup):
sed -i 's/ERROR/WARNING/g' server.log
Delete Lines: Remove all lines containing “DEBUG”:
sed '/DEBUG/d' server.log
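Insert Content: sed can also add lines, as noted in the features above. A quick sketch using GNU sed’s i (insert) command to place a marker line before line 1 (the “# reviewed” text is just an example; POSIX sed needs the i\ form with the text on the next line):
sed '1i # reviewed' server.log
The a command works the same way but appends after the matched line instead of inserting before it.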
4. jq: Tame JSON Data
With APIs and JSON configs everywhere, jq is a must-have for parsing JSON in the command line. It turns messy JSON into readable output and lets you filter/transform data with simple syntax.
Key Features
- Queries nested JSON objects/arrays
- Supports filtering, mapping, and aggregation
- Formats output for readability
Practical Examples
Given data.json:
[
{"name": "Alice", "age": 25, "city": "New York"},
{"name": "Bob", "age": 30, "city": "London"},
{"name": "Charlie", "age": 35, "city": "Paris"}
]
Extract Names: Get all names from the JSON array:
jq '.[].name' data.json
Filter by Age: Find users older than 30:
jq '.[] | select(.age > 30) | .name' data.json
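Map and Aggregate: The mapping and aggregation features mentioned above take only a small step further. A sketch against the same data.json that collects all names into a single array, then computes the average age:
jq 'map(.name)' data.json
jq 'map(.age) | add / length' data.json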
Combining Tools: Real-World Pipelines
The real magic happens when you chain these tools with Linux pipes (|). Here are two common workflows:
Example 1: Analyze Web Server Logs
Extract IPs and URLs from 404 errors in access.log:
grep "404" access.log | awk '{print $1, $7}'
Example 2: Transform JSON Logs
Filter /api endpoints and replace status “200” with “OK” in api.log:
jq '.[] | select(.endpoint | startswith("/api"))' api.log | sed 's/"status": 200/"status":"OK"/g'
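Because sed is editing serialized JSON here, a jq-only version is less fragile. A sketch under the same assumptions about api.log’s structure (a JSON array with endpoint and status fields):
jq '.[] | select(.endpoint | startswith("/api")) | .status |= (if . == 200 then "OK" else . end)' api.log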
Pro Tips for Mastery
- Test Regex Incrementally: Complex patterns break easily; test parts first with grep -E (extended regex).
- Backup Before Editing: Always use sed -i.bak to create backups, or test commands without -i first.
- Learn Common Flags:
  - grep: -i (case-insensitive), -r (recursive)
  - awk: -F (field separator), END (final action)
  - sed: s/pattern/replace/g (global replace)
  - jq: .[] (iterate arrays), select() (filter)
- Use man Pages: man grep or man jq has deep docs for edge cases.
Final Thoughts
grep, awk, sed, and jq aren’t just “tools”—they’re time-savers that turn tedious text tasks into one-liners. The more you experiment with them (start small: parse a log, edit a CSV), the more they’ll become second nature.
What’s your go-to text processing workflow? Drop a comment below—we’d love to hear how you use these tools in your projects!