Why CLI Tools Still Matter
In a world of flashy GUIs and cloud consoles, CLI tools remain the secret weapon of productive data engineers.
Why?
- Speed — No mouse hunting, no loading screens
- Scriptability — Automate everything with bash, Python, or cron
- Remote-friendly — SSH into any server and you're productive immediately
- Composability — Pipe output between tools (tool1 | tool2 | tool3)
- Lower resource usage — No Electron apps eating 2GB of RAM
The best data engineers I know live in the terminal. Here are the 7 CLI tools that will 10x your productivity in 2025.
1. Frosty — AI Agent for Multi-Database Operations 🥇
GitHub: https://github.com/Gyrus-Dev/frosty
Best for: Natural language database operations across Snowflake, PostgreSQL, SQL Server
Price: Free (open-source)
What It Does
Frosty is an AI agent that translates plain English into database operations. Instead of writing SQL or DDL manually, you type:
"show me my top 10 customers by revenue last quarter"
"set up MFA for all users without it"
"why is my warehouse spend up 40% this month?"
"create a data pipeline for S3 CSV ingestion with daily loads"
Frosty handles schema discovery, query generation, safety validation, and execution.
Why It's #1
| Feature | Frosty | Traditional Tools |
|---|---|---|
| Natural language input | ✅ | ❌ |
| Multi-database support | ✅ (Snowflake + Postgres + SQL Server) | Varies |
| Auto-generate DDL | ✅ | ❌ |
| Safety gates (block DROP) | ✅ | ❌ |
| Self-hosted | ✅ | Sometimes |
| Price | Free | $0–$$$ |
Real-World Use Cases
Data Engineering:
> "set up a data pipeline for S3 CSV ingestion"
# Generates: external stage, file format, table, COPY INTO, task schedule
# Time: 5 minutes vs. 2-3 hours manually
Security:
> "enforce MFA for all users without it"
# Generates ALTER statements, shows preview, executes with approval
Cost Governance:
> "why is warehouse spend up 40% this month?"
# Queries metering views, returns itemized breakdown with recommendations
Setup
git clone https://github.com/Gyrus-Dev/frosty.git
cd frosty
pip install -r requirements.txt
# Configure .env with credentials + API key
python -m src.frosty_ai.objagents.main
Model Support
- OpenAI (GPT-4, GPT-4o)
- Anthropic (Claude 3.5, Claude 3)
- Google (Gemini 2.0, Gemini 1.5)
- Ollama (local, free)
Swap models in one .env line — no code changes.
Safety
- DROP unconditionally blocked in code
- CREATE OR REPLACE requires explicit approval
- Read-only modes for analyst roles
- Audit logging of all executed statements
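To make the gates above concrete, here is a minimal sketch of how checks like these could work. It is not Frosty's actual code: the names Verdict, check_statement, gate, and audit_log are hypothetical, and the prefix matching is deliberately naive (it would miss a DROP hidden behind a comment or a second statement).

```python
import re
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    BLOCK = "block"


# Hypothetical rules mirroring the bullets above; not taken from Frosty's source.
_BLOCKED = re.compile(r"^\s*DROP\b", re.IGNORECASE)
_NEEDS_APPROVAL = re.compile(r"^\s*CREATE\s+OR\s+REPLACE\b", re.IGNORECASE)
_READ_ONLY_OK = re.compile(r"^\s*(SELECT|SHOW|DESCRIBE)\b", re.IGNORECASE)

# Every checked statement is recorded here together with its verdict.
audit_log: list[tuple[str, str]] = []


def check_statement(sql: str, read_only: bool = False) -> Verdict:
    """Classify a single SQL statement by its leading keyword."""
    if _BLOCKED.match(sql):
        return Verdict.BLOCK  # DROP is never allowed
    if read_only and not _READ_ONLY_OK.match(sql):
        return Verdict.BLOCK  # analyst roles may only read
    if _NEEDS_APPROVAL.match(sql):
        return Verdict.NEEDS_APPROVAL
    return Verdict.ALLOW


def gate(sql: str, read_only: bool = False) -> Verdict:
    """Check a statement and append the outcome to the audit log."""
    verdict = check_statement(sql, read_only)
    audit_log.append((sql, verdict.value))
    return verdict
```

Calling gate("DROP TABLE users") returns Verdict.BLOCK, while gate("CREATE OR REPLACE TABLE t AS SELECT 1") returns Verdict.NEEDS_APPROVAL so that a caller can ask for confirmation before running it. A production gate would parse the SQL rather than match prefixes.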
Verdict
Must-have for any data engineer working with Snowflake, PostgreSQL, or SQL Server. The time savings on repetitive tasks alone justify installation. A unified multi-database interface is slated to launch in late April 2025.
Rating: ⭐⭐⭐⭐⭐ (5/5)
2. dbt (Data Build Tool) — SQL Transformations
Website: https://getdbt.com
Best for: Analytics engineering, SQL-based transformations in data warehouses
Price: Free (Core), $$$ (Cloud)
What It Does
dbt lets you write SQL transformations that are version-controlled, tested, and documented. You write SELECT statements; dbt handles DDL, dependencies, and orchestration.
Example Workflow
-- models/customers.sql
{{
config(
materialized='table',
tags=['customers', 'core']
)
}}
select
customer_id,
sum(revenue) as total_revenue,
count(order_id) as order_count
from {{ ref('orders') }}
group by 1
Run with:
dbt run --select customers
dbt test --select customers
dbt docs generate
Key Features
- Modular SQL — Use ref() to build dependency graphs
- Testing — Built-in assertions (unique, not_null, accepted_values)
- Documentation — Auto-generated data dictionary
- CI/CD — Integrate with GitHub Actions, GitLab CI
- Environments — Dev, staging, prod with separate schemas
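The dependency graph that ref() produces is easier to picture with a toy version. The sketch below is not how dbt is implemented: it scans a few hypothetical model strings for ref() calls with a regular expression and uses Python's standard graphlib to find a valid build order. The MODELS dictionary and the top_customers model are made up for illustration.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models (name -> SQL). Only customers and orders echo the example above.
MODELS = {
    "orders": "select * from raw.orders",
    "customers": (
        "select customer_id, sum(revenue) as total_revenue "
        "from {{ ref('orders') }} group by 1"
    ),
    "top_customers": (
        "select * from {{ ref('customers') }} "
        "order by total_revenue desc limit 10"
    ),
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def find_refs(sql: str) -> set[str]:
    """Return the names of all models referenced through ref()."""
    return set(REF_PATTERN.findall(sql))


def build_order(models: dict[str, str]) -> list[str]:
    """Order models so that each one is built after the models it depends on."""
    graph = {name: find_refs(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


print(build_order(MODELS))
```

Because each model depends only on the previous one, the only valid order is orders, then customers, then top_customers. This is the same reason dbt can build upstream models first without a hand-written run order.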
When to Use It
✅ Building analytics models in Snowflake/BigQuery/Redshift
✅ Team collaboration on SQL transformations
✅ Need for testing and documentation
❌ Real-time streaming (dbt runs in batches; use a streaming platform such as Kafka for that)
❌ Non-SQL transformations (use Python + Pandas instead)
Verdict
Industry standard for analytics engineering. If you're doing SQL transformations in a data warehouse, dbt should be in your stack.
Rating: ⭐⭐⭐⭐⭐ (5/5)
3. pgcli / mycli — Enhanced Database Clients
GitHub: https://github.com/dbcli/pgcli (PostgreSQL)
GitHub: https://github.com/dbcli/mycli (MySQL/MariaDB)
Best for: Interactive SQL with auto-completion and syntax highlighting
Price: Free (open-source)
What They Do
These are drop-in replacements for psql and mysql with superpowers:
- Auto-completion — Table names after FROM, column names after WHERE
- Syntax highlighting — Colorized SQL in the terminal
- Smart completion — Context-aware suggestions
- Multi-line mode — Write complex queries across lines
- Fuzzy search — Quickly find tables/columns
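Context-aware completion boils down to looking at what precedes the cursor. The following is a simplified sketch of that idea, not pgcli's implementation: the SCHEMA dictionary and the suggest function are hypothetical, and table aliases such as "c." are not handled.

```python
import re

# Hypothetical schema used only for this illustration.
SCHEMA = {
    "customers": ["customer_id", "customer_name", "created_at"],
    "orders": ["order_id", "customer_id", "amount"],
}

TABLE_KEYWORDS = {"from", "join", "into", "update"}
COLUMN_KEYWORDS = {"select", "where", "and", "or", "by", "on"}


def suggest(text_before_cursor: str, schema: dict = SCHEMA) -> list[str]:
    """Suggest table or column names based on the last keyword typed."""
    tokens = re.findall(r"\w+", text_before_cursor.lower())
    # A trailing space means a new word is starting; otherwise the last
    # token is the partial word to complete.
    at_word_start = text_before_cursor.endswith(" ") or not tokens
    prefix = "" if at_word_start else tokens[-1]
    context = tokens if at_word_start else tokens[:-1]
    last_keyword = context[-1] if context else ""

    if last_keyword in TABLE_KEYWORDS:
        candidates = sorted(schema)
    elif last_keyword in COLUMN_KEYWORDS:
        candidates = sorted({col for cols in schema.values() for col in cols})
    else:
        candidates = []
    return [name for name in candidates if name.startswith(prefix)]
```

With this sketch, suggest("SELECT * FROM cu") offers only customers, while suggest("SELECT * FROM customers WHERE c") offers the column names that start with "c".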
Example
# Instead of psql
psql -h localhost -U postgres -d mydb
# Use pgcli
pgcli -h localhost -U postgres -d mydb
Now type:
SELECT c.  -- Auto-completes: customer_id, customer_name, created_at, ...
FROM customers c
WHERE c.  -- Auto-completes column names again
Key Features
| Feature | pgcli/mycli | psql/mysql |
|---|---|---|
| Auto-completion | ✅ | ❌ (basic in psql) |
| Syntax highlighting | ✅ | ❌ |
| Smart completion | ✅ | ❌ |
| Multi-line mode | ✅ | Manual (\e in psql) |
| Fuzzy search | ✅ | ❌ |
When to Use It
✅ Daily SQL work in PostgreSQL or MySQL
✅ Want better UX than default CLI
✅ Working with large schemas (auto-completion saves time)
❌ Need GUI features (ER diagrams, visual query builder)
❌ Working with unsupported databases (only Postgres/MySQL)
Verdict
Instant productivity boost for PostgreSQL and MySQL users. Installation takes 30 seconds; you'll wonder how you lived without it.
Rating: ⭐⭐⭐⭐⭐ (5/5)
4. ingestr — Data Copying Between Databases
GitHub: https://github.com/bruin-data/ingestr
Best for: Copying data between databases without custom code
Price: Free (open-source)
What It Does
ingestr copies data from source to destination databases with support for incremental loads. No custom Python scripts needed.
Example
# Copy entire table (source tables are schema-qualified)
ingestr ingest \
  --source-uri 'postgresql://user:pass@localhost/mydb' \
  --source-table 'public.users' \
  --dest-uri 'snowflake://user:pass@account/dbname?warehouse=COMPUTE_WH' \
  --dest-table 'analytics.users'
# Incremental load (append only)
ingestr ingest \
  --source-uri 'postgresql://...' \
  --source-table 'public.orders' \
  --dest-uri 'snowflake://...' \
  --dest-table 'analytics.orders' \
  --incremental-strategy append \
  --incremental-key created_at
Supported Sources/Destinations
- PostgreSQL
- MySQL
- SQL Server
- BigQuery
- Snowflake
- Redshift
- CSV/Parquet files
Key Features
- Incremental loads — append, delete+insert, create+replace
- Schema auto-detection — No manual DDL writing
- Scheduling — Integrate with cron, Airflow, Prefect
- Logging — Row counts, timing, error handling
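If the append strategy with an incremental key sounds abstract, the idea can be sketched in a few lines. This is an illustration of the general high-water-mark technique, not ingestr's internals: it uses two in-memory SQLite databases, and the append_incremental function and the sample orders table are hypothetical.

```python
import sqlite3


def append_incremental(src, dest, table: str, key: str) -> int:
    """Copy rows whose key is greater than the destination's current maximum.

    Table and column names are interpolated directly, so they must come
    from trusted input in this sketch.
    """
    high_water = dest.execute(f"SELECT MAX({key}) FROM {table}").fetchone()[0]
    if high_water is None:
        # Destination is empty: copy everything.
        rows = src.execute(f"SELECT * FROM {table}").fetchall()
    else:
        rows = src.execute(
            f"SELECT * FROM {table} WHERE {key} > ?", (high_water,)
        ).fetchall()
    if rows:
        placeholders = ", ".join("?" * len(rows[0]))
        dest.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
        dest.commit()
    return len(rows)


src = sqlite3.connect(":memory:")
dest = sqlite3.connect(":memory:")
for conn in (src, dest):
    conn.execute("CREATE TABLE orders (order_id INTEGER, created_at TEXT)")

src.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2025-01-01"), (2, "2025-01-02")],
)
first_run = append_incremental(src, dest, "orders", "created_at")

src.execute("INSERT INTO orders VALUES (3, '2025-01-03')")
second_run = append_incremental(src, dest, "orders", "created_at")
```

The first run copies both existing rows and the second run copies only the row added afterwards. Note that a strict "greater than" comparison can skip late-arriving rows that share the maximum key value, which is one reason real tools offer strategies such as delete+insert.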
When to Use It
✅ Copying data between databases regularly
✅ Need incremental loads (CDC-like behavior)
✅ Want to avoid custom ETL scripts
❌ Complex transformations (use dbt or Spark)
❌ Real-time streaming (use Kafka + connectors)
Verdict
Huge time-saver for data replication tasks. Replaces hours of custom script writing with a single command.
Rating: ⭐⭐⭐⭐ (4/5)
5. fzf — Fuzzy Finder
GitHub: https://github.com/junegunn/fzf
Best for: Quickly searching files, commands, history
Price: Free (open-source)
What It Does
fzf is a general-purpose fuzzy finder that lets you search anything in the terminal interactively.
Examples
# Search command history
# Press Ctrl+R, start typing, fzf finds matching commands
# Find and open a file
vim $(fzf)
# Search SQL files in a project
cat $(find . -name "*.sql" | fzf)
# Kill a process interactively
ps aux | fzf | awk '{print $2}' | xargs kill -9
# Preview file contents while searching
fzf --preview 'head -100 {}'
Key Features
- Fuzzy matching — Type custmr and it finds customer_orders.sql
- Interactive — Filter results as you type
- Preview — See file contents before selecting
- Keybindings — Ctrl+R for history, Ctrl+T for files
- Integration — Works with vim, emacs, bash, zsh
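The reason custmr finds customer_orders.sql is subsequence matching: the typed characters only need to appear in order, not next to each other. Here is a minimal sketch of that idea. fzf itself also scores and ranks the matches, which this sketch leaves out, and the function names and file list are made up for illustration.

```python
def fuzzy_match(pattern: str, candidate: str) -> bool:
    """True if the characters of pattern appear in candidate in order."""
    remaining = iter(candidate.lower())
    # `ch in remaining` advances the iterator, so each pattern character
    # must be found after the previous one.
    return all(ch in remaining for ch in pattern.lower())


def fuzzy_filter(pattern: str, candidates: list[str]) -> list[str]:
    """Keep only the candidates that fuzzily match the pattern."""
    return [c for c in candidates if fuzzy_match(pattern, c)]


# Hypothetical file list.
files = ["customer_orders.sql", "customers.sql", "orders.sql", "README.md"]
matches = fuzzy_filter("custmr", files)
```

Here matches contains customer_orders.sql and customers.sql, because both contain c, u, s, t, m, r in that order, while the other two files have no "c" at all.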
When to Use It
✅ Large codebases with many files
✅ Frequent command history searches
✅ Want to navigate faster in the terminal
❌ Prefer GUI file explorers
❌ Working with small projects (less value)
Verdict
Universal productivity booster. Once you get used to fzf, you'll miss it in every other terminal session.
Rating: ⭐⭐⭐⭐⭐ (5/5)
6. jq — JSON Processor
GitHub: https://github.com/jqlang/jq
Best for: Parsing, filtering, transforming JSON in the terminal
Price: Free (open-source)
What It Does
jq is a lightweight command-line JSON processor. It's essential for working with APIs, logs, and JSON-heavy data pipelines.
Examples
# Pretty-print JSON
curl https://api.example.com/data | jq
# Extract specific fields
curl https://api.example.com/users | jq '.[] | {name, email}'
# Filter by condition
curl https://api.example.com/orders | jq '.[] | select(.status == "pending")'
# Aggregate data
curl https://api.example.com/orders | jq '[.[] | .amount] | add'
# Transform structure
curl https://api.example.com/data | jq '{total: .count, items: .results}'
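If jq's filter syntax is new to you, it can help to see what a few of the filters above do in plain Python. The payload below is a hypothetical stand-in for the /orders response, and the comparison is approximate: jq emits a stream of values where Python builds a list.

```python
import json

# Hypothetical payload standing in for the /orders response used above.
raw = """
[
  {"id": 1, "status": "pending", "amount": 40},
  {"id": 2, "status": "shipped", "amount": 60},
  {"id": 3, "status": "pending", "amount": 25}
]
"""
orders = json.loads(raw)

# jq '.[] | select(.status == "pending")'
pending = [o for o in orders if o.get("status") == "pending"]

# jq '[.[] | .amount] | add'
total = sum(o["amount"] for o in orders)

# jq '.[] | {id, status}'
summary = [{"id": o["id"], "status": o["status"]} for o in orders]

# jq '.' (pretty-print)
pretty = json.dumps(orders, indent=2)
```

The jq one-liners do the same work without a script file, which is why they fit so well in the middle of a shell pipeline.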
Key Features
- Expressive filter language — Filter, map, reduce JSON
- Chaining — Pipe multiple operations
- Error handling — Graceful failures on malformed JSON
- Colorized output — Readable by default
When to Use It
✅ Working with REST APIs
✅ Parsing JSON logs
✅ Transforming JSON for downstream systems
❌ XML data (use xmllint instead)
❌ Complex transformations (use Python + Pandas)
Verdict
Essential for any data engineer working with APIs or JSON data. The learning curve is steep but worth it.
Rating: ⭐⭐⭐⭐⭐ (5/5)
7. httpie — HTTP Client
Website: https://httpie.io
Best for: API testing and debugging
Price: Free (CLI), $$$ (Desktop app)
What It Does
httpie is a user-friendly HTTP client for the terminal. It's like curl but with readable syntax and colorized output.
Examples
# GET request
http GET https://api.example.com/users
# POST with JSON body
http POST https://api.example.com/users name="John" email="john@example.com"
# With authentication
http --auth user:pass GET https://api.example.com/protected
# Download a file
http --download GET https://example.com/file.zip
# Set custom headers
http GET https://api.example.com/data X-API-Key:abc123
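The short syntax in the examples above works because httpie classifies each argument by its separator: key=value becomes a JSON string field, Header:value becomes a header, key==value becomes a query parameter, and key:=value is parsed as raw JSON. The sketch below illustrates that mapping in simplified form. It is not httpie's code, it ignores escaping and file uploads, and the function names are hypothetical.

```python
import json

# Two-character separators are listed so they win over their one-character prefixes.
SEPARATORS = ("==", ":=", "=", ":")


def split_item(item: str) -> tuple[str, str, str]:
    """Split an item at its earliest separator, preferring the longer one."""
    found = [(item.find(s), -len(s), s) for s in SEPARATORS if s in item]
    if not found:
        raise ValueError(f"not a request item: {item!r}")
    pos, _, sep = min(found)
    return item[:pos], sep, item[pos + len(sep):]


def build_request(items: list[str]) -> tuple[dict, dict, dict]:
    """Sort request items into headers, query parameters, and a JSON body."""
    headers, params, body = {}, {}, {}
    for item in items:
        key, sep, value = split_item(item)
        if sep == ":":
            headers[key] = value
        elif sep == "==":
            params[key] = value
        elif sep == ":=":
            body[key] = json.loads(value)  # raw JSON: numbers, booleans, lists
        else:
            body[key] = value  # plain string field
    return headers, params, body


headers, params, body = build_request(
    ["name=John", "email=john@example.com", "X-API-Key:abc123", "age:=30"]
)
```

After this call, body holds name, email, and the integer age, and headers holds X-API-Key. That is roughly the request the equivalent http POST command would send.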
Key Features
| Feature | httpie | curl |
|---|---|---|
| Readable syntax | ✅ | ❌ (flags everywhere) |
| Colorized output | ✅ | ❌ (needs flags) |
| JSON auto-formatting | ✅ | ❌ (pipe to jq) |
| Interactive mode | ✅ | ❌ |
| Download files | ✅ | ✅ |
When to Use It
✅ Testing REST APIs
✅ Debugging webhook payloads
✅ Quick API experiments
❌ Production scripts (curl is more universal)
❌ Complex authentication flows (use Postman/Insomnia)
Verdict
Better than curl for interactive API work. The syntax is intuitive, and the output is immediately readable.
Rating: ⭐⭐⭐⭐ (4/5)
Honorable Mentions
These didn't make the top 7 but are still worth knowing:
| Tool | Purpose |
|---|---|
| bat | cat with syntax highlighting and Git integration |
| ripgrep (rg) | Faster grep for code search |
| tmux | Terminal multiplexer for session management |
| httpie | Already covered, but worth repeating |
| peepDB | Quick database table inspection without SQL |
| Sling | Another data copying tool (alternative to ingestr) |
| Meltano | ELT orchestration with CLI interface |
How to Get Started
This Weekend
Pick 2-3 tools and install them:
# Frosty (AI database agent)
git clone https://github.com/Gyrus-Dev/frosty.git
cd frosty && pip install -r requirements.txt
# pgcli (if you use PostgreSQL)
pip install pgcli
# fzf (fuzzy finder)
brew install fzf # macOS
sudo apt install fzf # Ubuntu
# jq (JSON processor)
brew install jq # macOS
sudo apt install jq # Ubuntu
# httpie (HTTP client)
brew install httpie # macOS
sudo apt install httpie # Ubuntu
Next Week
- Try each tool on a real task (not just installation)
- Replace one existing workflow with a CLI alternative
- Share what you learned with your team
Long-Term
- Build a personal toolkit of 10-15 go-to CLI tools
- Script repetitive tasks using these tools
- Contribute to open-source CLI projects you use
Final Thoughts
CLI tools aren't about rejecting modern GUIs — they're about having the right tool for the job.
Some tasks are better in a GUI (ER diagrams, visual query building). But for:
- Quick data exploration
- Repetitive admin tasks
- Automation and scripting
- Remote server work
CLI tools are unmatched.
Start with Frosty for database operations, add fzf for navigation, and jq for JSON work. You'll be shocked at how much faster you move.
Want to try Frosty?
- 🐙 GitHub — Star to get notified about multi-DB launch
- 💬 Discord — Join the community
- 📧 Contact — Questions or demo requests
About the author: This list was compiled by the team behind Frosty, an AI agent for multi-database operations. We live in the terminal and love sharing tools that make data engineering faster.