DEV Community

Garudust


10 Things You Can Do With Logs Using Garudust Agent 🦅

Most developers treat logs the same way: something breaks, you grep for errors, squint at timestamps, and piece together what happened. That's table stakes.

Garudust Agent is a self-hostable AI agent runtime written in Rust. Pair it with the log-analyst skill from garudust-hub, and your logs become something you can talk to — and that can act on your behalf.

Here are 10 things you can do that go well beyond grepping for ERROR.


Setup in 60 Seconds

# Install
wget -qO- https://github.com/garudust-org/garudust-agent/releases/latest/download/garudust-$(uname -m)-unknown-linux-musl.tar.gz | tar -xz
sudo mv garudust garudust-server /usr/local/bin/

# Configure (picks your LLM provider)
garudust setup

# Install the log skill
garudust skill install log-analyst

Local GPU? Skip the API entirely:

export VLLM_BASE_URL=http://localhost:8000/v1
export GARUDUST_MODEL=Qwen/Qwen3-8B-AWQ

1. đŸ•ĩī¸ Explain Why an Incident Happened

grep finds what. Garudust explains why.

garudust "read /var/log/app.log around 2025-05-14 03:22:00 ±10 minutes.
Reconstruct the chain of events that led to the service crash.
What was the root cause?"

The agent reads the surrounding context — not just the error line — and builds a timeline:

Timeline reconstruction:
  03:14:22 — Memory usage crossed 80% threshold (first warning)
  03:19:05 — DB connection pool exhausted (upstream pressure)
  03:21:58 — Request queue backed up, workers stalled
  03:22:11 — OOM killer terminated process (root cause)

Root cause: sustained traffic spike + connection pool too small for load
Recommendation: increase pool size from 10 → 25, add circuit breaker
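The "±10 minutes" part of that prompt is a plain time-window filter. How the agent does this internally isn't documented here, so the following is only a minimal sketch of the equivalent step, assuming each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp. The function name and sample lines are hypothetical.

```python
from datetime import datetime, timedelta

def lines_in_window(lines, center, minutes=10, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the lines whose leading timestamp is within +/- `minutes` of `center`."""
    half = timedelta(minutes=minutes)
    kept = []
    for line in lines:
        try:
            ts = datetime.strptime(line[:19], fmt)  # assumed timestamp prefix
        except ValueError:
            continue  # skip lines without a parseable timestamp (e.g. stack traces)
        if abs(ts - center) <= half:
            kept.append(line)
    return kept

# Hypothetical sample lines, loosely modelled on the timeline above.
sample = [
    "2025-05-14 03:05:00 INFO healthy",
    "2025-05-14 03:14:22 WARN memory usage above 80%",
    "2025-05-14 03:22:11 ERROR process killed (OOM)",
    "2025-05-14 03:40:00 INFO service started",
]
window = lines_in_window(sample, datetime(2025, 5, 14, 3, 22, 0))
```

Only the two lines inside the window survive. Everything after that (ordering events, naming a root cause) is where the LLM reasoning comes in.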

2. 📊 Generate a Weekly Incident Report

Stop writing incident summaries by hand.

garudust "read /var/log/app.log for the past 7 days.
Generate a weekly incident report with:
- Total error count by day
- Top 5 recurring error types
- Mean time between incidents
- Any worsening trends
Format as Markdown."

Pipe it to a file, commit it, send it to Slack — whatever fits your workflow:

garudust "weekly incident report from /var/log/app.log" > reports/week-$(date +%V).md
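If you want to sanity-check the numbers in a generated report, the counting parts are easy to reproduce by hand. This is a rough sketch, assuming lines shaped like `DATE TIME LEVEL message`. It uses the mean gap between ERROR lines as a stand-in for "mean time between incidents", which is a simplification. The `summarize` helper and sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime

def summarize(lines, top=5):
    """Count ERROR lines per day and per message, plus the mean gap in seconds."""
    per_day = Counter()
    per_type = Counter()
    stamps = []
    for line in lines:
        parts = line.split(" ", 3)  # date, time, level, message
        if len(parts) < 4 or parts[2] != "ERROR":
            continue
        per_day[parts[0]] += 1
        per_type[parts[3]] += 1
        stamps.append(datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S"))
    stamps.sort()
    gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else None
    return per_day, per_type.most_common(top), mean_gap

# Hypothetical sample data.
sample = [
    "2025-05-12 09:00:00 ERROR db timeout",
    "2025-05-12 11:00:00 ERROR db timeout",
    "2025-05-13 10:00:00 INFO ok",
    "2025-05-14 11:00:00 ERROR disk quota exceeded",
]
per_day, top_types, mean_gap = summarize(sample)
```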

3. 🔁 Detect Crash Loops Before They Become Outages

A service that restarts every 4 minutes is never down for long, so an uptime monitor can easily miss it. Garudust can spot it in the logs.

garudust "scan /var/log/syslog for the last 2 hours.
Detect any process that has started and stopped more than 3 times.
Flag it as a crash loop candidate with restart intervals."
Sample output:
⚠️  CRASH LOOP DETECTED: api-worker
    Restarts in last 2h: 7
    Average interval: 16 minutes
    Last restart: 10 minutes ago
    Pattern: always crashes ~2 min after start (initialization failure likely)
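The rule in that prompt ("more than 3 starts, report the intervals") can be written down directly. Here is a minimal sketch of that logic, assuming the start events have already been extracted from syslog as `(process_name, datetime)` pairs. The extraction step and the function name are hypothetical and not part of Garudust.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def crash_loop_candidates(start_events, min_restarts=3):
    """Map process name -> (start count, average minutes between starts)
    for every process started more than `min_restarts` times."""
    by_proc = defaultdict(list)
    for name, ts in start_events:
        by_proc[name].append(ts)
    result = {}
    for name, stamps in by_proc.items():
        if len(stamps) <= min_restarts:
            continue
        stamps.sort()
        gaps = [(b - a).total_seconds() / 60 for a, b in zip(stamps, stamps[1:])]
        result[name] = (len(stamps), sum(gaps) / len(gaps))
    return result

# Hypothetical events: api-worker starts every 16 minutes, nginx starts once.
t0 = datetime(2025, 5, 14, 10, 0)
events = [("api-worker", t0 + timedelta(minutes=16 * i)) for i in range(5)]
events.append(("nginx", t0))
loops = crash_loop_candidates(events)
```

A fixed rule like this gives you the count and interval. The "initialization failure likely" guess in the sample output is the part a rule can't produce.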

4. đŸŒĄī¸ Baseline "Normal" and Alert on Deviations

What if you don't know what an anomaly looks like — only that something feels off?

garudust "compare /var/log/nginx/access.log from last Monday vs this Monday,
same time window 09:00–12:00.
What's statistically different? Flag anything that deviates by more than 2x."

Garudust doesn't need pre-defined rules. It compares distributions and reasons about what changed:

Deviations vs. last Monday baseline:
  • 4xx error rate: 0.3% → 4.1%  (13.7x increase) ⚠️
  • Avg response time: 120ms → 380ms  (3.2x increase) ⚠️
  • /api/search endpoint: 2% → 31% of traffic  (unusual spike)

Hypothesis: new client or bot hitting /api/search aggressively
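The "more than 2x" rule is just a ratio check between two sets of metrics. Here is a small sketch, assuming the per-window metrics have already been aggregated into dictionaries. The metric names and the `requests` figures are made up for illustration; the other two values come from the sample output above.

```python
def flag_deviations(baseline, current, factor=2.0):
    """Return {metric: ratio} for metrics that grew or shrank by more than `factor`."""
    flagged = {}
    for metric, base in baseline.items():
        now = current.get(metric)
        if now is None or base == 0:
            continue  # nothing to compare against
        ratio = now / base
        if ratio > factor or ratio < 1 / factor:
            flagged[metric] = round(ratio, 1)
    return flagged

baseline = {"4xx_rate_pct": 0.3, "avg_response_ms": 120, "requests": 1000}
current = {"4xx_rate_pct": 4.1, "avg_response_ms": 380, "requests": 1100}
flags = flag_deviations(baseline, current)
```

The request count only moved by 1.1x, so it stays out of the result.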

5. 🔒 Audit Security Events

Spot brute force attempts, suspicious IPs, and unusual access patterns.

garudust "analyze /var/log/auth.log from the last 24 hours.
Find:
- Failed login attempts > 5 from the same IP
- Successful logins from IPs with prior failures
- Any logins at unusual hours (outside 08:00–20:00 local time)
- New user accounts created
Summarize as a security audit."
Sample output:
🔴 Brute Force Attempt:
   IP: 185.234.xx.xx — 847 failed SSH attempts in 3 hours
   âš ī¸  This IP succeeded once at 03:41 — INVESTIGATE IMMEDIATELY

🟡 Off-hours Logins:
   user: deploy  — login at 02:14 from 10.0.1.55 (internal, but unusual)
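The first two checks in that prompt are simple enough to cross-check yourself. Below is a rough sketch, assuming OpenSSH-style lines such as `Failed password for root from <ip> ...` and `Accepted publickey for deploy from <ip> ...`. The regex covers only those two shapes, the `audit` helper is hypothetical, and the sample lines use a documentation-range IP.

```python
import re
from collections import Counter

# Matches "Failed <method> for [invalid user ]<user> from <ip>" and the Accepted variant.
PATTERN = re.compile(r"(Failed|Accepted) \S+ for (?:invalid user )?(\S+) from (\S+)")

def audit(lines, threshold=5):
    """Return (IPs with more than `threshold` failures, successes that followed failures)."""
    failures = Counter()
    suspicious = []
    for line in lines:
        m = PATTERN.search(line)
        if not m:
            continue
        outcome, user, ip = m.groups()
        if outcome == "Failed":
            failures[ip] += 1
        elif failures[ip] > 0:  # success from an IP that failed earlier
            suspicious.append((ip, user, failures[ip]))
    noisy = {ip: n for ip, n in failures.items() if n > threshold}
    return noisy, suspicious

# Hypothetical sample lines.
sample = [
    f"May 14 03:4{i}:00 host sshd[100]: Failed password for root from 203.0.113.7 port 22 ssh2"
    for i in range(6)
]
sample.append("May 14 03:47:00 host sshd[100]: Accepted password for root from 203.0.113.7 port 22 ssh2")
sample.append("May 14 09:00:00 host sshd[101]: Accepted publickey for deploy from 10.0.1.55 port 22 ssh2")
noisy, suspicious = audit(sample)
```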

6. 📈 Build a Performance Degradation Timeline

Know exactly when things started slowing down — not just that they're slow now.

garudust "read /var/log/app.log for the past 30 days.
Plot response time trends by week.
Identify the exact date when p95 latency started increasing.
What changed around that time in the logs?"

This is invaluable for debugging regressions that crept in silently over weeks.
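For reference, "p95 by week" means the following. This sketch uses the nearest-rank percentile and assumes the latencies have already been pulled out of the log as `(date, latency_ms)` pairs. Both helper names and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date

def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]

def weekly_p95(samples):
    """Group (date, latency_ms) pairs by (ISO year, ISO week) and take the p95 of each."""
    by_week = defaultdict(list)
    for day, ms in samples:
        by_week[tuple(day.isocalendar()[:2])].append(ms)
    return {week: p95(vals) for week, vals in sorted(by_week.items())}

# Hypothetical data: latency jumps between ISO week 19 and week 20 of 2025.
samples = [(date(2025, 5, 6), ms) for ms in range(100, 120)]
samples += [(date(2025, 5, 13), ms) for ms in range(300, 320)]
trend = weekly_p95(samples)
```

Once the week where the number jumps is known, the interesting question is the last one in the prompt: what else shows up in the logs around that date.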


7. 🤖 Auto-Remediate Known Issues

Garudust has a terminal tool. Combine log analysis with action.

garudust "check /var/log/app.log — if you see more than 20 'disk quota exceeded'
errors in the last hour, run: find /tmp -mtime +1 -delete
then report how many files were removed and current disk usage."

Or for a service restart:

garudust "monitor /var/log/api.log — if the error rate exceeds 15% over 5 minutes,
run: sudo systemctl restart api-service
and report what you did and why."

Note: Garudust's GARUDUST_APPROVAL_MODE defaults to smart — it will ask before running destructive commands. Set to auto only in controlled environments.


8. 🔍 Trace a Single Request Across Services

Distributed systems scatter a single user request across multiple log files. Garudust can stitch them back together.

garudust "find all log entries related to request_id=a3f9c2b1 across these files:
/var/log/gateway.log
/var/log/auth-service.log
/var/log/api.log
/var/log/db-proxy.log
Build a complete trace timeline with latency at each hop."
Sample output:
Request trace: a3f9c2b1
  00ms  gateway.log      — request received, routed to auth
  12ms  auth-service.log — token validated
  13ms  api.log          — handler invoked
  891ms api.log          — ⚠️ waiting on DB (unusually long)
  903ms db-proxy.log     — query executed (full table scan detected)
  905ms gateway.log      — response returned  total: 905ms
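Stitching a trace comes down to filtering by request ID, sorting by time, and diffing neighbouring timestamps. Here is a minimal sketch, assuming entries from all files have already been parsed into `(source_file, epoch_ms, request_id, message)` tuples. The parsing itself and the `build_trace` name are hypothetical; the offsets mirror the sample trace above.

```python
def build_trace(entries, request_id):
    """Return (ms since first hop, ms since previous hop, source, message) per hop."""
    hops = sorted((e for e in entries if e[2] == request_id), key=lambda e: e[1])
    if not hops:
        return []
    start = prev = hops[0][1]
    trace = []
    for source, ts, _, message in hops:
        trace.append((ts - start, ts - prev, source, message))
        prev = ts
    return trace

base = 1_000_000  # arbitrary epoch offset in milliseconds
entries = [
    ("api.log", base + 891, "a3f9c2b1", "waiting on DB"),
    ("gateway.log", base, "a3f9c2b1", "request received"),
    ("auth-service.log", base + 12, "a3f9c2b1", "token validated"),
    ("api.log", base + 13, "a3f9c2b1", "handler invoked"),
    ("db-proxy.log", base + 903, "a3f9c2b1", "query executed"),
    ("gateway.log", base + 905, "a3f9c2b1", "response returned"),
    ("gateway.log", base + 50, "ffff0000", "unrelated request"),
]
trace = build_trace(entries, "a3f9c2b1")
slowest = max(trace, key=lambda hop: hop[1])
```

The slowest hop is the 878 ms gap before "waiting on DB", which matches the line flagged in the sample trace. This only works if the clocks on all services are reasonably in sync.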

9. 📝 Generate a CHANGELOG from Deploy Logs

If your deploy pipeline writes structured logs, you can generate changelogs automatically.

garudust "read /var/log/deploy.log for this month.
Extract all deployments: service name, version, timestamp, who deployed.
Format as a CHANGELOG grouped by week, in Keep a Changelog style."
Sample output:
## [Week 20] — 2025-05-12 to 2025-05-18

### Deployed
- **api-service** v2.4.1 — 2025-05-14 10:32 (alice)
- **worker** v1.9.0 — 2025-05-16 14:15 (bob)
- **gateway** v3.1.2 — 2025-05-17 09:00 (alice)
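The grouping step is mechanical once the deployments are extracted. Below is a rough sketch, assuming each deployment is a `(service, version, timestamp, user)` tuple with timestamps in `YYYY-MM-DD HH:MM` form. The `changelog` helper is hypothetical and its output format is a simplified take on the sample above, not Garudust's actual output.

```python
from collections import defaultdict
from datetime import datetime

def changelog(deploys):
    """Render deployments as Markdown, grouped by ISO week number."""
    by_week = defaultdict(list)
    for service, version, stamp, user in deploys:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        by_week[when.isocalendar()[1]].append((when, service, version, user))
    out = []
    for week in sorted(by_week):
        out.append(f"## [Week {week}]")
        out.append("")
        out.append("### Deployed")
        for when, service, version, user in sorted(by_week[week]):
            out.append(f"- **{service}** {version}, {when:%Y-%m-%d %H:%M} ({user})")
        out.append("")
    return "\n".join(out).rstrip() + "\n"

deploys = [
    ("worker", "v1.9.0", "2025-05-16 14:15", "bob"),
    ("api-service", "v2.4.1", "2025-05-14 10:32", "alice"),
    ("gateway", "v3.1.2", "2025-05-17 09:00", "alice"),
]
text = changelog(deploys)
```

Grouping by week number alone is fine for a single month, as in the prompt. A log that spans a year boundary would need the ISO year in the key as well.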

10. ⏰ Schedule All of the Above

Everything above can run automatically on a cron schedule — no extra infrastructure needed.

TELEGRAM_TOKEN=your_token \
GARUDUST_CRON_JOBS="
0 9 * * 1=Read /var/log/app.log last 7 days, generate weekly incident report, send to telegram;
*/15 * * * *=Check /var/log/app.log last 15 minutes for crash loops or error spikes, alert telegram if found;
0 8 * * *=Audit /var/log/auth.log last 24 hours for security anomalies, send daily security digest to telegram
" \
garudust-server --anthropic-key sk-ant-...

Three jobs. Zero extra services. No Prometheus, no Grafana, no ELK cluster.
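The `GARUDUST_CRON_JOBS` value above reads as `schedule=prompt` entries separated by semicolons. How garudust-server actually parses it isn't shown, so treat this as a hypothetical pre-flight check rather than the real parser. It assumes five-field cron schedules and that the first `=` separates schedule from prompt.

```python
def parse_cron_jobs(raw):
    """Split a 'schedule=prompt;schedule=prompt' string into (schedule, prompt) pairs."""
    jobs = []
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon or blank lines
        schedule, sep, prompt = entry.partition("=")
        fields = schedule.split()
        if not sep or len(fields) != 5:
            raise ValueError(f"malformed job entry: {entry!r}")
        jobs.append((" ".join(fields), prompt.strip()))
    return jobs

raw = """
0 9 * * 1=Read /var/log/app.log last 7 days, generate weekly incident report, send to telegram;
*/15 * * * *=Check /var/log/app.log last 15 minutes for crash loops or error spikes, alert telegram if found;
0 8 * * *=Audit /var/log/auth.log last 24 hours for security anomalies, send daily security digest to telegram
"""
jobs = parse_cron_jobs(raw)
```

Under these assumptions, a prompt containing a semicolon would be split into two entries, so keep prompts semicolon-free.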


The Mental Model Shift

Traditional log tooling asks you to define rules upfront:

"Alert me when error rate > 5%"

Garudust lets you ask questions in plain language — after something happens, or proactively via cron — and get reasoning, not just matching:

"What went wrong this week, and is it getting worse?"

The difference is the jump from pattern matching to understanding.


What's Next

  • Install the log-analyst skill and try #1 on your own logs right now
  • Write a custom SKILL.md for your specific log format in ~/.garudust/skills/
  • Contribute your own log tools to garudust-hub

Which of these 10 use cases would save you the most time? Drop a comment — I read every one. 🙌
