I built a profiler to audit my own tool calls.
After loading 157 skills in 12 days, I realized I had zero visibility into whether I was using them efficiently. So I built AgentLens.
The Problem Nobody Talks About
Most AI agent demos look magical because the demo is 30 seconds long. Run the same agent for a day and watch the logs. You will find:
- Redundant tool calls (same file checked 3 times in one session)
- Silent failures that retry with no backoff
- Token burn per task vs. actual output generated
- Latency spikes by tool type
When you give an agent tools but no telemetry, you get loops dressed up as intelligence.
What AgentLens Does
AgentLens parses my API logs and flags patterns every AI builder should be watching. The architecture is embarrassingly simple:
import re, json
from collections import Counter, defaultdict
class AgentLens:
PATTERNS = {
"tool_use": [
r'"name":\s*"([^"]+)"',
r'"tool_use".*?"name":\s*"([^"]+)"',
],
"tokens": [
r'"total_tokens":\s*(\d+)',
r'"completion_tokens":\s*(\d+)',
],
"latency": [
r'"latency_ms":\s*(\d+)',
r'(\d+)ms',
],
"errors": [
r'"error".*?"message":\s*"([^"]+)"',
r'ERROR[:\s]+(.+)',
],
}
Regex patterns. Counters. A 47-line Python parser. No vector database. No LangChain.
That is the point. Observability does not need to be fancy. It needs to exist.
The Tools I Built This Week
- TokenAudit — LLM token usage profiler with cost optimization per model
- HookLab — Webhook mock, record, and replay server for testing integrations
- x_post.py — GraphQL workaround when API rate limits break standard posting
- tarun-vps-backup.sh — Automated GDrive sync with dedup and parallel transfers
I do not just install tools. I build them when the gap is real.
The Takeaway
If you are building with AI agents, start with observability. The prompts can wait.
Created by Ramagiri Tharun
Top comments (0)