The Visibility Problem
Running an AI agent in production means dealing with a problem most developers hit quickly.
The agent makes 15–20 LLM calls per session — chained, conditional, sometimes parallel. Something goes wrong. The output is bad, the cost spiked, or the agent looped. And there's no answer to any of these questions:
- Which specific call failed?
- What did the model actually receive?
- What did it return?
- How much did this session cost?
- Where in the run did it break?
Why Existing Tools Don't Solve It
LangSmith is built around the LangChain ecosystem; tracing a custom agent means adopting its SDK and instrumenting calls by hand.
Helicone proxies individual LLM API calls. Useful for per-request cost tracking, but it has no concept of agent structure — no parent/child spans, no session grouping, no multi-step trace.
Langfuse is the closest alternative, but it requires substantial code instrumentation before the traces become useful.
Datadog is built for enterprise infrastructure teams, not for a developer running their first production agent.
The AgentLens Approach
AgentLens is an open-source observability platform built specifically for AI agent runs.
Option 1: Zero code changes (proxy)
# Before
OPENAI_BASE_URL=https://api.openai.com
# After — one change, full observability
OPENAI_BASE_URL=http://localhost:8090/v1/p/{projectId}/openai
Every LLM call flows through AgentLens. It forwards to OpenAI transparently and captures the full trace — tokens, cost, latency, model, full prompt and completion. Works with any language and any framework.
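The same redirect can be done programmatically instead of via a shell export. The helper below just builds the proxy route shown above; the host and project id are placeholders, not values from a real deployment:

```python
import os

# Self-hosted AgentLens proxy from the article; adjust host/port to your setup.
AGENTLENS_HOST = "http://localhost:8090"
PROJECT_ID = "prj_demo123"  # placeholder; use your real project id

def proxy_base_url(host: str, project_id: str) -> str:
    """Build the OpenAI proxy route shown in the article."""
    return f"{host}/v1/p/{project_id}/openai"

# Any OpenAI SDK client created after this point reads OPENAI_BASE_URL
# from the environment and routes through AgentLens:
os.environ["OPENAI_BASE_URL"] = proxy_base_url(AGENTLENS_HOST, PROJECT_ID)
print(os.environ["OPENAI_BASE_URL"])
```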
Option 2: TypeScript SDK
import '@farzanhossans/agentlens-openai'
// auto-patches the OpenAI SDK — every call is traced
Option 3: Python SDK
import agentlens.patchers.openai
# same — all calls auto-traced
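Both SDKs patch the client at import time. The post doesn't show the internals, but the general technique is a monkey-patch wrapper around the call method: record timing and metadata, then forward unchanged. A minimal generic sketch (`FakeClient` and the `log` list are illustrative stand-ins, not AgentLens APIs):

```python
import functools
import time

def traced(fn, log):
    """Wrap a callable so each invocation is recorded before forwarding."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        log.append({"fn": fn.__name__, "latency_s": time.perf_counter() - start})
        return result
    return wrapper

class FakeClient:
    """Stand-in for an LLM client; returns a canned completion."""
    def create(self, prompt):
        return f"echo:{prompt}"

log = []
client = FakeClient()
client.create = traced(client.create, log)  # patch in place, as an SDK would
print(client.create("hi"))  # behavior unchanged; the call is now in `log`
```

The caller's code never changes, which is why a single import is enough to trace every call.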
Self-Host in 3 Minutes
git clone https://github.com/farzanhossan/agentlens
cd agentlens/infra
cp .env.prod.example .env
docker compose -f docker-compose.prod.yml up -d
Dashboard at localhost:4021. API at localhost:4020.
The Stack
- NestJS + BullMQ — async span processor
- Cloudflare Workers — edge ingest endpoint
- Elasticsearch — trace storage, full-text search, error clustering
- PostgreSQL — metadata, users, projects, alerts
- React dashboard — real-time updates via WebSocket
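The post doesn't publish the span schema, but a record built from the fields it mentions (tokens, cost, latency, model, parent/child links, session grouping) might look roughly like this. All field names below are assumptions for illustration:

```python
from dataclasses import dataclass, field
from typing import Optional
import uuid

@dataclass
class Span:
    """Hypothetical trace span: one node in an agent run's call tree."""
    name: str
    session_id: str                 # groups all spans from one agent session
    parent_id: Optional[str] = None # None for the root span
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    model: Optional[str] = None
    prompt_tokens: int = 0
    completion_tokens: int = 0
    cost_usd: float = 0.0
    latency_ms: float = 0.0

# A root span for the whole run, with one LLM call nested under it:
root = Span(name="agent_run", session_id="sess_1")
call = Span(name="llm_call", session_id="sess_1", parent_id=root.span_id,
            model="gpt-4o", prompt_tokens=120, completion_tokens=30)
```

Parent/child links are what let the dashboard reconstruct a 15-call session as a tree rather than a flat list of requests.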
What's Next
Phase 2 is the AI intelligence layer — using Claude API to automatically analyze traces, explain why agent conversations fail, and surface prompt improvement suggestions. The shift from "see what happened" to "understand why."
Try It
Landing: https://agentlens.techmatbd.com
GitHub: https://github.com/farzanhossan/agentlens
MIT licensed.