<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Farzan Hossan Shaikat</title>
    <description>The latest articles on DEV Community by Farzan Hossan Shaikat (@farzan_hossanshaikat_d61).</description>
    <link>https://dev.to/farzan_hossanshaikat_d61</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3902913%2F2354fc65-1cad-41e3-80b0-1959dd6b02a6.jpg</url>
      <title>DEV Community: Farzan Hossan Shaikat</title>
      <link>https://dev.to/farzan_hossanshaikat_d61</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/farzan_hossanshaikat_d61"/>
    <language>en</language>
    <item>
      <title>AI Agents in Production Are Flying Blind — AgentLens Fixes That</title>
      <dc:creator>Farzan Hossan Shaikat</dc:creator>
      <pubDate>Wed, 29 Apr 2026 03:19:21 +0000</pubDate>
      <link>https://dev.to/farzan_hossanshaikat_d61/ai-agents-in-production-are-flying-blind-agentlens-fixes-that-3h90</link>
      <guid>https://dev.to/farzan_hossanshaikat_d61/ai-agents-in-production-are-flying-blind-agentlens-fixes-that-3h90</guid>
      <description>&lt;h2&gt;
  
  
  The Visibility Problem
&lt;/h2&gt;

&lt;p&gt;Running an AI agent in production means hitting a problem most developers discover quickly: you can't see what the agent is actually doing.&lt;/p&gt;

&lt;p&gt;The agent makes 15–20 LLM calls per session — chained, conditional, sometimes parallel. Something goes wrong. The output is bad, the cost spiked, or the agent looped. And there's no answer to any of these questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which specific call failed?&lt;/li&gt;
&lt;li&gt;What did the model actually receive?&lt;/li&gt;
&lt;li&gt;What did it return?&lt;/li&gt;
&lt;li&gt;How much did this session cost?&lt;/li&gt;
&lt;li&gt;Where in the run did it break?&lt;/li&gt;
&lt;/ul&gt;
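&lt;p&gt;Answering those questions means capturing a structured record for every call in the session. A minimal Python sketch of what such a per-call record might hold — field names here are illustrative, not AgentLens's actual schema:&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LLMCallRecord:
    """One traced LLM call inside an agent session (illustrative schema)."""
    session_id: str                 # groups all calls from one agent run
    span_id: str                    # identifies this call
    parent_span_id: Optional[str]   # the step that triggered it, if any
    model: str
    prompt: str                     # what the model actually received
    completion: str                 # what it returned
    cost_usd: float
    latency_ms: float
    error: Optional[str] = None     # set when this specific call failed

def session_cost(records: list[LLMCallRecord]) -> float:
    """Total session cost is just a fold over its call records."""
    return sum(r.cost_usd for r in records)
```

&lt;p&gt;With records like this, every question above becomes a lookup: the failing call is the record with &lt;code&gt;error&lt;/code&gt; set, and the session cost is a one-line aggregation.&lt;/p&gt;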

&lt;h2&gt;
  
  
  Why Existing Tools Don't Solve It
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;LangSmith&lt;/strong&gt; is built around LangChain. Tracing a custom agent means wiring in its SDK instrumentation by hand.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Helicone&lt;/strong&gt; proxies individual LLM API calls. Useful for per-request cost tracking, but it has no concept of agent structure — no parent/child spans, no session grouping, no multi-step trace.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Langfuse&lt;/strong&gt; is the closest alternative, but it requires substantial code instrumentation before the traces become useful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Datadog&lt;/strong&gt; is built for enterprise infrastructure teams, not a developer running their first production agent.&lt;/p&gt;

&lt;h2&gt;
  
  
  The AgentLens Approach
&lt;/h2&gt;

&lt;p&gt;AgentLens is an open-source observability platform built specifically for AI agent runs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 1: Zero code changes (proxy)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Before&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://api.openai.com

&lt;span class="c"&gt;# After — one change, full observability&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:8090/v1/p/&lt;span class="o"&gt;{&lt;/span&gt;projectId&lt;span class="o"&gt;}&lt;/span&gt;/openai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every LLM call flows through AgentLens. It forwards to OpenAI transparently and captures the full trace — tokens, cost, latency, model, full prompt and completion. Works with any language and any framework.&lt;/p&gt;
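&lt;p&gt;In a Python codebase the switch can be a single environment variable, since the official OpenAI SDKs read &lt;code&gt;OPENAI_BASE_URL&lt;/code&gt; when the client is constructed. A sketch, where &lt;code&gt;my-project-id&lt;/code&gt; is a placeholder for your actual AgentLens project ID:&lt;/p&gt;

```python
import os

# One environment variable flips the whole app onto the AgentLens proxy.
# "my-project-id" stands in for a real project ID from your dashboard.
project_id = "my-project-id"
os.environ["OPENAI_BASE_URL"] = f"http://localhost:8090/v1/p/{project_id}/openai"

# The official OpenAI SDKs pick up OPENAI_BASE_URL on client construction,
# so no application code needs to change.
```

&lt;p&gt;Setting this before the client is created is enough; every subsequent call is proxied and traced.&lt;/p&gt;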

&lt;h3&gt;
  
  
  Option 2: TypeScript SDK
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@farzanhossans/agentlens-openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="c1"&gt;// auto-patches the OpenAI SDK — every call is traced&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option 3: Python SDK
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;agentlens.patchers.openai&lt;/span&gt;
&lt;span class="c1"&gt;# same — all calls auto-traced
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
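&lt;p&gt;Patching at import time is a well-worn Python technique. A minimal stdlib sketch of the pattern — a stand-in function rather than AgentLens's actual patcher, which would wrap the &lt;code&gt;openai&lt;/code&gt; client methods instead:&lt;/p&gt;

```python
import functools
import time

def traced(fn):
    """Wrap a callable so each invocation records its name and latency."""
    calls = []
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        calls.append({
            "fn": fn.__name__,
            "latency_ms": (time.perf_counter() - start) * 1000,
        })
        return result
    wrapper.calls = calls  # exposed so a collector can drain the records
    return wrapper

# A patcher module does this to the SDK's call sites at import time;
# here we patch a stand-in function instead.
def fake_completion(prompt: str) -> str:
    return prompt.upper()

fake_completion = traced(fake_completion)
```

&lt;p&gt;Because the wrapping happens once at import, the rest of the application keeps calling the same names and gets tracing for free.&lt;/p&gt;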



&lt;h2&gt;
  
  
  Self-Host in 3 Minutes
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/farzanhossan/agentlens
&lt;span class="nb"&gt;cd &lt;/span&gt;agentlens/infra
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.prod.example .env
docker compose &lt;span class="nt"&gt;-f&lt;/span&gt; docker-compose.prod.yml up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Dashboard at &lt;code&gt;localhost:4021&lt;/code&gt;. API at &lt;code&gt;localhost:4020&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;NestJS + BullMQ&lt;/strong&gt; — async span processor&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloudflare Workers&lt;/strong&gt; — edge ingest endpoint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch&lt;/strong&gt; — trace storage, full-text search, error clustering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt; — metadata, users, projects, alerts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;React dashboard&lt;/strong&gt; — real-time updates via WebSocket&lt;/li&gt;
&lt;/ul&gt;
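&lt;p&gt;The pipeline this stack implies — edge ingest, queued span processing, Elasticsearch storage — ends in span documents that can be grouped for error clustering. A hypothetical sketch (field names are illustrative, not the actual Elasticsearch mapping):&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical span documents as they might land in Elasticsearch.
spans = [
    {"session": "s1", "model": "gpt-4o", "error": "rate_limit_exceeded"},
    {"session": "s2", "model": "gpt-4o", "error": "rate_limit_exceeded"},
    {"session": "s3", "model": "gpt-4o", "error": None},
]

# Error clustering at its simplest: bucket failing spans by error type,
# so one recurring failure shows up as one cluster, not N alerts.
clusters = defaultdict(list)
for span in spans:
    if span["error"]:
        clusters[span["error"]].append(span["session"])
```

&lt;p&gt;The real system layers full-text search on top, but the grouping idea is the same.&lt;/p&gt;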

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Phase 2 is the AI intelligence layer — using Claude API to automatically analyze traces, explain why agent conversations fail, and surface prompt improvement suggestions. The shift from "see what happened" to "understand why."&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;Landing: &lt;a href="https://agentlens.techmatbd.com" rel="noopener noreferrer"&gt;https://agentlens.techmatbd.com&lt;/a&gt;&lt;br&gt;
GitHub: &lt;a href="https://github.com/farzanhossan/agentlens" rel="noopener noreferrer"&gt;https://github.com/farzanhossan/agentlens&lt;/a&gt;&lt;br&gt;
MIT licensed.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>llm</category>
      <category>monitoring</category>
    </item>
  </channel>
</rss>
