
Hugo

I built Farol — AI agent observability in one decorator (open source)

The problem
I kept finding out my AI agents broke from user complaints — not from my own monitoring.
An agent would run 47 times overnight. Three failed. Costs spiked 3×. I found out the next morning from a user ticket.
Every existing tool was either enterprise-grade overkill (Datadog, New Relic) or required complex setup (Langfuse, Helicone). Nothing was built for a solo dev who just wants to know when something goes wrong.
So I built Farol.
One decorator. That's it.

from farol import trace

@trace(agent_name="my-agent", farol_key="frl_...")
def my_agent(task, run=None):
    run["topic"] = task
    # your agent code here

That's all you need to get started. Farol tracks everything automatically.

What you get
Cost anomaly detection — Farol learns what's normal for each agent and alerts you when a run costs 3× more than usual. Before your cloud bill does.
Full trace inspector — Every tool call, LLM call, duration, tokens, and error reconstructed end to end. Click any run and see exactly what happened at each step.
Agent Health Score — One number (0-100) combining success rate, cost efficiency, quality, and latency. Changes week over week so you know if things are getting better or worse.
Quality scoring — Rate outputs thumbs up or down. Farol tracks quality trends and alerts you when it degrades — catches prompt regressions before your users do.
Proactive trend alerts — Fires when your agent is trending slower or more expensive, even when nothing has technically broken yet.
Weekly digest email — Every Monday, a summary of your agents' health, cost, and quality. Green means sleep. Red means act.
Multi-agent trace linking — Pass parent_trace_id to link child agent runs to their parent. See total pipeline cost across the entire chain.
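
To make the Health Score concrete: here is a rough sketch of how a 0-100 blend of those four sub-metrics could work. The weights and formula below are my own illustration of the idea, not Farol's published scoring.

```python
# Hypothetical sketch of a 0-100 agent health score.
# The sub-metrics are from the feature description; the weights
# are illustrative assumptions, not Farol's actual formula.

def health_score(success_rate, cost_efficiency, quality, latency_score):
    """Each input is a 0.0-1.0 sub-score; returns an int 0-100."""
    weights = {
        "success": 0.40,   # did runs complete without errors?
        "cost": 0.20,      # cost vs. this agent's own baseline
        "quality": 0.25,   # thumbs-up ratio on rated outputs
        "latency": 0.15,   # speed vs. this agent's own baseline
    }
    score = (
        weights["success"] * success_rate
        + weights["cost"] * cost_efficiency
        + weights["quality"] * quality
        + weights["latency"] * latency_score
    )
    return round(score * 100)

print(health_score(0.95, 0.9, 0.8, 0.85))
```

Because every sub-score is normalized against the agent's own history, the number is comparable week over week even as workloads change.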

Works with any framework
LangChain, CrewAI, Vercel AI SDK, or raw Anthropic/OpenAI calls. If it's a Python or Node.js function, @trace wraps it.

# LangChain

@trace(agent_name="langchain-agent", farol_key="frl_...")
def run_chain(query, run=None):
    run["topic"] = query
    with run.span("chain_invoke", type="tool") as span:
        result = chain.invoke({"query": query})
    return result

Open source SDK
The SDK is MIT licensed and published on PyPI and npm:

pip install farol-sdk

Or for Node.js:

npm install @usefarol/sdk

The hosted dashboard (where alerts fire and data lives) is the paid part. There's a free tier — 1 agent, 50k events/month, no credit card required.

What I learned building it
The hardest part wasn't the monitoring itself — it was the statistical baseline detection. A naive "alert when cost exceeds X" creates too many false positives. Farol uses a rolling median + standard deviation baseline that requires 10+ successful runs before firing. This means no noise in the first few days, and reliable signals after.
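
A minimal version of that baseline logic — my reconstruction from the description above, not Farol's actual source — could look like:

```python
import statistics

MIN_RUNS = 10         # stay quiet until there's a real baseline
THRESHOLD_SIGMAS = 3  # how far above the median counts as anomalous

def is_cost_anomaly(history, new_cost):
    """Flag new_cost if it sits far above a rolling median baseline.

    history: list of per-run costs for this agent (successful runs only).
    Returns False until MIN_RUNS data points exist, so the first few
    days produce no noise by design.
    """
    if len(history) < MIN_RUNS:
        return False
    baseline = statistics.median(history)
    spread = statistics.stdev(history)
    return new_cost > baseline + THRESHOLD_SIGMAS * spread

runs = [0.02, 0.021, 0.019, 0.022, 0.02, 0.018, 0.023, 0.02, 0.021, 0.019]
print(is_cost_anomaly(runs, 0.06))  # roughly 3x the usual cost
```

The median (rather than the mean) keeps one earlier spike from dragging the baseline up and masking the next one.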
The other insight: proactive alerts matter more than reactive ones. Most monitoring tools tell you when something breaks. Farol also tells you when things are trending in the wrong direction — before they break. That's the Day 30 retention feature.
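
One way to detect "trending in the wrong direction" — again, a sketch of the idea, not Farol's implementation — is to fit a simple least-squares slope over recent run latencies and alert when it's meaningfully positive relative to the mean:

```python
def trend_slope(values):
    """Least-squares slope of values over run index (simple linear fit)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

def trending_worse(latencies, rel_threshold=0.05):
    """Alert when latency grows by more than 5% of its mean per run."""
    mean = sum(latencies) / len(latencies)
    return trend_slope(latencies) > rel_threshold * mean

# Latency creeping up run over run, though no single run has failed yet
print(trending_worse([1.0, 1.1, 1.2, 1.35, 1.5, 1.7]))
```

The same slope check works for per-run cost — any metric where a steady drift matters before any single run crosses a hard threshold.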

Try it

Live demo (no signup): usefarol.dev/demo
Docs: usefarol.dev/docs
GitHub: github.com/hugoapolinario/farol

Would love feedback — what's missing? What would you add?
