Alex Spinov
Langfuse Has a Free API: The Open-Source LLM Observability Platform That Tracks Every Token, Prompt, and Cost of Your AI App

Your AI app is in production. Users complain about bad responses, but you can't reproduce the issue. You don't know which prompts perform best, how much each conversation costs, or why latency spiked at 3 p.m. Langfuse is the missing observability layer for LLM applications.

What Langfuse Actually Does

Langfuse is an open-source LLM engineering platform. It traces every LLM call in your application, capturing inputs, outputs, latency, token usage, cost, and user feedback. Think Datadog, but built specifically for AI applications.

The platform provides:

- Tracing: follow a request through your entire LLM chain
- Prompt management: version and A/B test prompts
- Evaluations: automated quality scoring
- Analytics: cost per user, latency percentiles, token usage trends
- Datasets: build test sets from production data

Langfuse integrates with OpenAI, Anthropic, LangChain, LlamaIndex, and any custom LLM setup. Run it self-hosted (free, MIT-licensed open source) or use Langfuse Cloud (free tier: 50K observations/month).

Quick Start

```shell
pip install langfuse
```
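Langfuse reads its credentials from environment variables. A minimal setup for Langfuse Cloud looks like this (the key values are placeholders; copy yours from the project settings page, and point `LANGFUSE_HOST` at your own instance if you self-host):

```shell
# Keys from the Langfuse project settings page
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
# Change this to your own URL if you self-host
export LANGFUSE_HOST="https://cloud.langfuse.com"
```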

Drop-in OpenAI replacement (zero code changes):

```python
from langfuse.openai import openai

# Use exactly like the OpenAI SDK — Langfuse traces automatically
client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    metadata={"user_id": "user-123", "session_id": "sess-456"}
)
# Langfuse captures: prompt, response, tokens, cost, latency — automatically
```

Manual tracing for complex chains:

```python
from langfuse import Langfuse

langfuse = Langfuse()

trace = langfuse.trace(name="rag-pipeline", user_id="user-123")

# Trace the retrieval step
span = trace.span(name="vector-search")
docs = vector_db.search(query, top_k=5)
span.end(output={"doc_count": len(docs)})

# Trace the LLM call
generation = trace.generation(
    name="answer-generation",
    model="gpt-4",
    input=[{"role": "user", "content": query}],
    model_parameters={"temperature": 0.7}
)
response = openai_client.chat.completions.create(...)
generation.end(output=response.choices[0].message.content)

langfuse.flush()  # events are sent asynchronously; flush before short-lived scripts exit
```

3 Practical Use Cases

1. Cost Tracking Per User

```python
# Tag every LLM call with user info
trace = langfuse.trace(
    name="chat",
    user_id=user.id,
    metadata={"plan": user.plan, "feature": "code-review"}
)

# In the Langfuse dashboard you then get:
# - Cost per user per day
# - Token usage by feature
# - Which users consume the most
# - ROI: cost vs. the user's subscription revenue
```
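Langfuse computes these rollups in its dashboard, but per-user cost is ultimately just a grouped sum over usage records, which is handy for back-of-the-envelope checks. A minimal sketch (the price table is illustrative, not current OpenAI pricing):

```python
from collections import defaultdict

# Illustrative per-1K-token prices — real pricing varies by model and changes over time
PRICE_PER_1K = {"gpt-4": {"input": 0.03, "output": 0.06}}

def cost_of_call(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one LLM call from its token counts."""
    p = PRICE_PER_1K[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

def cost_per_user(calls: list[dict]) -> dict[str, float]:
    """Sum estimated cost per user over a list of usage records."""
    totals: dict[str, float] = defaultdict(float)
    for c in calls:
        totals[c["user_id"]] += cost_of_call(c["model"], c["input_tokens"], c["output_tokens"])
    return dict(totals)

calls = [
    {"user_id": "user-123", "model": "gpt-4", "input_tokens": 1000, "output_tokens": 500},
    {"user_id": "user-456", "model": "gpt-4", "input_tokens": 2000, "output_tokens": 1000},
]
print(cost_per_user(calls))  # {'user-123': 0.06, 'user-456': 0.12}
```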

2. Prompt A/B Testing

```python
# Fetch the prompt from Langfuse (versioned, A/B testable)
prompt = langfuse.get_prompt("summarizer", label="production")

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "system", "content": prompt.compile(max_length=200)}],
    langfuse_prompt=prompt  # Links this trace to the prompt version
)
```

Update prompts in Langfuse dashboard — no code deploys. Compare v1 vs v2 with real production data.
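Langfuse's labels handle versioning server-side. If you want a stable client-side split between two prompt variants, hashing the user ID keeps each user on the same variant across sessions. A sketch under that assumption (the label names `prod-a` and `prod-b` are made up for illustration):

```python
import hashlib

def assign_variant(user_id: str, variants: tuple[str, ...] = ("prod-a", "prod-b")) -> str:
    """Deterministically map a user to a prompt label so they always see the same variant."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

label = assign_variant("user-123")
# prompt = langfuse.get_prompt("summarizer", label=label)  # then fetch that variant
```

Because the assignment is a pure function of the user ID, no per-user state needs to be stored, and the Langfuse trace-to-prompt linking shown above tells you which variant produced which results.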

3. Automated Evaluation

```python
# Score responses automatically
trace.score(
    name="relevance",
    value=evaluate_relevance(query, response),  # Your custom eval
    comment="Automated relevance check"
)

# Or let users score
trace.score(
    name="user-feedback",
    value=1,  # Thumbs up
    comment="User clicked helpful"
)
```

Track quality metrics over time. Alert when scores drop.
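The `evaluate_relevance` call above is left for you to supply. A deliberately crude sketch is term overlap between query and response; production setups usually reach for an LLM-as-judge or embedding similarity instead, but this shows the shape of a scorer that returns a 0–1 value:

```python
def evaluate_relevance(query: str, response: str) -> float:
    """Fraction of query terms that appear in the response (0.0 to 1.0). Crude but cheap."""
    query_terms = set(query.lower().split())
    response_terms = set(response.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & response_terms) / len(query_terms)

# 2 of 3 query terms ("quantum", "computing") appear in the response -> roughly 0.67
print(evaluate_relevance("quantum computing basics", "Quantum computing uses qubits"))
```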

Why This Matters

Running LLMs in production without observability is like running a web app without logging. You can't debug issues, optimize costs, or improve quality. Langfuse gives you the telemetry you need with minimal integration effort. The drop-in OpenAI wrapper means you can start tracing in 2 minutes.


Need custom data extraction or web scraping solutions? I build production-grade scrapers and data pipelines. Check out my Apify actors or email me at spinov001@gmail.com for custom projects.

Follow me for more free API discoveries every week!
