Langfuse is an open-source LLM observability platform — trace, evaluate, and monitor your AI applications.
## What You Get for Free (Free Tier / Self-Hosted)
- Tracing — see every LLM call, prompt, and response
- Cost tracking — track token usage and costs per request
- Latency — monitor response times across models
- Evaluations — score outputs with custom metrics
- Prompt management — version and A/B test prompts
- User feedback — collect thumbs up/down on responses
- Datasets — create evaluation datasets from production data
- Integrations — OpenAI, Anthropic, LangChain, LlamaIndex
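To make the cost-tracking idea concrete, here is a minimal sketch of turning per-request token counts into dollars. The per-1K-token prices below are hypothetical placeholders, not values from Langfuse or OpenAI:

```python
# Hypothetical per-1K-token prices in USD (placeholders, not real rates).
PRICES = {"gpt-4": {"input": 0.03, "output": 0.06}}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request in USD."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1000

cost = request_cost("gpt-4", input_tokens=500, output_tokens=200)
```

Langfuse does this aggregation for you across every trace; the point here is only that cost per request reduces to token counts times a price table.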
## Quick Start
```bash
pip install langfuse
```
```python
from langfuse import Langfuse
from openai import OpenAI

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY from the environment
client = OpenAI()

# Create a trace to group the events of one request
trace = langfuse.trace(name="chat")

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)

# Log the LLM call, including token usage for cost tracking
trace.generation(
    name="chat-completion",
    model="gpt-4",
    input=[{"role": "user", "content": "Hello"}],
    output=response.choices[0].message.content,
    usage={
        "input": response.usage.prompt_tokens,
        "output": response.usage.completion_tokens,
    },
)

# Events are sent in batches; flush before the process exits
langfuse.flush()
```
## Why Developers Switch from Console Logging

`console.log` and `print` debugging don't scale for AI applications:
- Full trace — see prompt → response → score in one view
- Cost visibility — know exactly how much each user costs
- Prompt versioning — compare prompt v1 vs v2 performance
- Evaluation — automated quality scoring, not manual review
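What "compare prompt v1 vs v2 performance" means in practice can be sketched in a few lines. The scores below are invented literals standing in for evaluation scores you would pull from Langfuse traces:

```python
from statistics import mean

# Invented evaluation scores (0-1) per prompt version; in practice
# these would come from Langfuse evaluation runs.
scores = {
    "v1": [0.62, 0.71, 0.55, 0.68],
    "v2": [0.81, 0.77, 0.85, 0.79],
}

averages = {version: mean(vals) for version, vals in scores.items()}
best = max(averages, key=averages.get)
```

With automated scoring attached to every trace, this comparison becomes a dashboard query instead of a manual spreadsheet exercise.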
An AI chatbot had intermittent hallucinations and no way to debug them. After adding Langfuse, the team identified that 90% of bad outputs came from a single prompt template, fixed it in 10 minutes, and the hallucination rate dropped from 12% to 2%.
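The key debugging step in that story, attributing bad outputs to the prompt template that produced them, is just a group-by over traces. Here is a sketch with invented trace records standing in for exported Langfuse data:

```python
from collections import Counter

# Invented trace records: each has the prompt template that produced
# the output and a quality flag from evaluation or user feedback.
traces = [
    {"template": "summarize-v1", "bad": True},
    {"template": "summarize-v1", "bad": True},
    {"template": "qa-v3", "bad": False},
    {"template": "summarize-v1", "bad": True},
    {"template": "qa-v3", "bad": True},
    {"template": "greeting-v2", "bad": False},
]

bad_by_template = Counter(t["template"] for t in traces if t["bad"])
worst, count = bad_by_template.most_common(1)[0]
share = count / sum(bad_by_template.values())  # fraction of bad outputs from the worst template
```

Once every generation is tagged with its template, the culprit falls out of a single aggregation instead of hours of log spelunking.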
## Need Custom Data Solutions?
I build production-grade scrapers and data pipelines for startups, agencies, and research teams.
Browse 88+ ready-made scrapers on Apify: Reddit, HN, LinkedIn, Google, Amazon, and more.
Custom project? Email me: spinov001@gmail.com — fast turnaround, fair pricing.