Over the last few weeks, I’ve been working with AI features in production, and I kept running into the same problem:
Vendor dashboards (OpenAI, Anthropic, etc.) are great at showing model-level usage, but once AI is embedded across multiple product features, it becomes hard to answer basic questions like:
- Which feature is actually driving AI cost?
- Where is latency impacting users?
- Which AI feature is failing in production?
API keys and model usage don’t map cleanly to how a product is structured.
What I built
I built a small MVP called Orbit to explore this problem.
It’s a lightweight SDK-based tool that captures real runtime data and shows:
- AI cost per product feature
- Latency per feature
- Error rates per feature
The focus is on feature-level observability, not just infra or model analytics.
How it works (high level)
- A simple SDK wraps AI calls in your code
- Each call is tagged with a feature name
- Runtime data (tokens, latency, errors) is sent to Orbit
- The dashboard shows how AI behaves inside the product
- No proxies, no request interception; just instrumentation (a rough sketch of what that could look like is below)
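For context, here's a minimal sketch in TypeScript of what wrapping an AI call this way could look like. This is my own illustration, not Orbit's actual API: `trackFeature`, the `AiCallEvent` shape, the `sendEvent` helper, the ingest URL, and the feature name are all assumed for the example.

```typescript
// Hypothetical sketch of feature-level instrumentation around an AI call.
// trackFeature, AiCallEvent, sendEvent, and the ingest URL are illustrative only.
import OpenAI from "openai";

const openai = new OpenAI();

interface AiCallEvent {
  feature: string;          // product feature name, e.g. "support-reply-draft"
  model: string;
  promptTokens: number;
  completionTokens: number;
  latencyMs: number;
  error?: string;
}

// Wrap any async AI call, time it, and report one event per call.
async function trackFeature<T>(
  feature: string,
  model: string,
  call: () => Promise<T>,
  getUsage: (result: T) => { promptTokens: number; completionTokens: number }
): Promise<T> {
  const start = Date.now();
  try {
    const result = await call();
    const usage = getUsage(result);
    await sendEvent({ feature, model, ...usage, latencyMs: Date.now() - start });
    return result;
  } catch (err) {
    await sendEvent({
      feature, model, promptTokens: 0, completionTokens: 0,
      latencyMs: Date.now() - start, error: String(err),
    });
    throw err;
  }
}

// Fire-and-forget POST to a (hypothetical) ingest endpoint.
async function sendEvent(event: AiCallEvent): Promise<void> {
  await fetch("https://example.com/ingest", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(event),
  }).catch(() => { /* never let telemetry break the product */ });
}

// Usage: tag the call with the product feature it belongs to.
const reply = await trackFeature(
  "support-reply-draft",
  "gpt-4o-mini",
  () =>
    openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: "Summarize this ticket..." }],
    }),
  (res) => ({
    promptTokens: res.usage?.prompt_tokens ?? 0,
    completionTokens: res.usage?.completion_tokens ?? 0,
  })
);
```

The point of doing it as a wrapper in application code (rather than a proxy) is that each call naturally carries the product feature it belongs to, which is exactly the dimension vendor dashboards can't see.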
Who this might be useful for
- Engineers shipping AI-powered features
- Founders running LLMs in production
- Teams trying to understand where AI cost or reliability issues actually come from
*This is very early-stage and currently free. I'm mainly looking for honest feedback, not signups or validation.*
What I’d love feedback on
- Is feature-level AI visibility something you’ve needed?
- What metrics would actually matter to you?
- Does this solve a real problem, or is it overkill?
- What would make you come back to a tool like this?
If you’re curious, here’s the link:
👉 https://withorbit.vercel.app
Happy to answer questions here, and equally happy if the feedback is “this isn’t useful.”
Thanks for reading 🙏