Genuinely curious: how are people handling per-customer LLM cost attribution once you get past a handful of customers?
I've been digging into something that keeps coming up in conversations with founders building agent-based products — specifically around understanding what each customer actually costs you to serve when your backend is a LangGraph or CrewAI workflow.
At small scale it's manageable. You can eyeball it, maybe throw some numbers in a spreadsheet. But somewhere around 20-30 customers things seem to get messy fast — especially when a single "user action" triggers a multi-step agent that's spawning sub-agents, hitting multiple models, using different context window sizes per step.
The question I keep sitting with: how do you actually know what each customer costs you at that level of granularity?
I've talked to a few people who built custom logging middleware, others who are stitching together LangSmith traces with manual cost calculations after the fact, and a few who honestly admitted they just average it out and accept some margin uncertainty. None of those feel great as a long-term approach, especially if you're trying to offer any kind of usage-based pricing to your own customers.
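To make the custom-middleware option concrete, here is a minimal sketch of what such a ledger might look like. Everything in it is hypothetical: the `UsageEvent` and `CostLedger` names, the model names, and especially the prices, which are placeholders rather than any provider's real rates. It assumes each LLM call is logged with the customer, run, and agent step it belongs to, plus the token counts taken from the API response.

```python
from collections import defaultdict
from dataclasses import dataclass

# Placeholder prices in USD per 1M tokens (assumption, not real provider rates).
PRICES = {
    "model-small": {"input": 1.00, "output": 4.00},
    "model-large": {"input": 10.00, "output": 40.00},
}


@dataclass(frozen=True)
class UsageEvent:
    customer_id: str
    run_id: str
    step: str           # agent node or sub-agent name
    model: str
    input_tokens: int   # taken from the provider's API response
    output_tokens: int


def event_cost(event: UsageEvent) -> float:
    """Cost of one LLM call in USD, using the placeholder price table."""
    price = PRICES[event.model]
    return (
        event.input_tokens * price["input"]
        + event.output_tokens * price["output"]
    ) / 1_000_000


class CostLedger:
    """Collects usage events and rolls cost up along any set of fields."""

    def __init__(self) -> None:
        self.events: list[UsageEvent] = []

    def record(self, event: UsageEvent) -> None:
        self.events.append(event)

    def cost_by(self, *fields: str) -> dict[tuple, float]:
        totals: dict[tuple, float] = defaultdict(float)
        for event in self.events:
            key = tuple(getattr(event, f) for f in fields)
            totals[key] += event_cost(event)
        return dict(totals)


# One user action fans out into several steps on different models.
ledger = CostLedger()
ledger.record(UsageEvent("acme", "run-1", "planner", "model-large", 2000, 500))
ledger.record(UsageEvent("acme", "run-1", "researcher", "model-small", 10000, 1000))
ledger.record(UsageEvent("globex", "run-2", "planner", "model-large", 1000, 200))

per_customer = ledger.cost_by("customer_id")
per_step = ledger.cost_by("customer_id", "run_id", "step")
```

The same events answer both the per-customer question and the per-step one; the only hard part in practice is making sure every call, including those made by sub-agents, carries the customer and run identifiers.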
Specific things I'm trying to understand:
- Is per-customer cost attribution actually something you've run into as a real operational problem, or is it more of a "nice to have" that rarely bites you?
- At what point in growth did it become painful, if it did?
- What have you actually done about it — built something, bought something, ignored it?
- If you're running multi-step agents specifically, do you care about cost visibility at the agent-step level, or is total cost per customer request enough?
Not trying to push anything here — genuinely trying to understand whether this is a sharp pain or a dull inconvenience for people doing this work at real scale.
Would appreciate hearing from anyone who's actually wrestled with this. Even "we just accepted we don't know" is a useful data point.
Top comments (1)
hit this around customer 20. been there.
built something for it called AgentCOGS (agentcogs.dev). two lines around your langgraph agent, you get cost per customer per run. model breakdown, node breakdown, the works.
one thing i learned the hard way: don't use local tokenizers for cost tracking. anthropic's published one is off by 15-20% on some claude models. use provider-reported counts from the api response, always.
it also does a pre-run budget check, so if a customer has hit their cap it raises before any llm call happens, not after you've already burned the tokens.
happy to jump on a call and install it together if useful.
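for anyone who wants the general shape of those two ideas without a library, here's a minimal sketch. this is not the AgentCOGS api, just an illustration: `CustomerBudget`, `run_step`, and the `call_llm` callable are hypothetical names, the prices are passed in by the caller, and the response is assumed to be a dict with a `usage` entry holding `input_tokens` and `output_tokens` as reported by the provider.

```python
class BudgetExceeded(Exception):
    """Raised when a customer has already reached their spending cap."""


class CustomerBudget:
    def __init__(self, caps_usd: dict[str, float]) -> None:
        self.caps_usd = dict(caps_usd)
        self.spent_usd = {customer: 0.0 for customer in caps_usd}

    def check(self, customer_id: str) -> None:
        # Called before the LLM request, so no tokens are spent on a capped customer.
        if self.spent_usd[customer_id] >= self.caps_usd[customer_id]:
            raise BudgetExceeded(f"{customer_id} has reached their cap")

    def add(self, customer_id: str, cost_usd: float) -> None:
        self.spent_usd[customer_id] += cost_usd


def run_step(budget, customer_id, call_llm, input_price, output_price):
    """Run one LLM call for a customer; prices are USD per 1M tokens."""
    budget.check(customer_id)

    response = call_llm()  # hypothetical callable wrapping the real client

    # Use the counts the provider reports, not a local tokenizer estimate.
    usage = response["usage"]
    cost = (
        usage["input_tokens"] * input_price
        + usage["output_tokens"] * output_price
    ) / 1_000_000
    budget.add(customer_id, cost)
    return response
```

note the check only looks at what has already been spent, so a single run can still overshoot the cap; stopping that would need an estimate of the upcoming call's cost, which this sketch doesn't attempt.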