DEV Community

Saud

Per-customer LLM cost attribution for multi-step agents (LangGraph, CrewAI)

Genuinely curious: how are people handling per-customer LLM cost attribution once you get past a handful of customers?

I've been digging into something that keeps coming up in conversations with founders building agent-based products — specifically around understanding what each customer actually costs you to serve when your backend is a LangGraph or CrewAI workflow.

At small scale it's manageable. You can eyeball it, maybe throw some numbers in a spreadsheet. But somewhere around 20-30 customers things seem to get messy fast — especially when a single "user action" triggers a multi-step agent that's spawning sub-agents, hitting multiple models, and using different context window sizes at each step.

The question I keep sitting with: how do you actually know what each customer costs you at that level of granularity?

I've talked to a few people who built custom logging middleware, others who are stitching together LangSmith traces with manual cost calculations after the fact, and a few who honestly admitted they just average it out and accept some margin uncertainty. None of those feel great as a long-term approach, especially if you're trying to offer any kind of usage-based pricing to your own customers.
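For context on what I mean by "custom logging middleware," here's roughly the shape of the thing people described to me — a minimal, framework-agnostic sketch. The model names, prices, and `record_usage` helper are all hypothetical; you'd hook this into wherever your framework exposes token counts per call (a LangGraph node wrapper, a callback handler, etc.), and real prices vary by provider and change often:

```python
from collections import defaultdict

# Hypothetical per-1M-token prices in USD. In reality you'd keep this
# in config and update it when providers change pricing.
PRICE_PER_M = {
    "model-a": {"input": 3.00, "output": 15.00},
    "model-b": {"input": 0.25, "output": 1.25},
}

# customer_id -> accumulated cost in USD
ledger = defaultdict(float)

def record_usage(customer_id, model, input_tokens, output_tokens):
    """Call after each LLM call, from whatever hook your framework gives you."""
    p = PRICE_PER_M[model]
    cost = (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
    ledger[customer_id] += cost
    return cost

# One "user action" fanning out into multiple model calls:
record_usage("cust_42", "model-a", 12_000, 800)   # planner step
record_usage("cust_42", "model-b", 4_000, 1_200)  # sub-agent step
print(round(ledger["cust_42"], 6))
```

The hard part isn't this arithmetic — it's reliably threading the `customer_id` through every sub-agent spawn so no call goes unattributed, which is exactly where the hand-rolled versions I've heard about tend to leak.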

Specific things I'm trying to understand:

  • Is per-customer cost attribution actually something you've run into as a real operational problem, or is it more of a "nice to have" that rarely bites you?
  • At what point in growth did it become painful, if it did?
  • What have you actually done about it — built something, bought something, ignored it?
  • If you're running multi-step agents specifically, do you care about cost visibility at the agent-step level, or is total cost per customer request enough?
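On that last question, my working hypothesis is that step-level visibility is mostly a keying decision: if each usage record carries a step label, you can roll costs up either way after the fact. A sketch of what I mean, with hypothetical step names standing in for LangGraph node names or CrewAI task names:

```python
from collections import defaultdict

# (customer_id, step_name) -> accumulated cost in USD.
# Step names here are hypothetical labels your workflow would attach.
costs = defaultdict(float)

def record(customer_id, step_name, cost_usd):
    costs[(customer_id, step_name)] += cost_usd

record("cust_42", "planner", 0.048)
record("cust_42", "researcher", 0.0025)
record("cust_42", "researcher", 0.0031)

# Step-level view for one customer:
by_step = {step: c for (cust, step), c in costs.items() if cust == "cust_42"}
# Request/customer total (what coarse-grained tracking would give you):
total = sum(by_step.values())
print(by_step, round(total, 6))
```

If that framing is right, the real question becomes whether the extra label is worth carrying through every sub-agent boundary — curious whether people who built this found the step dimension actually useful or just noise.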

Not trying to push anything here — genuinely trying to understand whether this is a sharp pain or a dull inconvenience for people doing this work at real scale.

Would appreciate hearing from anyone who's actually wrestled with this. Even "we just accepted we don't know" is a useful data point.
