Sindhu Murthy

Posted on Feb 18

Billing & Account Issues: A Support Engineer's Runbook

#ai #api #billing #support

Who this is for: This runbook is a practical reference for support engineers and anyone preparing for a support engineering role with AI API providers. It covers the six most common billing incident types: how to diagnose them, how to fix them, and what to tell customers. The patterns apply across providers, including OpenAI, Anthropic, Google, Cohere, and others.

⚡ Quick Reference

Match the customer's symptom to the incident type, then jump to that section.

Customer Says	Jump To
🚫 "My API calls suddenly stopped working"	Incident 1 — Payment Failure / Credit Exhaustion
😱 "My bill is way higher than I expected"	Incident 2 — Unexpected High Bill
📍 "I hit my limit and the API stopped"	Incident 3 — Spending Limit Reached
⏳ "My free credits ran out"	Incident 4 — Free Tier / Trial Expiry
💸 "I want a refund for accidental charges"	Incident 5 — Refund Request
🔒 "My account has been suspended / locked"	Incident 6 — Account Suspension

🔵 Before anything else: Always check the provider's status page first (e.g. status.openai.com, status.anthropic.com). If there is an active incident, that is your answer — inform the customer and monitor. Do not proceed further until you have ruled out a provider-side outage.

Incident 1 — Payment Failure / Credit Exhaustion
Incident 2 — Unexpected High Bill
Incident 3 — Spending Limit Reached Without Warning
Incident 4 — Free Tier / Trial Credit Expiry
Incident 5 — Refund Request for Accidental Usage
Incident 6 — Account Suspension
Master Decision Tree
Support Engineer Troubleshooting Checklist

Incident 1 — Payment Failure / Credit Exhaustion

Error: 402 Payment Required — API access stops immediately

What the customer says

"My API calls were working fine and then suddenly stopped."
"I'm getting 402 errors on every request."
"Nothing in my code changed."

What actually happened

Most AI API providers cut off access immediately, with no grace period, when a payment fails or a prepaid credit balance hits $0. A typical SaaS subscription might allow a few days to fix a payment issue; an API is usually cut off the moment the billing system flags a failure.

Root Cause	How to Confirm It
Card expired or declined	Provider dashboard → Billing → red banner or failed payment status
Prepaid credit balance at $0	Dashboard → Billing → credit balance shows $0.00
Auto-recharge enabled but card declined	Dashboard → Billing → payment history → failed recharge entry
Invoice overdue (enterprise accounts)	Dashboard → Billing → Invoices → unpaid invoice

Diagnosis Flow

402 Payment Required
  │
  ├── Provider Dashboard → Billing section
  │     │
  │     ├── Red banner or "Payment failed" message?
  │     │     └── Card expired / declined
  │     │           FIX: Ask customer to update card.
  │     │               Dashboard → Billing → Payment methods → Update
  │     │               API resumes within ~5 minutes of payment clearing.
  │     │
  │     ├── Credit balance shows $0?
  │     │     ├── Auto-recharge OFF
  │     │     │     FIX: Add credits manually + enable auto-recharge.
  │     │     │
  │     │     └── Auto-recharge ON but balance still $0
  │     │           → The recharge itself failed (card issue).
  │     │             FIX: Same as expired/declined card above.
  │     │
  │     └── Invoice overdue? (enterprise customers)
  │           FIX: Route to finance team for payment processing.
  │
  └── Dashboard looks fine — balance > $0, card valid?
        → Rare sync delay. Wait 10 minutes.
          Still failing? Escalate with account ID + timestamps.
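
The flow above can be sketched as a small function. This is only an illustration: `BillingSnapshot` and its fields are hypothetical names for what you would read off the billing page by hand, not any provider's API.

```python
from dataclasses import dataclass


@dataclass
class BillingSnapshot:
    """Hypothetical summary of what the billing page shows for one account."""
    payment_failed: bool     # red banner or "Payment failed" status
    credit_balance: float    # prepaid balance in dollars
    auto_recharge_on: bool
    invoice_overdue: bool    # enterprise / invoice-billed accounts only


def diagnose_402(b: BillingSnapshot) -> str:
    """Walk the 402 diagnosis flow top to bottom and return the next step."""
    if b.payment_failed:
        return "Card expired or declined: ask customer to update card"
    if b.credit_balance <= 0:
        if b.auto_recharge_on:
            # Recharge is on but the balance is still empty, so the recharge itself failed.
            return "Auto-recharge failed: ask customer to update card"
        return "Balance empty: add credits and enable auto-recharge"
    if b.invoice_overdue:
        return "Invoice overdue: route to finance"
    return "Dashboard looks fine: wait 10 minutes, then escalate with account ID and timestamps"
```

The order of the checks mirrors the tree, so a failed card is reported before an empty balance even when both are true.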

How to fix it

Fix	Where	Time to Resolution
Update payment card	Dashboard → Billing → Payment methods	~5 min after payment clears
Add prepaid credits	Dashboard → Billing → Add credits	~2–5 minutes
Enable auto-recharge	Dashboard → Billing → Auto-recharge settings	Prevents future incidents

What to tell the customer

"Your payment method needs to be updated. Go to your provider dashboard under Billing → Payment methods, update your card, and API access should resume within a few minutes. I'd also recommend enabling auto-recharge so your balance never hits zero unexpectedly."

💚 Post-resolution: Always recommend enabling auto-recharge with a top-up threshold set to at least 2× the customer's average daily spend. This single setting prevents the majority of these tickets.
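
As a rough sketch of the 2× rule, the threshold can be computed from recent daily spend. The function name and the sample figures are made up for illustration; the daily totals would come from the customer's usage export.

```python
def recharge_threshold(daily_spend: list[float], multiplier: float = 2.0) -> float:
    """Suggest an auto-recharge threshold as a multiple of average daily spend."""
    if not daily_spend:
        raise ValueError("need at least one day of spend data")
    average = sum(daily_spend) / len(daily_spend)
    return round(average * multiplier, 2)


# Made-up sample: three days averaging $20/day gives a $40 threshold.
print(recharge_threshold([10.0, 20.0, 30.0]))  # 40.0
```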

Incident 2 — Unexpected High Bill

Customer's invoice is significantly higher than expected

What the customer says

"My bill last month was $40. This month it's $800. Nothing changed."
"I think I'm being charged incorrectly."
"We only have 200 users — how is this possible?"

What actually happened

Something in the customer's usage changed — even if they don't know what. In practice, 95% of high-bill tickets trace back to one of five root causes. Your job is to identify which one using the Usage dashboard.

Root Cause	What It Looks Like in Usage Dashboard	How Common
Runaway loop — bug calling API thousands of times	One day with a massive spike, thousands of requests in minutes	Very common
Model swap — switched to a more expensive model	Usage shifts to a pricier model mid-month	Very common
Context bloat — sending full documents instead of chunks	High token count per request, not high request count	Common
Retry storm — failed requests retrying without backoff	Clusters of identical requests at the same timestamps	Common
Dev key in production — test environment hitting real API	Usage spikes during business hours or CI/CD run times	Moderate

Planned vs Unplanned Model Changes — Know the Difference

Using multiple models intentionally for different tasks is one of the best cost strategies in AI engineering — not a problem. The issue is when a model change happens accidentally: a developer swaps a model name in one place without checking the pricing impact, and the bill spikes before anyone notices.

	✅ Intentional Multi-Model Routing	❌ Accidental Model Swap
What it is	Deliberately using cheap models for simple tasks, expensive ones for complex tasks	Someone changes a model name in code without checking pricing
Planned?	Yes — documented in architecture	No — discovered on the invoice
Is it a problem?	No — this is best practice	Yes — surprise bill with no warning
Example	Classification → economy model; complex reasoning → premium model	gpt-4o-mini quietly changed to gpt-4o in a config file

Smart Multi-Model Routing — Recommended Approach

Task Type	Recommended Model Tier	Why
Classification, routing, tagging, simple Q&A	Economy (e.g. gpt-4o-mini, claude-haiku, gemini-flash)	Doesn't need deep reasoning
Customer-facing chat, summarisation	Standard (e.g. gpt-4o, claude-sonnet, gemini-pro)	Good quality-to-cost balance
Complex analysis, code, legal/financial reasoning	Premium (e.g. o1, claude-opus, gemini-ultra)	Worth the cost when accuracy matters
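
One way to make routing deliberate is to keep the task-to-tier mapping in a single place, so a model change is a visible, reviewable edit. The sketch below assumes the example model names from the table; the task labels and the `pick_model` helper are hypothetical.

```python
# Model names are the examples from the table above; a real deployment
# would usually load these from environment variables or config.
MODEL_BY_TIER = {
    "economy": "gpt-4o-mini",
    "standard": "gpt-4o",
    "premium": "o1",
}

# Hypothetical task labels, chosen for illustration.
TIER_BY_TASK = {
    "classification": "economy",
    "tagging": "economy",
    "chat": "standard",
    "summarisation": "standard",
    "legal_reasoning": "premium",
}


def pick_model(task_type: str) -> str:
    """Return the model for a task, failing loudly on unmapped task types."""
    try:
        return MODEL_BY_TIER[TIER_BY_TASK[task_type]]
    except KeyError:
        raise ValueError(f"no tier assigned for task type {task_type!r}") from None
```

Raising on an unknown task type, rather than silently defaulting to some model, forces a pricing decision whenever a new kind of task is added.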

The Pricing Gap That Catches People Off Guard

Model Tier	Examples	Approx. Cost per 1M input tokens	Relative Cost
Economy / Lightweight	gpt-4o-mini, claude-haiku, gemini-flash	~$0.10–0.20	🟢 Cheapest
Standard	gpt-4o, claude-sonnet, gemini-pro	~$2.50–3.00	🟠 ~15–20× more
Premium / Reasoning	o1, claude-opus, gemini-ultra	~$15.00+	🔴 ~100× more

⚠️ Always direct customers to their provider's current pricing page — these numbers change as models evolve. Use the table above for illustration only.

How Context Bloat Compounds Cost

Same number of requests — very different cost:

  Request with 1K tokens:
  └── Cost on a standard model: ~$0.0025

  Request with 10K tokens (full document sent):
  └── Cost on a standard model: ~$0.025  ← 10× more expensive

  500 such requests/day × 30 days:
  ├── 1K tokens:  ~$37.50/month
  └── 10K tokens: ~$375.00/month  ← same traffic, 10× the bill

  FIX: Send only relevant chunks. Use retrieval (RAG).
       Summarize long docs with a cheap model before
       passing to an expensive one.
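
The arithmetic above can be reproduced in a few lines. This sketch counts input tokens only and uses the illustrative $2.50 per 1M tokens figure from the pricing table, not a quoted price.

```python
def monthly_cost(tokens_per_request: int, requests_per_day: int,
                 price_per_million: float, days: int = 30) -> float:
    """Estimate monthly input-token cost in dollars."""
    total_tokens = tokens_per_request * requests_per_day * days
    return total_tokens * price_per_million / 1_000_000


STANDARD_PRICE = 2.50  # dollars per 1M input tokens (illustrative only)

print(monthly_cost(1_000, 500, STANDARD_PRICE))   # 37.5
print(monthly_cost(10_000, 500, STANDARD_PRICE))  # 375.0
```

Output tokens are billed separately and often at a higher rate, so a real estimate would add a second term for them.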

Diagnosis Flow

Customer reports high bill
  │
  ├── Dashboard → Usage → set date range to billing period
  │     │
  │     ├── Single-day spike visible?
  │     │     → Likely runaway loop or retry storm.
  │     │       Are requests clustered by timestamp?
  │     │       Clustered      → retry storm (no exponential backoff)
  │     │       Spread but massive volume → runaway loop (code bug)
  │     │
  │     ├── Usage shifted to a more expensive model mid-month?
  │     │     → Model swap.
  │     │       Ask: "Did anyone on your team change the model name recently?"
  │     │
  │     ├── High token count per request?
  │     │     → Context bloat.
  │     │       Ask: "Are you sending full documents or just relevant sections?"
  │     │
  │     └── Usage spread evenly but higher overall?
  │           → Traffic grew OR dev key hitting production API.
  │             Ask: "Do you use the same API key in dev and production?"
  │
  └── Usage dashboard total matches the invoice?
        YES → Usage is legitimate. Explain pricing, suggest optimizations.
        NO  → Escalate with account ID, date range, and the discrepancy figures.
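
If the customer can export daily request counts, the "single-day spike" check can be done mechanically. The sketch below is one possible heuristic, assuming a plain date-to-count mapping; the 5× median cutoff is an arbitrary starting point, not a provider rule.

```python
from statistics import median


def find_spike_days(daily_requests: dict[str, int], factor: float = 5.0) -> list[str]:
    """Return the dates whose request count exceeds `factor` times the median day."""
    if not daily_requests:
        return []
    typical = median(daily_requests.values())
    return [day for day, count in daily_requests.items() if count > factor * typical]


# Made-up sample data: three normal days and one runaway day.
usage = {"2025-02-01": 100, "2025-02-02": 110, "2025-02-03": 90, "2025-02-04": 5000}
print(find_spike_days(usage))  # ['2025-02-04']
```

The median is used instead of the mean so that the spike itself does not inflate the baseline it is compared against.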

Post-Resolution Recommendations

Root Cause Found	Recommend
Runaway loop	Set a monthly hard spend limit. Add request-level logging.
Model swap	Lock model names to constants or environment variables. Review pricing on every model change.
Context bloat	Use retrieval-augmented generation (RAG). Send relevant chunks only.
Retry storm	Implement exponential backoff with jitter. Cap total retries per request.
Dev key in production	Separate API keys per environment. Set lower spend limits on dev keys.
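
For the retry-storm recommendation, a minimal sketch of exponential backoff with full jitter and a retry cap is shown below. `call_with_backoff` is a hypothetical helper, and catching every `Exception` is a simplification; production code would retry only on errors known to be transient, such as timeouts or 429 and 5xx responses.

```python
import random
import time


def call_with_backoff(fn, max_retries: int = 5, base_delay: float = 1.0,
                      max_delay: float = 30.0, sleep=time.sleep):
    """Call fn(), retrying on failure with exponential backoff and full jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # retry budget exhausted, surface the last error
            # The delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: pick a random delay up to the ceiling so that
            # many clients do not all retry at the same instant.
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter exists only so the helper can be tested without waiting; callers would normally leave it at the default.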

Incident 3 — Spending Limit Reached Without Warning

API stops mid-month — customer didn't realise a hard limit was set

What the customer says

"The API just stopped working. I have money in my account."
"I'm getting errors even though my balance is positive."
"It was fine yesterday — nothing changed."

What actually happened

Most AI providers allow users to set a monthly spending cap (hard limit). When this cap is reached, all API calls fail — even with a valid payment method and positive credit balance. This is a customer-configured safety feature, not a bug. The confusion usually happens because:

The limit was set a long time ago and forgotten
Usage grew beyond the original projection
A spike consumed the monthly budget faster than expected
The customer confused the soft limit (notification only) with the hard limit (cutoff)

Soft Limit vs Hard Limit — The Critical Difference

	Soft Limit	Hard Limit
What it does	Sends an email/alert notification when reached	Stops all API calls immediately when reached
Does it cut off the API?	No — API keeps working	Yes — API stops
Error seen when hit	No error — just a notification	429 or billing-related error
Best use	Early warning at 70–80% of budget	Circuit breaker at 100% of budget

How Limits Should Be Configured

Monthly budget: $500
  │
  ├── Soft limit: $375  (75%)
  │     → Notification sent: "You've used 75% of your budget"
  │     → API still works
  │     → Time to review: Is this expected? Should the limit be raised?
  │
  └── Hard limit: $500  (100%)
        → All API calls stop
        → Protects against runaway costs above the budget

  ┌──────────────────────────────────────────────────────┐
  │  $0          $375 (soft)          $500 (hard)         │
  │  ├──────────────┼───────────────────┤                 │
  │  │  SAFE ZONE   │   WARNING ZONE    │   API OFFLINE   │
  └──────────────────────────────────────────────────────┘
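
The three zones in the diagram can be expressed as a simple check. This is a sketch with hypothetical names; the dollar figures in the usage lines are the ones from the example budget above.

```python
def budget_status(spend: float, soft_limit: float, hard_limit: float) -> str:
    """Classify month-to-date spend into the safe, warning, or offline zone."""
    if soft_limit > hard_limit:
        raise ValueError("soft limit must not exceed hard limit")
    if spend >= hard_limit:
        return "api_offline"   # hard limit reached: calls are rejected
    if spend >= soft_limit:
        return "warning"       # notification sent, API still works
    return "safe"


print(budget_status(120.0, 375.0, 500.0))  # safe
print(budget_status(400.0, 375.0, 500.0))  # warning
print(budget_status(500.0, 375.0, 500.0))  # api_offline
```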

How to Fix It

Go to the provider's Billing → Spending limits settings and either:

Raise the hard limit to a higher value (takes effect immediately)
Wait for the monthly reset (usually the 1st of the calendar month)

⚠️ Before raising the limit: Check the Usage dashboard to confirm whether the spend was expected. If it's from a bug or spike, raising the limit without fixing the root cause just defers the problem.

What to Tell the Customer

"Your account has a monthly spending cap set, and you've reached it — that's why the API stopped. This is a safety feature you configured, not a bug. You can raise it in your billing settings. Before you do, I'd recommend checking your usage dashboard to confirm the spending was expected."

Incident 4 — Free Tier / Trial Credit Expiry

Credits ran out or expired — customer didn't expect it

What the customer says

"I just created my account and the API is already not working."
"I thought I had free credits — why am I getting errors?"
"It worked last week. Now I'm getting 402 errors."

What actually happened

New accounts on most AI providers receive a free credit grant. These credits have two ways to disappear: fully consumed, or expired (credits carry a time limit). When they're gone, behaviour changes in ways that aren't always obvious.

Situation	Error Seen	Fix
Free credits fully consumed	402 on all requests	Add a payment method in Billing settings
Free credits expired (time limit hit)	402 even if credits appeared available	Credits are gone; add a payment method
On free tier with very low rate limits	429 even at low request volume	Add payment method to move to paid tier
Upgraded to paid but limits feel unchanged	429 at low volume	Tier upgrades can take time to propagate; check current tier in dashboard

Free Tier vs Paid Tier — Why It Feels Broken

Tier	Who	Rate Limits	Notes
Free	New accounts, no payment method	Very restrictive (e.g. 3 RPM on premium models)	Fine for testing; not suitable for real applications
Paid Tier 1	Payment method added + minimum spend reached	Significantly higher	Most developers land here first
Paid Tier 2+	Based on cumulative spend history	Progressively higher	Limits increase automatically as spend grows

⚠️ Common confusion: A customer adds a payment method but still hits very low rate limits. Most providers require a payment method AND a minimum spend AND a minimum account age — all three conditions must be met before a tier upgrade is applied.
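
The "all three conditions" rule is easy to check explicitly when a customer asks why their limits have not changed. In the sketch below, the $5 minimum spend and 7-day minimum age are placeholder values; the real thresholds vary by provider and should be taken from its documentation.

```python
def tier_upgrade_blockers(has_payment_method: bool, total_spend: float,
                          account_age_days: int, min_spend: float = 5.0,
                          min_age_days: int = 7) -> list[str]:
    """List the conditions still blocking a tier upgrade (empty list = eligible)."""
    blockers = []
    if not has_payment_method:
        blockers.append("no payment method on file")
    if total_spend < min_spend:
        blockers.append(f"spend ${total_spend:.2f} is below the ${min_spend:.2f} minimum")
    if account_age_days < min_age_days:
        blockers.append(f"account is {account_age_days} days old, needs {min_age_days}")
    return blockers
```

Returning every unmet condition, rather than stopping at the first, lets you tell the customer everything that is still outstanding in one reply.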

What to Tell the Customer

"Your free credits have been used up or have expired. To continue, add a payment method in your billing settings. Once you meet the provider's tier criteria — typically a minimum spend and account age — you'll automatically move to a higher rate limit tier."

Incident 5 — Refund Request for Accidental Usage

Customer was charged for usage they say was unintentional

What the customer says

"I had a bug that made thousands of API calls — can I get a refund?"
"My account was compromised and someone used my API key."
"I forgot to turn off my dev environment."

The Refund Policy Reality — Set Expectations Early

Most AI providers have a no-refund policy for API usage because the compute was actually consumed. There is no automatic refund process. That said, some situations may qualify for a goodwill credit. Being honest with customers before they escalate saves everyone time.

Situation	Realistic Outcome	What Helps the Customer's Case
Bug caused a clear runaway loop	🟠 Possible goodwill credit	Application logs, timestamps, request IDs, evidence it was unintentional
Account compromised / key stolen	🟢 Usually resolved in customer's favour	Report immediately. Show usage inconsistent with normal activity (IPs, models, times).
Provider outage caused excessive retries	🟢 Usually credited	Reference the outage from the provider's status page with matching timestamps
Didn't realise a model was expensive	🔴 Very unlikely	Pricing is publicly listed
"Forgot to cancel" / dev env left running	🔴 Unlikely	This is what spend limits are for

How to Handle the Ticket

Customer submits refund request
  │
  ├── Evidence of account compromise?
  │     → YES: Flag as security incident.
  │             Ask customer to rotate API key immediately.
  │             Collect: unusual IPs, models used, timestamps.
  │             Escalate to security / trust & safety team.
  │
  ├── Matching provider outage at that time?
  │     → YES: Cross-reference with provider's status page.
  │             If confirmed, credit is likely appropriate. Escalate to billing team.
  │
  ├── Clear code bug with log evidence?
  │     → Collect: timestamps, request IDs, total requests vs. normal baseline.
  │       Escalate to billing team with evidence.
  │       Do NOT promise a refund — only the billing team can approve.
  │
  └── No clear evidence / "I just forgot"?
        → Empathise but set expectations honestly.
          Recommend: hard spend limit + auto-recharge threshold.
          Offer to help configure it.

Information to Collect Before Escalating

Info Needed	Why
Account / Org ID	Identifies the account for the billing team
Date range of charges in question	Narrows the investigation window
Request IDs if available	Allows billing team to trace exact usage
Description of what went wrong (customer's words)	Establishes intent and context
Supporting logs or screenshots	Evidence for goodwill consideration

🔴 Never promise a refund. Only the billing team can approve credits. Promising what you can't deliver creates a worse outcome than being upfront from the start.

What to Tell the Customer

"I understand this is frustrating. The general policy is that API usage is non-refundable since the compute was consumed, but I'll escalate this to our billing team with the details you've shared. They'll review it and follow up. In the meantime, I'd recommend setting a monthly spend limit so this can't happen again — I can walk you through that now if you'd like."

Incident 6 — Account Suspension

Account locked due to policy violation or fraud flag

What the customer says

"My account was suddenly disabled. I didn't do anything wrong."
"I'm getting 401 errors on a key that worked yesterday."
"I got an email saying my account violated usage policies but I don't understand why."

Why Accounts Get Suspended

Suspension Type	Common Triggers	Who Handles It
Automated — Policy violation	Usage patterns matching prohibited use cases, abuse detection	Trust & Safety team
Automated — Fraud flag	Suspicious payment method, unusual signup signals, sanctioned region	Trust & Safety / Finance
Manual — Policy violation	Reported abuse, investigation-triggered review	Trust & Safety team
Manual — Outstanding balance	Invoice not paid after repeated reminders	Finance / Billing team

Diagnosis Flow

Customer reports account suspended / 401 on all keys
  │
  ├── Can the customer log into the provider dashboard?
  │     │
  │     ├── Login WORKS but API fails
  │     │     → NOT an account suspension.
  │     │       This is a key-level issue.
  │     │       → Treat as Incident 1 or investigate API key directly.
  │     │
  │     └── Login FAILS
  │           → Account-level suspension confirmed. Continue below.
  │
  ├── Did the customer receive a suspension email?
  │     ├── YES — policy violation notice
  │     │     → Route to Trust & Safety.
  │     │       Do NOT reinstate at support level.
  │     │       Do NOT share what triggered the automated system.
  │     │
  │     ├── YES — payment / fraud notice
  │     │     → Outstanding invoice? Route to Finance.
  │     │       Fraud flag?          Route to Trust & Safety.
  │     │
  │     └── NO email received
  │           → Check internally if account is flagged.
  │             Could also be a key issue rather than true suspension.
  │
  └── Customer wants to appeal?
        → Direct to provider's official support/appeal process.
          Do NOT bypass or pre-approve reinstatement at support level.
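
A compact sketch of the routing logic above follows. The notice labels ("policy", "fraud", "invoice") are hypothetical shorthand for the type of suspension email the customer received.

```python
from typing import Optional


def route_suspension(login_works: bool, notice: Optional[str]) -> str:
    """Decide where a suspected-suspension ticket goes next."""
    if login_works:
        # Dashboard login succeeds, so the account itself is not suspended.
        return "Not a suspension: investigate the API key or billing (Incident 1)"
    if notice is None:
        return "Check internally whether the account is flagged"
    routes = {
        "policy": "Trust & Safety",
        "fraud": "Trust & Safety",
        "invoice": "Finance",
    }
    if notice not in routes:
        raise ValueError(f"unknown notice type: {notice!r}")
    return "Route to " + routes[notice]
```

No branch returns "reinstate": that decision stays with the owning team in every case.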

What You Can and Cannot Do

	Support Engineer CAN	Support Engineer CANNOT
Policy suspension	Confirm suspension, route to T&S, explain appeal process	Reinstate the account, share what triggered the suspension
Fraud flag	Confirm status, collect info, route to correct team	Lift the fraud flag, process reinstatement
Outstanding invoice	Confirm invoice exists, direct to payment, route to Finance	Waive the amount, manually reinstate

🔴 Do not reinstate suspended accounts at the support level. All reinstatements for policy or fraud-related suspensions must go through Trust & Safety. Bypassing this process creates liability.

What to Tell the Customer

"I can see your account has been suspended. I've escalated this to the appropriate team for review. You can also submit a formal appeal through the provider's support portal — include your account ID and a description of your use case. The team will review and respond. I'm not able to share details of what triggered the review, but the appeals team will have full context."

Master Decision Tree

Start here for every billing or account ticket. The error code is the most reliable entry point.

Billing or account ticket received
  │
  ├── STEP 1: Check the provider's status page
  │     Active incident? → Inform customer, monitor, close when resolved.
  │     No incident?     → Continue.
  │
  ├── STEP 2: What error is the customer seeing?
  │     │
  │     ├── 402 Payment Required
  │     │     ├── Balance $0 or card failed?    → Incident 1 (Payment Failure)
  │     │     ├── Free credits used or expired? → Incident 4 (Free Tier Expiry)
  │     │     └── Hard spending limit reached?  → Incident 3 (Spending Limit)
  │     │
  │     ├── 401 Unauthorized
  │     │     ├── Account suspended?            → Incident 6 (Account Suspension)
  │     │     └── Key issue (no suspension)?    → API key troubleshooting
  │     │
  │     ├── 403 Forbidden
  │     │     └── Free tier / model access?     → Incident 4 (Free Tier Expiry)
  │     │
  │     ├── No specific error / vague report
  │     │     ├── "Bill too high"               → Incident 2 (Unexpected High Bill)
  │     │     ├── "Want a refund"               → Incident 5 (Refund Request)
  │     │     └── "Account locked"              → Incident 6 (Account Suspension)
  │     │
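  │     └── 429 Too Many Requests
  │           → Usually a rate limit, NOT a billing issue.
  │             See the Rate Limits Runbook.
  │             Exceptions: a hard spending limit (Incident 3)
  │             or free-tier rate limits (Incident 4) can also return 429.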
  │
  └── STEP 3: After resolution
        → Send post-resolution recommendation (see each incident section above)
        → Log case notes: incident type, root cause, fix applied

✅ Support Engineer Troubleshooting Checklist

Work through this top to bottom for every billing or account ticket.

🔍 Step 1 — Initial Triage

[ ] Check the provider's status page for active incidents — stop here if one exists
[ ] Get the exact HTTP status code from the customer's logs (402, 401, 403, 429)
[ ] Get the exact error message from the response body (e.g. "insufficient_quota", "invalid_api_key")
[ ] Confirm the Account / Org ID (found in provider dashboard → Settings → Organization)
[ ] Get timestamp of last successful request and first failed request

💳 Step 2 — Billing Dashboard Check

[ ] Check payment method status — any red banners or declined payments?
[ ] Check credit balance — is it $0? Is auto-recharge enabled?
[ ] Check spending limits — has the hard limit been reached this month?
[ ] Check account tier — Free / Paid Tier 1 / Higher? Does it match what the customer expects?
[ ] Check for outstanding invoices (enterprise / invoice-billed accounts)

📊 Step 3 — Usage Investigation (for high-bill tickets)

[ ] Open Usage dashboard for the billing period in question
[ ] Look for a single-day spike — note the date
[ ] Filter by model — did usage shift to a more expensive model mid-month?
[ ] Check tokens per request — high count = context bloat
[ ] Confirm usage dashboard total matches invoice total — discrepancy? Escalate with both figures

🔒 Step 4 — Account Status Check (for 401 / suspension tickets)

[ ] Can the customer log into the provider dashboard? Login works but API fails = key issue, not suspension
[ ] Did the customer receive a suspension email? Policy violation? Fraud flag? Outstanding balance?
[ ] Verify the API key is organisation-level, not a personal key from a departed team member
[ ] For suspension: route to Trust & Safety — do NOT reinstate at support level

📋 Step 5 — Resolution & Close-out

[ ] Confirm API is working again before closing the ticket
[ ] Send the appropriate post-resolution recommendation based on root cause
[ ] Add case notes: incident type, root cause, fix applied, recommendation given
[ ] If escalated: confirm escalation was received with a follow-up timeline set for the customer

⚠️ Always — Safety & Escalation Rules

[ ] Never ask for a full API key — if the customer sends one, tell them to rotate it immediately
[ ] Never promise a refund — only the billing team can approve credits
[ ] Never reinstate a suspended account at the support level — all reinstatements go through Trust & Safety
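
To support the first rule, ticket text can be scanned for pasted keys before it is stored or forwarded. The sketch below assumes keys that start with `sk-` followed by at least 16 key-like characters; that pattern is an assumption and will not match every provider's key format.

```python
import re

# Assumed key shape: "sk-" prefix plus 16 or more letters, digits, "_" or "-".
KEY_PATTERN = re.compile(r"\bsk-[A-Za-z0-9_-]{16,}")


def redact_keys(text: str) -> tuple[str, bool]:
    """Mask anything that looks like an API key; report whether one was found."""
    redacted, count = KEY_PATTERN.subn(
        lambda m: m.group(0)[:6] + "...[redacted]", text
    )
    return redacted, count > 0
```

When the second return value is true, the reply to the customer should include the instruction to rotate the key immediately, since redaction does not undo the exposure.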

A general troubleshooting reference for support engineers working with AI API providers. Patterns apply across providers — OpenAI, Anthropic, Google, Cohere, and others follow similar billing models.
