Sindhu Murthy

Billing & Account Issues: A Support Engineer's Runbook

Who this is for: This runbook is a practical reference for support engineers and anyone preparing for a support engineering role with AI API providers. It covers the six most common billing incident types: how to diagnose them, how to fix them, and what to communicate to customers. The patterns here apply across providers, including OpenAI, Anthropic, Google, Cohere, and others.


⚡ Quick Reference

Match the customer's symptom to the incident type, then jump to that section.

| Customer Says | Jump To |
| --- | --- |
| 🚫 "My API calls suddenly stopped working" | Incident 1: Payment Failure / Credit Exhaustion |
| 😱 "My bill is way higher than I expected" | Incident 2: Unexpected High Bill |
| 📍 "I hit my limit and the API stopped" | Incident 3: Spending Limit Reached |
| ⏳ "My free credits ran out" | Incident 4: Free Tier / Trial Expiry |
| 💸 "I want a refund for accidental charges" | Incident 5: Refund Request |
| 🔒 "My account has been suspended / locked" | Incident 6: Account Suspension |

🔵 Before anything else: Always check the provider's status page first (e.g. status.openai.com, status.anthropic.com). If there is an active incident, that is your answer — inform the customer and monitor. Do not proceed further until you have ruled out a provider-side outage.


Contents

  1. Incident 1 — Payment Failure / Credit Exhaustion
  2. Incident 2 — Unexpected High Bill
  3. Incident 3 — Spending Limit Reached Without Warning
  4. Incident 4 — Free Tier / Trial Credit Expiry
  5. Incident 5 — Refund Request for Accidental Usage
  6. Incident 6 — Account Suspension
  7. Master Decision Tree
  8. Support Engineer Troubleshooting Checklist

Incident 1: Payment Failure / Credit Exhaustion

Error: 402 Payment Required — API access stops immediately

What the customer says

  • "My API calls were working fine and then suddenly stopped."
  • "I'm getting 402 errors on every request."
  • "Nothing in my code changed."

What actually happened

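AI API providers typically stop access immediately, with no grace period, when a payment fails or a prepaid credit balance hits $0. Unlike a SaaS subscription that might give you days to fix a payment issue, the API cuts off the moment the billing system flags a failure. The exact status code varies by provider: some return 429 with a quota error in the response body (e.g. "insufficient_quota") rather than 402, so always confirm against the error message.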

| Root Cause | How to Confirm It |
| --- | --- |
| Card expired or declined | Provider dashboard → Billing → red banner or failed payment status |
| Prepaid credit balance at $0 | Dashboard → Billing → credit balance shows $0.00 |
| Auto-recharge enabled but card declined | Dashboard → Billing → payment history → failed recharge entry |
| Invoice overdue (enterprise accounts) | Dashboard → Billing → Invoices → unpaid invoice |

Diagnosis Flow

402 Payment Required
  │
  ├── Provider Dashboard → Billing section
  │     │
  │     ├── Red banner or "Payment failed" message?
  │     │     └── Card expired / declined
  │     │           FIX: Ask customer to update card.
  │     │               Dashboard → Billing → Payment methods → Update
  │     │               API resumes within ~5 minutes of payment clearing.
  │     │
  │     ├── Credit balance shows $0?
  │     │     ├── Auto-recharge OFF
  │     │     │     FIX: Add credits manually + enable auto-recharge.
  │     │     │
  │     │     └── Auto-recharge ON but balance still $0
  │     │           → The recharge itself failed (card issue).
  │     │             FIX: Same as expired/declined card above.
  │     │
  │     └── Invoice overdue? (enterprise customers)
  │           FIX: Route to finance team for payment processing.
  │
  └── Dashboard looks fine — balance > $0, card valid?
        → Rare sync delay. Wait 10 minutes.
          Still failing? Escalate with account ID + timestamps.
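The diagnosis flow above can be sketched as a small decision function. This is only an illustration: `BillingSnapshot` and its fields are hypothetical stand-ins for what you read off the Billing page, not a real provider API, and the returned labels are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class BillingSnapshot:
    """Hypothetical summary of the Billing page (not a provider API)."""
    payment_failed: bool      # red banner or "Payment failed" status
    credit_balance: float     # prepaid balance in dollars
    auto_recharge_on: bool
    invoice_overdue: bool     # enterprise / invoice-billed accounts


def diagnose_402(s: BillingSnapshot) -> str:
    """Walk the 402 flow in the same order as the diagram."""
    if s.payment_failed:
        return "card_expired_or_declined"
    if s.credit_balance <= 0:
        # Auto-recharge ON with a $0 balance means the recharge itself failed.
        return "recharge_failed" if s.auto_recharge_on else "credits_exhausted"
    if s.invoice_overdue:
        return "invoice_overdue"
    # Dashboard looks healthy: wait, then escalate with account ID + timestamps.
    return "possible_sync_delay"
```

Checking the payment failure first mirrors the diagram: a declined card explains both a red banner and a failed recharge, so it is the cheapest thing to rule out.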

How to fix it

| Fix | Where | Time to Resolution |
| --- | --- | --- |
| Update payment card | Dashboard → Billing → Payment methods | ~5 min after payment clears |
| Add prepaid credits | Dashboard → Billing → Add credits | ~2–5 minutes |
| Enable auto-recharge | Dashboard → Billing → Auto-recharge settings | Prevents future incidents |

What to tell the customer

"Your payment method needs to be updated. Go to your provider dashboard under Billing → Payment methods, update your card, and API access should resume within a few minutes. I'd also recommend enabling auto-recharge so your balance never hits zero unexpectedly."

💚 Post-resolution: Always recommend enabling auto-recharge with a top-up threshold set to at least 2× the customer's average daily spend. This single setting prevents the majority of these tickets.
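As a rough illustration of the 2× rule, the threshold can be derived from a list of recent daily spend figures. The function name and the input format are assumptions for this sketch; where the daily figures come from (a usage export, the dashboard) depends on the provider.

```python
def recharge_threshold(daily_spend: list[float], multiplier: float = 2.0) -> float:
    """Suggest an auto-recharge threshold as a multiple of average daily spend."""
    if not daily_spend:
        raise ValueError("need at least one day of spend data")
    average = sum(daily_spend) / len(daily_spend)
    return round(average * multiplier, 2)


# Three days of spend averaging $20/day suggests a $40 threshold.
print(recharge_threshold([10.0, 20.0, 30.0]))
```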


Incident 2: Unexpected High Bill

Customer's invoice is significantly higher than expected

What the customer says

  • "My bill last month was $40. This month it's $800. Nothing changed."
  • "I think I'm being charged incorrectly."
  • "We only have 200 users — how is this possible?"

What actually happened

Something in the customer's usage changed — even if they don't know what. In practice, 95% of high-bill tickets trace back to one of five root causes. Your job is to identify which one using the Usage dashboard.

| Root Cause | What It Looks Like in Usage Dashboard | How Common |
| --- | --- | --- |
| Runaway loop: a bug calling the API thousands of times | One day with a massive spike, thousands of requests in minutes | Very common |
| Model swap: switched to a more expensive model | Usage shifts to a pricier model mid-month | Very common |
| Context bloat: sending full documents instead of chunks | High token count per request, not high request count | Common |
| Retry storm: failed requests retrying without backoff | Clusters of identical requests at the same timestamps | Common |
| Dev key in production: test environment hitting the real API | Usage spikes during business hours or CI/CD run times | Moderate |

Planned vs Unplanned Model Changes — Know the Difference

Using multiple models intentionally for different tasks is one of the best cost strategies in AI engineering — not a problem. The issue is when a model change happens accidentally: a developer swaps a model name in one place without checking the pricing impact, and the bill spikes before anyone notices.

| | ✅ Intentional Multi-Model Routing | ❌ Accidental Model Swap |
| --- | --- | --- |
| What it is | Deliberately using cheap models for simple tasks, expensive ones for complex tasks | Someone changes a model name in code without checking pricing |
| Planned? | Yes, documented in architecture | No, discovered on the invoice |
| Is it a problem? | No, this is best practice | Yes, a surprise bill with no warning |
| Example | Classification → economy model; complex reasoning → premium model | gpt-4o-mini quietly changed to gpt-4o in a config file |

Smart Multi-Model Routing — Recommended Approach

| Task Type | Recommended Model Tier | Why |
| --- | --- | --- |
| Classification, routing, tagging, simple Q&A | Economy (e.g. gpt-4o-mini, claude-haiku, gemini-flash) | Doesn't need deep reasoning |
| Customer-facing chat, summarisation | Standard (e.g. gpt-4o, claude-sonnet, gemini-pro) | Good quality-to-cost balance |
| Complex analysis, code, legal/financial reasoning | Premium (e.g. o1, claude-opus, gemini-ultra) | Worth the cost when accuracy matters |

The Pricing Gap That Catches People Off Guard

| Model Tier | Examples | Approx. Cost per 1M Input Tokens | Relative Cost |
| --- | --- | --- | --- |
| Economy / Lightweight | gpt-4o-mini, claude-haiku, gemini-flash | ~$0.10–0.20 | 🟢 Cheapest |
| Standard | gpt-4o, claude-sonnet, gemini-pro | ~$2.50–3.00 | 🟠 ~15–20× more |
| Premium / Reasoning | o1, claude-opus, gemini-ultra | ~$15.00+ | 🔴 ~100× more |

⚠️ Always direct customers to their provider's current pricing page — these numbers change as models evolve. Use the table above for illustration only.

How Context Bloat Compounds Cost

Same number of requests — very different cost:

  Request with 1K tokens:
  └── Cost on a standard model: ~$0.0025

  Request with 10K tokens (full document sent):
  └── Cost on a standard model: ~$0.025  ← 10× more expensive

  500 such requests/day × 30 days:
  ├── 1K tokens:  ~$37.50/month
  └── 10K tokens: ~$375.00/month  ← same traffic, 10× the bill

  FIX: Send only relevant chunks. Use retrieval (RAG).
       Summarize long docs with a cheap model before
       passing to an expensive one.
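The arithmetic above is easy to reproduce when walking a customer through their bill. The sketch below assumes the illustrative standard-tier price of $2.50 per 1M input tokens and counts input tokens only; `monthly_cost` is a hypothetical helper, and real invoices also include output tokens.

```python
def monthly_cost(tokens_per_request: int, requests_per_day: int,
                 price_per_million: float, days: int = 30) -> float:
    """Input-token cost for a steady daily request volume."""
    per_request = tokens_per_request / 1_000_000 * price_per_million
    return per_request * requests_per_day * days


small = monthly_cost(1_000, 500, 2.50)    # chunked context
large = monthly_cost(10_000, 500, 2.50)   # full document sent each time
print(f"1K tokens:  ${small:.2f}/month")
print(f"10K tokens: ${large:.2f}/month")
```

Swapping in the customer's real token counts and the current price from the provider's pricing page gives a quick sanity check on whether context size alone explains the bill.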

Diagnosis Flow

Customer reports high bill
  │
  ├── Dashboard → Usage → set date range to billing period
  │     │
  │     ├── Single-day spike visible?
  │     │     → Likely runaway loop or retry storm.
  │     │       Are requests clustered by timestamp?
  │     │       Clustered      → retry storm (no exponential backoff)
  │     │       Spread but massive volume → runaway loop (code bug)
  │     │
  │     ├── Usage shifted to a more expensive model mid-month?
  │     │     → Model swap.
  │     │       Ask: "Did anyone on your team change the model name recently?"
  │     │
  │     ├── High token count per request?
  │     │     → Context bloat.
  │     │       Ask: "Are you sending full documents or just relevant sections?"
  │     │
  │     └── Usage spread evenly but higher overall?
  │           → Traffic grew OR dev key hitting production API.
  │             Ask: "Do you use the same API key in dev and production?"
  │
  └── Usage dashboard total matches the invoice?
        YES → Usage is legitimate. Explain pricing, suggest optimizations.
        NO  → Escalate with account ID, date range, and the discrepancy figures.
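If the customer can export daily request counts, the "single-day spike" check in the flow above can be roughed out in a few lines. Everything here is an assumption made for illustration: the `spike_days` name, the date-to-count dictionary, and the 5× median threshold are not provider features.

```python
from statistics import median


def spike_days(daily_requests: dict[str, int], factor: float = 5.0) -> list[str]:
    """Return days whose request count exceeds `factor` times the median day."""
    baseline = median(daily_requests.values())
    if baseline <= 0:
        return []
    return [day for day, count in daily_requests.items()
            if count > factor * baseline]


usage = {
    "2024-05-01": 100,
    "2024-05-02": 120,
    "2024-05-03": 9000,   # suspicious day
    "2024-05-04": 110,
}
print(spike_days(usage))
```

The median is used as the baseline instead of the mean so that the spike itself does not inflate the value it is compared against.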

Post-Resolution Recommendations

| Root Cause Found | Recommend |
| --- | --- |
| Runaway loop | Set a monthly hard spend limit. Add request-level logging. |
| Model swap | Lock model names to constants or environment variables. Review pricing on every model change. |
| Context bloat | Use retrieval-augmented generation (RAG). Send relevant chunks only. |
| Retry storm | Implement exponential backoff with jitter. Cap total retries per request. |
| Dev key in production | Separate API keys per environment. Set lower spend limits on dev keys. |
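When a customer asks what "exponential backoff with jitter" and a retry cap look like in practice, a minimal sketch along these lines may help. `TransientError` and `call_with_backoff` are hypothetical names, the delay values are arbitrary defaults, and the jitter variant shown is "full jitter" (a random delay between zero and the exponential cap); the customer's SDK may already ship its own retry logic.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (timeouts, 5xx, rate limits)."""


def call_with_backoff(fn, max_retries=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying TransientError with exponential backoff and full jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries:
                raise  # retry budget spent: surface the error instead of looping
            # Cap doubles each attempt (0.5s, 1s, 2s, ...) up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # random delay spreads clients apart
```

The hard cap on attempts is what prevents a retry storm from turning into a runaway bill: after `max_retries` failures the error propagates instead of generating more billable requests.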

Incident 3: Spending Limit Reached Without Warning

API stops mid-month — customer didn't realise a hard limit was set

What the customer says

  • "The API just stopped working. I have money in my account."
  • "I'm getting errors even though my balance is positive."
  • "It was fine yesterday — nothing changed."

What actually happened

Most AI providers allow users to set a monthly spending cap (hard limit). When this cap is reached, all API calls fail — even with a valid payment method and positive credit balance. This is a customer-configured safety feature, not a bug. The confusion usually happens because:

  • The limit was set a long time ago and forgotten
  • Usage grew beyond the original projection
  • A spike consumed the monthly budget faster than expected
  • The customer confused the soft limit (notification only) with the hard limit (cutoff)

Soft Limit vs Hard Limit — The Critical Difference

| | Soft Limit | Hard Limit |
| --- | --- | --- |
| What it does | Sends an email/alert notification when reached | Stops all API calls immediately when reached |
| Does it cut off the API? | No, the API keeps working | Yes, the API stops |
| Error seen when hit | No error, just a notification | 429 or billing-related error |
| Best use | Early warning at 70–80% of budget | Circuit breaker at 100% of budget |

How Limits Should Be Configured

Monthly budget: $500
  │
  ├── Soft limit: $375  (75%)
  │     → Notification sent: "You've used 75% of your budget"
  │     → API still works
  │     → Time to review: Is this expected? Should the limit be raised?
  │
  └── Hard limit: $500  (100%)
        → All API calls stop
        → Protects against runaway costs above the budget

  ┌──────────────────────────────────────────────────────┐
  │  $0          $375 (soft)          $500 (hard)         │
  │  ├──────────────┼───────────────────┤                 │
  │  │  SAFE ZONE   │   WARNING ZONE    │   API OFFLINE   │
  └──────────────────────────────────────────────────────┘
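The three zones in the diagram above map directly onto a small helper, shown here as a sketch. `spend_zone` and its return labels are hypothetical; the point is only that the soft limit triggers a warning while the hard limit is the cutoff.

```python
def spend_zone(spend: float, soft_limit: float, hard_limit: float) -> str:
    """Classify month-to-date spend against the soft and hard limits."""
    if soft_limit > hard_limit:
        raise ValueError("soft limit should not exceed the hard limit")
    if spend >= hard_limit:
        return "api_offline"   # hard limit reached: all calls fail
    if spend >= soft_limit:
        return "warning"       # notification sent, API still works
    return "safe"


# $500 budget with a soft limit at 75%.
for spend in (100, 375, 500):
    print(spend, spend_zone(spend, soft_limit=375, hard_limit=500))
```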

How to Fix It

Go to the provider's Billing → Spending limits settings and either:

  • Raise the hard limit to a higher value (takes effect immediately)
  • Wait for the monthly reset (usually the 1st of the calendar month)

⚠️ Before raising the limit: Check the Usage dashboard to confirm whether the spend was expected. If it's from a bug or spike, raising the limit without fixing the root cause just defers the problem.

What to Tell the Customer

"Your account has a monthly spending cap set, and you've reached it — that's why the API stopped. This is a safety feature you configured, not a bug. You can raise it in your billing settings. Before you do, I'd recommend checking your usage dashboard to confirm the spending was expected."


Incident 4: Free Tier / Trial Credit Expiry

Credits ran out or expired — customer didn't expect it

What the customer says

  • "I just created my account and the API is already not working."
  • "I thought I had free credits — why am I getting errors?"
  • "It worked last week. Now I'm getting 402 errors."

What actually happened

New accounts on most AI providers receive a free credit grant. These credits have two ways to disappear: fully consumed, or expired (credits carry a time limit). When they're gone, behaviour changes in ways that aren't always obvious.

| Situation | Error Seen | Fix |
| --- | --- | --- |
| Free credits fully consumed | 402 on all requests | Add a payment method in Billing settings |
| Free credits expired (time limit hit) | 402 even if credits appeared available | Credits are gone; add a payment method |
| On free tier with very low rate limits | 429 even at low request volume | Add a payment method to move to a paid tier |
| Upgraded to paid but limits feel unchanged | 429 at low volume | Tier upgrades can take time to propagate; check the current tier in the dashboard |

Free Tier vs Paid Tier — Why It Feels Broken

| Tier | Who | Rate Limits | Notes |
| --- | --- | --- | --- |
| Free | New accounts, no payment method | Very restrictive (e.g. 3 RPM on premium models) | Fine for testing; not suitable for real applications |
| Paid Tier 1 | Payment method added + minimum spend reached | Significantly higher | Most developers land here first |
| Paid Tier 2+ | Based on cumulative spend history | Progressively higher | Limits increase automatically as spend grows |

⚠️ Common confusion: A customer adds a payment method but still hits very low rate limits. Most providers require a payment method AND a minimum spend AND a minimum account age — all three conditions must be met before a tier upgrade is applied.
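The three-condition rule can be expressed as a quick check that tells you which requirement a customer is still missing. This is a sketch only: the function name is hypothetical, and the minimum spend and account age are passed in by the caller because the actual thresholds differ per provider and tier.

```python
def missing_tier_requirements(has_payment_method: bool, total_spend: float,
                              account_age_days: int, min_spend: float,
                              min_age_days: int) -> list[str]:
    """List the upgrade conditions an account has not met yet."""
    missing = []
    if not has_payment_method:
        missing.append("payment method")
    if total_spend < min_spend:
        missing.append("minimum spend")
    if account_age_days < min_age_days:
        missing.append("minimum account age")
    return missing


# Card on file, but only $2 spent on a 3-day-old account
# (the $5 / 7-day thresholds are made-up example values).
print(missing_tier_requirements(True, 2.0, 3, min_spend=5.0, min_age_days=7))
```

An empty list means all three conditions are met; if limits still look unchanged at that point, the upgrade may simply not have propagated yet.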

What to Tell the Customer

"Your free credits have been used up or have expired. To continue, add a payment method in your billing settings. Once you meet the provider's tier criteria — typically a minimum spend and account age — you'll automatically move to a higher rate limit tier."


Incident 5: Refund Request for Accidental Usage

Customer was charged for usage they say was unintentional

What the customer says

  • "I had a bug that made thousands of API calls — can I get a refund?"
  • "My account was compromised and someone used my API key."
  • "I forgot to turn off my dev environment."

The Refund Policy Reality — Set Expectations Early

Most AI providers have a no-refund policy for API usage because the compute was actually consumed. There is no automatic refund process. That said, some situations may qualify for a goodwill credit. Being honest with customers before they escalate saves everyone time.

| Situation | Realistic Outcome | What Helps the Customer's Case |
| --- | --- | --- |
| Bug caused a clear runaway loop | 🟠 Possible goodwill credit | Application logs, timestamps, request IDs, evidence it was unintentional |
| Account compromised / key stolen | 🟢 Usually resolved in customer's favour | Report immediately. Show usage inconsistent with normal activity (IPs, models, times). |
| Provider outage caused excessive retries | 🟢 Usually credited | Reference the outage from the provider's status page with matching timestamps |
| Didn't realise a model was expensive | 🔴 Very unlikely | Pricing is publicly listed |
| "Forgot to cancel" / dev env left running | 🔴 Unlikely | This is what spend limits are for |

How to Handle the Ticket

Customer submits refund request
  │
  ├── Evidence of account compromise?
  │     → YES: Flag as security incident.
  │             Ask customer to rotate API key immediately.
  │             Collect: unusual IPs, models used, timestamps.
  │             Escalate to security / trust & safety team.
  │
  ├── Matching provider outage at that time?
  │     → YES: Cross-reference with provider's status page.
  │             If confirmed, credit is likely appropriate. Escalate to billing team.
  │
  ├── Clear code bug with log evidence?
  │     → Collect: timestamps, request IDs, total requests vs. normal baseline.
  │       Escalate to billing team with evidence.
  │       Do NOT promise a refund — only the billing team can approve.
  │
  └── No clear evidence / "I just forgot"?
        → Empathise but set expectations honestly.
          Recommend: hard spend limit + auto-recharge threshold.
          Offer to help configure it.

Information to Collect Before Escalating

| Info Needed | Why |
| --- | --- |
| Account / Org ID | Identifies the account for the billing team |
| Date range of charges in question | Narrows the investigation window |
| Request IDs if available | Allows billing team to trace exact usage |
| Description of what went wrong (customer's words) | Establishes intent and context |
| Supporting logs or screenshots | Evidence for goodwill consideration |
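One way to avoid bouncing an escalation back and forth is to check it for completeness before sending. The sketch below models the items listed above as a simple record; the class, field names, and the choice to treat everything except request IDs as required are assumptions for illustration, not a real ticketing schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefundEscalation:
    """Hypothetical escalation record for the billing team."""
    account_id: Optional[str] = None
    date_range: Optional[str] = None
    customer_description: Optional[str] = None
    evidence: Optional[str] = None       # logs or screenshots
    request_ids: Optional[str] = None    # "if available", so not required


REQUIRED = ("account_id", "date_range", "customer_description", "evidence")


def missing_fields(escalation: RefundEscalation) -> list[str]:
    """Return the required fields that are still empty."""
    return [name for name in REQUIRED if not getattr(escalation, name)]


ticket = RefundEscalation(account_id="org-123", date_range="2024-05-01 to 2024-05-03")
print(missing_fields(ticket))
```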

🔴 Never promise a refund. Only the billing team can approve credits. Promising what you can't deliver creates a worse outcome than being upfront from the start.

What to Tell the Customer

"I understand this is frustrating. The general policy is that API usage is non-refundable since the compute was consumed, but I'll escalate this to our billing team with the details you've shared. They'll review it and follow up. In the meantime, I'd recommend setting a monthly spend limit so this can't happen again — I can walk you through that now if you'd like."


Incident 6: Account Suspension

Account locked due to policy violation or fraud flag

What the customer says

  • "My account was suddenly disabled. I didn't do anything wrong."
  • "I'm getting 401 errors on a key that worked yesterday."
  • "I got an email saying my account violated usage policies but I don't understand why."

Why Accounts Get Suspended

| Suspension Type | Common Triggers | Who Handles It |
| --- | --- | --- |
| Automated: policy violation | Usage patterns matching prohibited use cases, abuse detection | Trust & Safety team |
| Automated: fraud flag | Suspicious payment method, unusual signup signals, sanctioned region | Trust & Safety / Finance |
| Manual: policy violation | Reported abuse, investigation-triggered review | Trust & Safety team |
| Manual: outstanding balance | Invoice not paid after repeated reminders | Finance / Billing team |

Diagnosis Flow

Customer reports account suspended / 401 on all keys
  │
  ├── Can the customer log into the provider dashboard?
  │     │
  │     ├── Login WORKS but API fails
  │     │     → NOT an account suspension.
  │     │       This is a key-level issue.
  │     │       → Treat as Incident 1 or investigate API key directly.
  │     │
  │     └── Login FAILS
  │           → Account-level suspension confirmed. Continue below.
  │
  ├── Did the customer receive a suspension email?
  │     ├── YES — policy violation notice
  │     │     → Route to Trust & Safety.
  │     │       Do NOT reinstate at support level.
  │     │       Do NOT share what triggered the automated system.
  │     │
  │     ├── YES — payment / fraud notice
  │     │     → Outstanding invoice? Route to Finance.
  │     │       Fraud flag?          Route to Trust & Safety.
  │     │
  │     └── NO email received
  │           → Check internally if account is flagged.
  │             Could also be a key issue rather than true suspension.
  │
  └── Customer wants to appeal?
        → Direct to provider's official support/appeal process.
          Do NOT bypass or pre-approve reinstatement at support level.
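The routing logic in the flow above can be summarised in a short function. This is a sketch under assumptions: `route_suspension`, the notice labels ("policy", "fraud", "invoice"), and the returned team names are hypothetical and would need to match your own queue names.

```python
from typing import Optional


def route_suspension(dashboard_login_works: bool, notice: Optional[str]) -> str:
    """Pick the owning team for a reported suspension.

    notice is the type of email the customer received:
    "policy", "fraud", "invoice", or None if no email arrived.
    """
    if dashboard_login_works:
        # Login works but the API fails: key-level issue, not a suspension.
        return "api_key_troubleshooting"
    if notice in ("policy", "fraud"):
        return "trust_and_safety"   # support must not reinstate
    if notice == "invoice":
        return "finance"
    # No email: check internally whether the account is flagged.
    return "internal_flag_check"
```

Note that no branch returns anything like "reinstate": the function only routes, which matches the rule that support never lifts a suspension itself.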

What You Can and Cannot Do

| | Support Engineer CAN | Support Engineer CANNOT |
| --- | --- | --- |
| Policy suspension | Confirm suspension, route to T&S, explain appeal process | Reinstate the account, share what triggered the suspension |
| Fraud flag | Confirm status, collect info, route to correct team | Lift the fraud flag, process reinstatement |
| Outstanding invoice | Confirm invoice exists, direct to payment, route to Finance | Waive the amount, manually reinstate |

🔴 Do not reinstate suspended accounts at the support level. All reinstatements for policy or fraud-related suspensions must go through Trust & Safety. Bypassing this process creates liability.

What to Tell the Customer

"I can see your account has been suspended. I've escalated this to the appropriate team for review. You can also submit a formal appeal through the provider's support portal — include your account ID and a description of your use case. The team will review and respond. I'm not able to share details of what triggered the review, but the appeals team will have full context."


Master Decision Tree

Start here for every billing or account ticket. The error code is the most reliable entry point.

Billing or account ticket received
  │
  ├── STEP 1: Check the provider's status page
  │     Active incident? → Inform customer, monitor, close when resolved.
  │     No incident?     → Continue.
  │
  ├── STEP 2: What error is the customer seeing?
  │     │
  │     ├── 402 Payment Required
  │     │     ├── Balance $0 or card failed?    → Incident 1 (Payment Failure)
  │     │     └── Hard spending limit reached?  → Incident 3 (Spending Limit)
  │     │
  │     ├── 401 Unauthorized
  │     │     ├── Account suspended?            → Incident 6 (Account Suspension)
  │     │     └── Key issue (no suspension)?    → API key troubleshooting
  │     │
  │     ├── 403 Forbidden
  │     │     └── Free tier / model access?     → Incident 4 (Free Tier Expiry)
  │     │
  │     ├── No specific error / vague report
  │     │     ├── "Bill too high"               → Incident 2 (Unexpected High Bill)
  │     │     ├── "Want a refund"               → Incident 5 (Refund Request)
  │     │     └── "Account locked"              → Incident 6 (Account Suspension)
  │     │
  │     └── 429 Too Many Requests
  │           → Usually NOT a billing issue.
  │             See the Rate Limits Runbook.
  │             Exception: hard limit reached (Incident 3)
  │             or free-tier limits (Incident 4).
  │
  └── STEP 3: After resolution
        → Send post-resolution recommendation (see each incident section above)
        → Log case notes: incident type, root cause, fix applied
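Step 2 of the tree is essentially a lookup from status code (or complaint keyword) to the sections worth checking. The sketch below encodes that lookup; the `triage` function, the keyword matching, and the fallback message are hypothetical, and Step 1 (checking the status page) still has to happen before any of this.

```python
NEXT_STEPS = {
    402: ["Incident 1 (Payment Failure)", "Incident 3 (Spending Limit)"],
    401: ["Incident 6 (Account Suspension)", "API key troubleshooting"],
    403: ["Incident 4 (Free Tier Expiry)"],
    429: ["Rate Limits Runbook"],
}

# Crude keyword hints for tickets that arrive without an error code.
KEYWORDS = {
    "bill": "Incident 2 (Unexpected High Bill)",
    "refund": "Incident 5 (Refund Request)",
    "locked": "Incident 6 (Account Suspension)",
}


def triage(status_code=None, complaint=""):
    """Return the runbook sections to check, most likely first."""
    if status_code in NEXT_STEPS:
        return NEXT_STEPS[status_code]
    text = complaint.lower()
    matches = [incident for word, incident in KEYWORDS.items() if word in text]
    return matches or ["Ask for the exact status code and error body"]


print(triage(402))
print(triage(complaint="My bill is way higher than I expected"))
```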

✅ Support Engineer Troubleshooting Checklist

Work through this top to bottom for every billing or account ticket.


🔍 Step 1 — Initial Triage

  • [ ] Check the provider's status page for active incidents — stop here if one exists
  • [ ] Get the exact HTTP status code from the customer's logs (402, 401, 403, 429)
  • [ ] Get the exact error message from the response body (e.g. "insufficient_quota", "invalid_api_key")
  • [ ] Confirm the Account / Org ID (found in provider dashboard → Settings → Organization)
  • [ ] Get timestamp of last successful request and first failed request

💳 Step 2 — Billing Dashboard Check

  • [ ] Check payment method status — any red banners or declined payments?
  • [ ] Check credit balance — is it $0? Is auto-recharge enabled?
  • [ ] Check spending limits — has the hard limit been reached this month?
  • [ ] Check account tier — Free / Paid Tier 1 / Higher? Does it match what the customer expects?
  • [ ] Check for outstanding invoices (enterprise / invoice-billed accounts)

📊 Step 3 — Usage Investigation (for high-bill tickets)

  • [ ] Open Usage dashboard for the billing period in question
  • [ ] Look for a single-day spike — note the date
  • [ ] Filter by model — did usage shift to a more expensive model mid-month?
  • [ ] Check tokens per request — high count = context bloat
  • [ ] Confirm usage dashboard total matches invoice total — discrepancy? Escalate with both figures

🔒 Step 4 — Account Status Check (for 401 / suspension tickets)

  • [ ] Can the customer log into the provider dashboard? Login works but API fails = key issue, not suspension
  • [ ] Did the customer receive a suspension email? Policy violation? Fraud flag? Outstanding balance?
  • [ ] Verify the API key is organisation-level, not a personal key from a departed team member
  • [ ] For suspension: route to Trust & Safety — do NOT reinstate at support level

📋 Step 5 — Resolution & Close-out

  • [ ] Confirm API is working again before closing the ticket
  • [ ] Send the appropriate post-resolution recommendation based on root cause
  • [ ] Add case notes: incident type, root cause, fix applied, recommendation given
  • [ ] If escalated: confirm escalation was received with a follow-up timeline set for the customer

⚠️ Always — Safety & Escalation Rules

  • [ ] Never ask for a full API key — if the customer sends one, tell them to rotate it immediately
  • [ ] Never promise a refund — only the billing team can approve credits
  • [ ] Never reinstate a suspended account at the support level — all reinstatements go through Trust & Safety
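If a customer does paste a full key into a ticket, it helps to mask it before the text is copied into case notes (the key still needs rotating). The sketch below is illustrative only: the `sk-` prefix and the character set in the pattern are assumptions about key format, so adjust the pattern to the provider you support.

```python
import re

# Assumed key shape: "sk-" followed by 12+ URL-safe characters.
# Keeps the prefix and first 4 characters so the key can still be identified.
KEY_PATTERN = re.compile(r"\b(sk-[A-Za-z0-9_-]{4})[A-Za-z0-9_-]{8,}")


def redact_keys(text: str) -> str:
    """Mask anything that looks like an API key in free-form ticket text."""
    return KEY_PATTERN.sub(r"\1[REDACTED]", text)


print(redact_keys("my key is sk-abcd1234efgh5678ijkl and it stopped working"))
```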

A general troubleshooting reference for support engineers working with AI API providers. Patterns apply across providers — OpenAI, Anthropic, Google, Cohere, and others follow similar billing models.
