DEV Community

Shaiful Islam Shabuj


GitHub's new AI Credit Usage Report: ~3 hours cost 7,000 AI credits plus a $10 USD overage


GitHub's new AI Credit Usage Report: DevLoop Session

Session: waymark-devloop-controlled

Date: 2026-06-03 (22:18–01:14 KST / 13:18–16:14 UTC)

Model: claude-sonnet-4.6

Extra cost incurred: ~$10 USD overage

DevLoop: A custom multi-agent AI development pipeline for software development tasks, built on top of Copilot and Claude.

Teststop: A tool that runs the full test suite and returns results as context for the next assistant turn using Claude.

Summary: This report analyzes the root causes of the unexpectedly high AI credit usage during a single DevLoop session, which ran for ~3 hours and consumed approximately 36.4 million tokens, leading to a ~$10 overage charge on top of the monthly Copilot plan.


Total Token Consumption

| Metric | Value |
| --- | --- |
| Input tokens | 35,952,200 |
| Output tokens | 451,307 |
| Total tokens | 36,403,507 (~36.4M) |
| User messages sent | 7 |
| Assistant events generated | 870 |
| Session duration | ~3 hours |
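As a quick sanity check on the table above, the following sketch recomputes the total and derives the average input size per assistant event. The variable names are mine; the figures are copied from the report.

```python
# Figures copied from the Total Token Consumption table.
input_tokens = 35_952_200
output_tokens = 451_307
assistant_events = 870

total_tokens = input_tokens + output_tokens
avg_input_per_event = input_tokens / assistant_events
input_share = input_tokens / total_tokens

print(f"total tokens:        {total_tokens:,}")
print(f"avg input per event: {avg_input_per_event:,.0f}")
print(f"input share:         {input_share:.1%}")
```

Input tokens make up almost the entire bill, which is consistent with the cost being driven by re-sent context rather than by generated output.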

Tool Call Breakdown

| Tool | Calls | Impact |
| --- | --- | --- |
| bash | 766 | Command output added to context each time |
| view | 233 | Large source files read into context |
| task (subagents) | 32 | Each returned 500–2,000 lines of output |
| read_agent | 31 | Full reviewer/architect output injected back |
| read_bash | 36 | Worker output read back into context |
| grep | 21 | File search results added to context |
| glob | 16 | File listings added to context |

Root Causes (Ranked by Impact)

🔴 1. Context Snowball — ~80% of tokens

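Copilot re-sends the entire accumulated conversation history on every assistant turn. With 870 events over 3 hours, the context kept growing, and because every turn pays for the whole history again, cumulative input tokens grew much faster than the conversation itself (roughly quadratically rather than linearly):

  • Turn 1: ~10K tokens in context
  • Turn 100: ~100K tokens in context
  • By turn 870: tens of millions of input tokens re-sent in total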

A single never-reset session running 10+ architect → implement → review → fix loops caused this.
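To illustrate why a never-reset session is so expensive, here is a minimal simulation of billing when every turn re-sends the full history. The function name and all numbers (900 turns, 10,000 starting tokens, 100 tokens added per turn) are hypothetical and chosen only to show the shape of the growth; they are not measurements from this session.

```python
def cumulative_input_tokens(turns: int, base_context: int, added_per_turn: int) -> int:
    """Total input tokens billed when every turn re-sends the whole history."""
    total = 0
    context = base_context
    for _ in range(turns):
        total += context           # the full history is sent as input this turn
        context += added_per_turn  # this turn's output and tool results are appended
    return total


# Hypothetical workload: one long session vs. the same work split into three sessions.
single_session = cumulative_input_tokens(turns=900, base_context=10_000, added_per_turn=100)
three_sessions = 3 * cumulative_input_tokens(turns=300, base_context=10_000, added_per_turn=100)

print(f"single session: {single_session:,}")
print(f"three sessions: {three_sessions:,}")
print(f"reduction:      {1 - three_sessions / single_session:.0%}")
```

Under these assumed numbers, splitting the same work into three sessions roughly halves the billed input, because each reset discards the history that would otherwise be re-sent on every later turn.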

🔴 2. Verbose Subagent Output Injected Back (32 task calls)

Each @devloop-architect and @devloop-reviewer subagent returned 500–2,000 lines of output, which was read back via read_agent (31 calls) and added permanently to the growing context.

Reviewer verdicts alone averaged ~200 lines each across 10+ reviews.
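One way to keep reviewer output small is to pass only the verdict line and a few bullet points back to the main context. The sketch below assumes a hypothetical report format in which the verdict line starts with "VERDICT:" and findings are lines starting with "- "; the real DevLoop reviewer format may differ, and `summarize_review` is an illustrative helper, not part of DevLoop.

```python
def summarize_review(report: str, max_bullets: int = 5) -> str:
    """Keep only the verdict line and the first few bullet points of a review."""
    lines = [line.strip() for line in report.splitlines()]
    # Assumed format: a single line beginning with "VERDICT:".
    verdict = next(
        (line for line in lines if line.upper().startswith("VERDICT:")),
        "VERDICT: (not found)",
    )
    # Assumed format: findings are written as "- " bullets.
    bullets = [line for line in lines if line.startswith("- ")][:max_bullets]
    return "\n".join([verdict, *bullets])


report = "Long preamble\nVERDICT: APPROVED 9/10\n- tighten lock scope\n- add a test\nMore detail"
print(summarize_review(report, max_bullets=2))
```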

🟡 3. 766 Bash Calls

Every devloop work, devloop fix, npm run build, git command, and teststop run returned output that was kept in the ever-growing context window.
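Build and test output is usually most informative near the end, so one possible mitigation is to keep only the tail of each command's output before it enters the context. This is a minimal sketch; `tail_output` and the 40-line default are assumptions of mine, not an existing Copilot or DevLoop feature.

```python
def tail_output(output: str, max_lines: int = 40) -> str:
    """Return at most the last max_lines lines, with a marker for what was dropped."""
    lines = output.splitlines()
    if len(lines) <= max_lines:
        return output
    omitted = len(lines) - max_lines
    marker = f"[... {omitted} earlier lines omitted ...]"
    return "\n".join([marker, *lines[-max_lines:]])


long_output = "\n".join(f"line {i}" for i in range(100))
print(tail_output(long_output, max_lines=3))
```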

🟡 4. Large Source Files Read 233 Times

Files like database.ts (3,000+ lines) and server.ts (1,300+ lines) were repeatedly read via view calls and added to context.
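Repeated reads of an unchanged file add the same content to the context again and again. A rough sketch of one countermeasure is to track a content hash per path and return a short placeholder when nothing has changed. `FileViewCache` is a hypothetical helper written for this post, not something the view tool provides.

```python
import hashlib
from pathlib import Path


class FileViewCache:
    """Return full file content only on first view or after the file changes."""

    def __init__(self) -> None:
        self._seen: dict[str, str] = {}  # path -> SHA-256 of the last content returned

    def view(self, path: str) -> str:
        text = Path(path).read_text(encoding="utf-8")
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if self._seen.get(path) == digest:
            # Content is already in the conversation; send a one-line note instead.
            return f"[{path} unchanged since last view; content omitted]"
        self._seen[path] = digest
        return text
```

The trade-off is that the placeholder only helps while the earlier copy is still in the context, so the cache would need to be cleared whenever the session is reset.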

🟡 5. Long Session Summary at Start

The session started with a ~300-line prior checkpoint summary injected into the system prompt, re-sent on every single one of the 870 turns.


Tasks Completed During This Session

| Task | Result | Approx. Loops |
| --- | --- | --- |
| Session rollback races | ✅ APPROVED 9/10 | 3 fix loops |
| EventTopic build fix | ✅ Direct commit | |
| Config trailing newline trim | ✅ Direct commit | |
| Slack dedup key fix | ✅ Direct commit | |
| Token approve-after-reject | ✅ Direct commit | |
| PID reuse race (TASK-20260604-004255) | ✅ APPROVED 9/10 | 1 fix loop |
| Hub stop inflight guard (TASK-20260604-004538) | 🔄 Fix in progress | 1+ fix loops |

Why It Cost ~$10 Extra

Claude Sonnet 4.6 pricing is approximately $3 per million input tokens.

At 35.9M input tokens:

  • Estimated gross cost: ~$108 token-equivalent (35.95M input tokens × $3 per million)
  • Covered by Copilot plan: the portion that fell within the monthly included quota
  • Overage charged: ~$10 USD (tokens beyond the plan limit, billed pay-as-you-go)
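The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch that uses only the approximate $3 per million input-token price quoted in this post and ignores output tokens and any caching discounts, so the real billing formula may differ.

```python
# Approximate price quoted in this post; treat it as an assumption, not a price list.
INPUT_PRICE_PER_MILLION_USD = 3.00

input_tokens = 35_952_200   # from the usage report
overage_usd = 10.00         # approximate overage actually charged

gross_usd = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION_USD
implied_covered_usd = gross_usd - overage_usd  # rough implication, not a billed figure

print(f"gross token-equivalent cost: ${gross_usd:.2f}")
print(f"implied plan coverage:       ${implied_covered_usd:.2f}")
```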

How to Prevent This in Future Sessions

| Action | Savings Estimate |
| --- | --- |
| Reset session every 2–3 tasks (/clear or new session) | 60–70% reduction |
| Truncate reviewer output: verdict + bullet points only, not 200-line reports | 15–20% reduction |
| Don't read_agent full output: check verdict line only, skip verbose body | 10% reduction |
| Run teststop earlier: catch failures before running multiple fix loops | Fewer loops = less context |
| Avoid reading large files repeatedly: cache content in session state | 5–10% reduction |

One-Line Root Cause

DevLoop ran 10+ architect → implement → review → fix loops in a single never-reset 3-hour session, pushing cumulative token usage to 36.4 million and exceeding the monthly Copilot plan quota. I am planning to work on these DevLoop improvements next.

Based on the usage above, there is also room for improvement in Copilot CLI itself, especially:

  • Alerts for unusually heavy token usage
  • Automatic context compaction for repetitive context
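Until something like this exists in the CLI, a heavy-usage alert could be approximated on the DevLoop side. The sketch below is a hypothetical budget tracker; the `TokenBudget` class, the 80% warning threshold, and the example limit are all assumptions for illustration, not a Copilot CLI API.

```python
class TokenBudget:
    """Track cumulative token usage and signal when a session should be reset."""

    def __init__(self, limit: int, warn_at: float = 0.8) -> None:
        self.limit = limit      # hypothetical per-session token budget
        self.warn_at = warn_at  # fraction of the limit that triggers a warning
        self.used = 0

    def record(self, input_tokens: int, output_tokens: int) -> str:
        """Add one turn's usage and return 'ok', 'warn', or 'stop'."""
        self.used += input_tokens + output_tokens
        if self.used >= self.limit:
            return "stop"
        if self.used >= self.limit * self.warn_at:
            return "warn"
        return "ok"


budget = TokenBudget(limit=5_000_000)
print(budget.record(input_tokens=41_000, output_tokens=500))
```

A pipeline could call `record` after each assistant turn and start a fresh session with a short summary once it returns "stop".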

