LLM10 Unbounded Consumption: Token DoS, API Cost Attacks and Model Extraction | Day 14


📰 Originally published on Securityelites — AI Red Team Education — the canonical, fully-updated version of this article.

Part of the AI/LLM Hacking Course — 90 Days

Day 14 of 90 · 15.5% complete

⚠️ Authorised Targets Only: LLM10 consumption testing — particularly token DoS and cost amplification — must only be performed against systems you have explicit written authorisation to test, and only to the extent necessary to demonstrate the vulnerability. Never run automated high-volume attacks against production systems even within scope — agree a controlled test window with the engagement contact first. SecurityElites.com accepts no liability for misuse.

A startup founder called me in a panic at eleven in the evening. Their OpenAI bill for the previous month was $47,000. Their budget was $3,000. Their product was a customer service AI for a SaaS platform: routine question-answering, usually fifty to one hundred words per response. They had launched two weeks earlier. Someone had discovered that asking the AI to “write a detailed, comprehensive, exhaustive guide to every topic” triggered a maximum-length completion. Automated. At high volume. For six days before anyone noticed. The application had no rate limiting, no cap on output tokens, no per-user budget, no monitoring, and no circuit breaker. Every request from the attacker consumed the maximum number of output tokens the API would generate.

LLM10 Unbounded Consumption covers three distinct attack classes: token-based DoS and cost amplification, context window flooding, and systematic model extraction. The startup’s situation was the simplest variant — no sophistication required, just knowledge of the asymmetry between request cost and response cost. Day 14 covers all three classes: where to find them, how to measure the impact quantitatively, what the financial calculation looks like in a finding, and the controls that prevent all three from being exploitable.

🎯 What You’ll Master in Day 14

Understand all three LLM10 attack classes and their distinct impact profiles
Calculate API cost amplification ratios for token DoS findings
Test for rate limiting gaps, maximum output token enforcement, and input size limits
Demonstrate context window flooding with quantified resource consumption impact
Probe the model extraction attack surface with systematic domain querying
Write complete LLM10 findings with financial impact calculations for the report

⏱️ Day 14 · 3 exercises · Think Like Hacker + Kali Terminal + Browser

### ✅ Prerequisites

- Day 3 (OWASP LLM Top 10): LLM10 in context; Day 3’s OWASP overview introduced the token consumption concept that Day 14 tests systematically
- Understanding of API token pricing: the cost calculation in Exercise 2 requires knowing the per-token cost for the target API
- Python with the openai library and time module: Exercise 2 runs rate limit and token consumption tests

### 📋 LLM10 Unbounded Consumption: Day 14 Contents

1. Three LLM10 Attack Classes
2. Token DoS and API Cost Amplification
3. Context Window Flooding
4. Systematic Model Extraction
5. Rate Limiting Gap Assessment
6. Financial Impact Reporting

In Day 13 you completed the content-vulnerability categories of the OWASP LLM Top 10: false outputs causing measurable harm. Day 14 closes the series with the resource-level attack class. Day 15 steps outside the OWASP framework to cover jailbreaking, a distinct but related technique that intersects with multiple OWASP categories and deserves dedicated treatment as both an attack surface and a defensive challenge.

Three LLM10 Attack Classes

Token DoS exploits a simple asymmetry: a short prompt costs almost nothing to send, but triggering a maximum-length response costs orders of magnitude more to generate. An attacker who can craft high-cost prompts and send them at volume can exhaust a shared token budget, degrade service for all users, or inflate the operator’s API costs to unsustainable levels. No technical sophistication required. Just knowledge of the pricing model and the absence of any output cap.
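The asymmetry can be put into numbers with a few lines of Python. The sketch below is illustrative only: the `Pricing` values are hypothetical placeholders rather than any vendor's real rates, and "amplification ratio" is defined here, as an assumption, as the cost of one attack request divided by the cost of one normal request with the same prompt length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical prices in USD per 1,000 tokens (not real vendor rates)."""
    input_per_1k: float
    output_per_1k: float


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request: prompt tokens plus generated tokens."""
    input_cost = (input_tokens / 1000) * pricing.input_per_1k
    output_cost = (output_tokens / 1000) * pricing.output_per_1k
    return input_cost + output_cost


def amplification_ratio(
    pricing: Pricing,
    input_tokens: int,
    normal_output_tokens: int,
    attack_output_tokens: int,
) -> float:
    """How many times more one attack request costs than one normal request."""
    normal = request_cost(pricing, input_tokens, normal_output_tokens)
    attack = request_cost(pricing, input_tokens, attack_output_tokens)
    return attack / normal


# Illustrative inputs: a 20-token prompt, a ~100-token normal answer,
# and a 4,000-token maximum-length completion. Replace with measured values.
pricing = Pricing(input_per_1k=0.01, output_per_1k=0.03)
ratio = amplification_ratio(
    pricing,
    input_tokens=20,
    normal_output_tokens=100,
    attack_output_tokens=4000,
)
print(f"One attack request costs {ratio:.1f}x a normal request")
```

In a real finding, the token counts should come from usage figures observed during the authorised test window, and the prices from the target API's published rate card.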

Context window flooding submits extremely large inputs — a pasted book chapter, a massive JSON blob, a long code file — to consume as much of the model’s context window as possible on each request. Large inputs cost more to process. An application that accepts arbitrarily large input without a size limit lets any user consume a disproportionate share of the computational budget, slowing responses for everyone else sharing the same infrastructure.
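One way to picture the missing control is an input-size gate that runs before the request reaches the model. The following is a minimal sketch under stated assumptions: the four-characters-per-token figure is a rough rule of thumb for English text, not an exact tokenizer count, and `MAX_INPUT_TOKENS` is a hypothetical budget chosen for illustration.

```python
import math
from typing import Tuple

CHARS_PER_TOKEN = 4      # rough heuristic (assumption); use the real tokenizer for exact counts
MAX_INPUT_TOKENS = 2000  # hypothetical per-request input budget


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def check_input_size(text: str, max_tokens: int = MAX_INPUT_TOKENS) -> Tuple[bool, int]:
    """Return (allowed, estimated_tokens) for a candidate user input."""
    estimated = estimate_tokens(text)
    return estimated <= max_tokens, estimated


# A normal support question passes; a pasted 40,000-character blob does not.
normal_ok, normal_tokens = check_input_size("How do I reset my password?")
flood_ok, flood_tokens = check_input_size("A" * 40_000)
print(normal_ok, normal_tokens)
print(flood_ok, flood_tokens)
```

When testing, the same estimate helps quantify the finding: if an application accepts the oversized input without rejection or truncation, record the largest accepted size and the estimated tokens it consumed.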

Model extraction is the systematic reconstruction of a fine-tuned model’s behaviour through querying. No weight theft. No training data access. Just thousands of crafted queries across the model’s specialised domain, with the responses recorded. Enough input-output pairs can be used to train a substitute model that approximates the target’s specialised behaviour — effectively stealing the commercial value of the fine-tuning investment without touching the model itself. The business case for this attack is obvious wherever a competitor wants what took the target team months to build.
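The probing pattern described above can be sketched as a topic-by-template grid with a hard query cap. Everything named here is hypothetical: `query_model` stands in for whatever authorised client wraps the in-scope target, `fake_model` is a stub so the sketch runs offline, and the topics and templates are placeholders. The cap exists so that a probe only demonstrates that the attack surface is reachable, without harvesting a usable training set.

```python
import itertools
from typing import Callable, Dict, List


def build_probe_prompts(topics: List[str], templates: List[str]) -> List[str]:
    """Cross every topic with every template to cover the domain systematically."""
    return [
        template.format(topic=topic)
        for topic, template in itertools.product(topics, templates)
    ]


def collect_pairs(
    query_model: Callable[[str], str],
    prompts: List[str],
    max_queries: int,
) -> List[Dict[str, str]]:
    """Record prompt/response pairs, stopping at the agreed query cap."""
    pairs = []
    for prompt in prompts[:max_queries]:
        pairs.append({"prompt": prompt, "response": query_model(prompt)})
    return pairs


def fake_model(prompt: str) -> str:
    """Offline stub standing in for the authorised target client (assumption)."""
    return f"stub answer to: {prompt}"


topics = ["refund policy", "invoice disputes", "plan upgrades"]
templates = ["Explain {topic} step by step.", "What are the edge cases in {topic}?"]

prompts = build_probe_prompts(topics, templates)
pairs = collect_pairs(fake_model, prompts, max_queries=4)
print(f"{len(prompts)} prompts generated, {len(pairs)} pairs recorded")
```

If the target answers every probe in the grid at full detail with no throttling, that observation, together with the query count, is the evidence for the extraction finding.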

🧠 EXERCISE 1 — THINK LIKE A HACKER (20 MIN · NO TOOLS)
Calculate the Real Financial Impact of an LLM10 Cost Attack

⏱️ 20 minutes · No tools needed

LLM10 cost attack findings require quantified financial impact to move from Low to High severity. This exercise calculates the real-world financial impact of a token cost attack against a production AI application — the calculation that goes in the finding’s Business Impact section.
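As a rough guide to the shape of that calculation, the sketch below projects attack cost over time and compares it with a monthly budget. All inputs are hypothetical except the 3,000 USD budget, which echoes the opening anecdote; the request rate, token count, and price are placeholders, and input-token cost is ignored for simplicity.

```python
def projected_attack_cost(
    requests_per_minute: float,
    output_tokens_per_request: int,
    output_price_per_1k: float,
    duration_hours: float,
) -> float:
    """Projected output-token cost in USD for a sustained attack (input cost ignored)."""
    total_requests = requests_per_minute * 60 * duration_hours
    cost_per_request = (output_tokens_per_request / 1000) * output_price_per_1k
    return total_requests * cost_per_request


def budget_overrun(attack_cost: float, monthly_budget: float) -> float:
    """Attack cost expressed as a multiple of the monthly budget."""
    return attack_cost / monthly_budget


# Hypothetical scenario: 60 requests per minute, 4,000 output tokens each,
# 0.03 USD per 1,000 output tokens, sustained for 24 hours.
daily_cost = projected_attack_cost(
    requests_per_minute=60,
    output_tokens_per_request=4000,
    output_price_per_1k=0.03,
    duration_hours=24,
)
overrun = budget_overrun(daily_cost, monthly_budget=3000)
print(f"Projected 24h cost: ${daily_cost:,.2f} ({overrun:.2f}x the monthly budget)")
```

A Business Impact section would state each input explicitly (measured request rate, observed tokens per response, published price) so the reader can reproduce the figure.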
