XOOMAR

Posted on Jun 30 • Originally published at xoomar.com

AI Token Costs Threaten to Break Cybersecurity Budgets

#aitokencosts #cybersecurity #agenticai #soc

Recent industry discussion around agentic AI in cybersecurity points to an uncomfortable tradeoff: the technology can be useful, even impressive, but the token costs behind source-code analysis, alert triage, and autonomous investigation can grow too high to run at full depth during a live security incident.

The argument, laid out by Danelle Au in SecurityWeek, should worry buyers and vendors alike. Agentic AI can accelerate detection and response, but if security teams treat usage pricing as an afterthought, they may build SOC workflows they can’t afford to run when the pressure spikes.

Agentic AI Will Fail in Cybersecurity If AI Token Costs Stay Hidden

The cybersecurity industry is selling speed: faster detection, autonomous investigation, agentic response. Fine. But speed with a hidden meter is a budget trap.

SecurityWeek frames the risk around a familiar SOC pressure point: a high-severity investigation that pushes AI agents to map timelines, check logs, scan threat intelligence, and hunt for lateral movement across multiple systems. The concern is not a documented outage message; it is that consumption limits or cost controls can appear mid-investigation.

The real failure mode may be quiet: API timeouts, degraded model quality, or workflows that simply stop firing on lower-priority alerts.

That is worse. A loud failure gets escalated. A silent one creates blind spots.

Every AI Security Investigation Now Has a Meter Running

Traditional machine learning in security does not consume tokens. It works through statistical models and behavioral baselines, with costs measured in compute rather than model input and output.

Generative AI changes the cost profile, but in a bounded way. A human asks for an incident summary or translation layer, the model responds, then waits. Usage is tied to human pacing.

Agentic AI removes that pacing. Give it a goal, such as “determine if this server is compromised,” and it can run a multi-step loop: call APIs, parse logs, evaluate payloads, write follow-up queries, and feed that context back into the model again and again.

Security AI layer	Typical role	Cost behavior
Machine learning	Continuous detection and baselines	No token cost
Generative AI	Summaries, explanations, analyst assistance	Bounded by human prompts
Agentic AI	Autonomous investigation and response loops	Token use can grow fast

That last row is where AI token costs become a security design problem. Better reasoning often requires more context. More context means more input tokens. More autonomous follow-up means more output tokens. Accuracy and cost start pulling against each other.
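
To make the compounding concrete, here is a minimal sketch of how input tokens accumulate when an agent re-sends its full context at every step. The function name and the numbers are hypothetical, and the model assumes no prompt caching or context trimming, so treat it as an illustration of the shape rather than a billing formula.

```python
def cumulative_input_tokens(steps: int, base_context: int, tokens_added_per_step: int) -> int:
    """Total input tokens billed if the whole context is re-sent on every step.

    Assumption: each step appends a fixed amount of new material (tool output,
    log excerpts, model reasoning) and nothing is ever dropped or cached.
    """
    total = 0
    context = base_context
    for _ in range(steps):
        total += context                   # the full context is billed as input again
        context += tokens_added_per_step   # new findings are appended for the next step
    return total


# Hypothetical investigation: 2,000-token starting prompt, 1,500 new tokens per step.
for steps in (1, 10, 50):
    print(steps, cumulative_input_tokens(steps, base_context=2_000, tokens_added_per_step=1_500))
```

With these made-up inputs, 10 steps bill 87,500 input tokens and 50 steps bill roughly 1.9 million. Context grows linearly, but the bill grows roughly with the square of the step count. Provider-side caching or context trimming would lower the totals.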

Security Teams Can’t Budget for AI With Per-Seat Thinking

Enterprise security software has long leaned on predictable metrics: seats, endpoints, devices, or subscriptions. Frontier AI pricing works differently.

SecurityWeek notes that model providers charge per token, a unit that works out to roughly three-quarters of a word. Claude Sonnet 4.6 costs $3.00 per million input tokens and $15.00 per million output tokens. GPT-5.5 runs $5.00 per million input tokens and $30.00 per million output tokens.

Those numbers sound small until the workflow starts looping. SecurityWeek gives the rough shape:

Alert triage: about 1,000 tokens
Guided investigation: about 20,000-50,000 tokens per incident
Fully autonomous agentic loop: potentially millions of tokens in minutes
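
A rough worked example shows how those sizes translate into dollars at the prices quoted above. Two inputs here are assumptions, not figures from SecurityWeek: an 80/20 split between input and output tokens, and 2,000,000 tokens as a stand-in for "millions of tokens."

```python
# Per-million-token prices (input, output) as quoted in the article.
PRICES_PER_MILLION = {
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.5": (5.00, 30.00),
}

# The first two sizes follow the article's rough figures; the third is an
# illustrative assumption standing in for "millions of tokens".
WORKFLOW_TOKENS = {
    "alert triage": 1_000,
    "guided investigation (high end)": 50_000,
    "autonomous loop (illustrative)": 2_000_000,
}

INPUT_SHARE = 0.8  # assumption: 80% of tokens are input, 20% are output


def cost_usd(total_tokens: int, input_price: float, output_price: float,
             input_share: float = INPUT_SHARE) -> float:
    """Dollar cost of a workflow given per-million-token prices."""
    input_tokens = round(total_tokens * input_share)
    output_tokens = total_tokens - input_tokens
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for model, (input_price, output_price) in PRICES_PER_MILLION.items():
    for workflow, tokens in WORKFLOW_TOKENS.items():
        print(f"{model:18} {workflow:32} ${cost_usd(tokens, input_price, output_price):,.4f}")
```

Under these assumptions, one 2-million-token loop costs as much as 2,000 triage calls on the same model, because cost scales linearly with tokens at a fixed split.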

The CISO budgeting mistake is treating this like another user license. It behaves more like cloud infrastructure spending, except the demand trigger is an incident queue, not a planned workload.

A small team with heavy telemetry and messy escalations can burn more inference than a larger team with cleaner data flows. That is XOOMAR’s core read: token budgets must map to investigations, alerts, log volume, escalation patterns, and response depth, not just headcount.

Cheap AI Triage Can Become Expensive AI Noise

SecurityWeek says LLM API prices dropped roughly 80% between early 2025 and early 2026. That is good news, but not enough.

Lower unit prices can encourage teams to push more alerts into AI workflows. That creates a dangerous illusion: if automation handles the work, the work must be cheap. It isn’t.

False positives do not become harmless because an agent reads them. They still consume tokens. They still trigger API calls. They still create summaries, queries, and follow-up reasoning. If escalated, they still land back on an analyst’s desk.

This is where security teams need discipline before they add agents everywhere.

Data hygiene: Don’t feed garbage into expensive reasoning loops.
Alert tuning: Don’t let low-value alerts become high-cost investigations.
Workflow design: Reserve agentic depth for cases that justify the spend.
Telemetry scoping: Give the model enough context, not every scrap of text by default.
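
One way to picture that discipline is a routing rule that decides how much AI depth an alert earns before any tokens are spent. The sketch below is hypothetical: the `Alert` fields, the 1-to-5 severity scale, and the thresholds are all assumptions chosen for illustration, and a real SOC would tune them against its own data.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    severity: int            # hypothetical scale: 1 (low) to 5 (critical)
    asset_critical: bool     # does the alert touch a high-value asset?
    detector_fp_rate: float  # historical false-positive rate of the rule, 0.0 to 1.0


def choose_depth(alert: Alert) -> str:
    """Pick the cheapest handling path that still fits the alert's risk."""
    # Noisy, low-severity rules should be fixed at the source, not read by an agent.
    if alert.detector_fp_rate >= 0.95 and alert.severity <= 2:
        return "tune_rule"
    # Reserve multi-step agentic loops for cases that justify the spend.
    if alert.severity >= 4 or (alert.severity == 3 and alert.asset_critical):
        return "agentic_investigation"
    # Everything else gets one bounded model call.
    return "single_pass_triage"


print(choose_depth(Alert(severity=1, asset_critical=False, detector_fp_rate=0.98)))
print(choose_depth(Alert(severity=3, asset_critical=True, detector_fp_rate=0.20)))
print(choose_depth(Alert(severity=3, asset_critical=False, detector_fp_rate=0.20)))
```

The point of the design is ordering: the cheap checks run first, so an expensive loop only starts after an alert has cleared them.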

The vendor pitch says agentic AI can move at machine speed. True. So can the bill.

Deployment Choices Will Decide Whether AI Security Scales or Stalls

SecurityWeek draws a sharp line between cloud-based AI and on-premises architectures.

Cloud-based security AI can pass volatile model costs directly to customers. Every reasoning loop, API call, and multi-agent step runs on outside infrastructure at outside pricing. That may work for bounded assistant features. It gets harder when agentic AI runs continuously.

On-premises architectures, according to SecurityWeek, address this with fixed local compute that can execute complex reasoning loops without a token meter running in the background. Au’s argument is blunt: for organizations that need agentic AI “continuously at full depth,” on-premises is the only architecture that makes the economics work.

The strongest caveat is that the source material supports the economic distinction, not a universal architecture verdict. Latency, compliance, maintenance, and model performance require vendor-specific review. Buyers should demand that review before pilots become production dependencies.

Retrofitting cost controls after a SOC starts relying on agents is painful. By then, every limit looks like a security compromise.

The Case for Paying More When AI Stops Real Attacks

Here is the counterargument, and it deserves respect: if agentic AI catches intrusions faster, helps analysts investigate complex incidents, and reduces time wasted on manual correlation, higher token bills may be justified.

In high-value security work, expensive AI usage can still make sense if the output prevents material damage. A costly investigation looks reckless only if it produces no value. If the findings help stop a breach, expose serious weaknesses, or reduce response time, the math changes.

The same logic applies inside the SOC. A single contained intrusion can justify a lot of spend. Security budgets are not supposed to be cheap. They are supposed to reduce risk.

But strategic spending and unchecked spending are different animals. A model that delivers value should be funded. A workflow that burns tokens without tying costs to true positives, contained incidents, or analyst time saved should be cut back.

The issue is not that AI costs money. The issue is whether CISOs can prove what the money bought.
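
Proving what the money bought comes down to dividing spend by outcomes. Here is a minimal sketch of that arithmetic. The `WorkflowPeriod` record and every number in the example are hypothetical, and how a team counts a "true positive" or an "hour saved" is its own judgment call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkflowPeriod:
    spend_usd: float            # AI spend attributed to this workflow in the period
    investigations: int
    true_positives: int
    contained_incidents: int
    analyst_hours_saved: float


def per_unit(spend: float, count: float) -> Optional[float]:
    """Cost per outcome, or None when there were no outcomes to divide by."""
    return spend / count if count else None


def cost_metrics(p: WorkflowPeriod) -> dict:
    return {
        "cost_per_investigation": per_unit(p.spend_usd, p.investigations),
        "cost_per_true_positive": per_unit(p.spend_usd, p.true_positives),
        "cost_per_contained_incident": per_unit(p.spend_usd, p.contained_incidents),
        "cost_per_analyst_hour_saved": per_unit(p.spend_usd, p.analyst_hours_saved),
    }


# Hypothetical month for one workflow.
month = WorkflowPeriod(spend_usd=1200.0, investigations=400, true_positives=24,
                       contained_incidents=3, analyst_hours_saved=150.0)
print(cost_metrics(month))
```

Returning `None` instead of zero for a workflow with no true positives is deliberate: spend with no outcomes is the case that most needs a human look, and a zero would hide it.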

CISOs Need Token Budgets, Kill Switches, and Cost-Aware Detection Metrics

SecurityWeek warns that vendors are likely to package consumption into AI credits or “operations.” That may smooth contract language, but it can also obscure the real meter. A credit bundle is still consumption pricing with a softer name.

CISOs should push back now. At minimum, every AI security platform should expose:

Token budgets by workflow: Triage, investigation, response, reporting, and threat hunting should not share one opaque pool.
Spending alerts: Security leaders need warnings before agentic loops hit critical limits.
Rate limits and kill switches: Expensive autonomous tasks should be stoppable without disabling the whole platform.
Model selection rules: Not every task needs the most expensive frontier model.
Approval paths: Deep autonomous investigation should have escalation logic when costs surge.
Outcome metrics: Track cost per investigation, cost per true positive, cost per contained incident, and cost per analyst hour saved.
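
As a sketch of what the first three controls could look like together, the example below keeps a separate token budget per workflow, raises an alert at a threshold, and halts only the workflow that overruns. The class names, the 80% alert threshold, and the budget sizes are assumptions made for illustration, not any vendor's API.

```python
from dataclasses import dataclass


@dataclass
class WorkflowBudget:
    limit_tokens: int
    alert_fraction: float = 0.8  # assumption: warn at 80% of the budget
    used_tokens: int = 0
    halted: bool = False


class TokenBudgets:
    """Per-workflow token accounting with spending alerts and a scoped kill switch."""

    def __init__(self, budgets: dict):
        self.budgets = budgets

    def record(self, workflow: str, tokens: int) -> str:
        budget = self.budgets[workflow]
        if budget.halted:
            return "halted"              # already stopped; nothing more is spent
        budget.used_tokens += tokens
        if budget.used_tokens >= budget.limit_tokens:
            budget.halted = True         # stop this workflow only, not the platform
            return "halted"
        if budget.used_tokens >= budget.alert_fraction * budget.limit_tokens:
            return "alert"               # warn before the limit is reached
        return "ok"


budgets = TokenBudgets({
    "triage": WorkflowBudget(limit_tokens=10_000),
    "investigation": WorkflowBudget(limit_tokens=100_000),
})
print(budgets.record("investigation", 50_000))
print(budgets.record("investigation", 35_000))
print(budgets.record("investigation", 20_000))
print(budgets.record("triage", 1_000))
```

In this run the investigation workflow moves from ok to alert to halted, while triage keeps working. A production version would likely route a halt into an approval path rather than a hard stop.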

Vendors should stop hiding behind vague credit systems. Show token accounting. Show pricing scenarios. Show how the platform behaves when limits are reached. Say whether it downgrades models, stalls workflows, skips lower-priority alerts, or asks for approval.

The next buying cycle for AI security should not start with a demo. It should start with a stress test: a burst of alerts, messy telemetry, and a demand for full accounting.

In cybersecurity, the most dangerous AI tool may be the one your team can’t afford to keep running during the attack that matters. CISOs should make token economics part of incident readiness now, before the meter becomes the weakest control in the room.

Impact Analysis

Hidden token costs can make AI-driven security workflows unaffordable during high-pressure incidents.
Silent failures such as degraded model quality or skipped low-priority alerts can create dangerous blind spots.
Security teams need to evaluate AI pricing and consumption limits before relying on agentic systems in live response.
