DEV Community

Bala Paranj

Predict, Don't Enumerate — But What About the Questions that Have Answers?

Michael Roytman's "Predict, Don't Enumerate" makes a well-argued case that the security industry's approach to vulnerability management is broken. Static severity scoring (CVSS) can't survive the volume of findings that AI-powered discovery is producing. EPSS — a predictive model that estimates the probability a vulnerability will be exploited in the next 30 days — is a better signal for triage. Anthropic endorsing it publicly matters because it makes a private consensus visible. The policy recommendations (rewrite SLAs by exploitation probability, change what the board sees, invest in telemetry feedback loops) are sound operational advice.

All of that is correct. If you're running a reactive vulnerability management program and triaging a backlog, EPSS is a better tool than CVSS for deciding what to fix first. You should use it.

The issue is what the article frames as the end state and what that framing makes invisible.

Knowing is not better predicting

The article borrows a distinction from Dan Geer and Dave Aitel: a "pointing machine" enumerates findings without understanding context; a "knowing machine" understands how code behaves in a particular environment and recognizes what turns a hazard into a risk. It then maps "knowing" onto prediction — EPSS as the clearest example of a knowing machine.

But prediction is not knowing. A prediction is a better-grounded guess. EPSS returns a probability — a statement about what attackers are likely to do across the internet in the next 30 days. It's useful. It's a better signal than a severity score. It is still a guess. A sophisticated, data-driven, continuously-updated guess. But a guess about attacker behavior, not knowledge of your system's state.

A knowing machine, in any honest reading of Geer and Aitel's distinction, would know whether a path exists from the internet to your customer database through your actual configuration. That's not a prediction. It's a deducible fact about a configuration graph. It's either there or it isn't, and determining which doesn't require a model trained on global exploitation patterns — it requires traversing the graph of your own infrastructure.

The article redefines "knowing" as "better predicting" and presents the evolution as complete: from pointing (enumerating by severity) to knowing (predicting by exploitation probability). But there's a third category the article never considers — verifying — and it answers a different kind of question entirely.

Two kinds of questions, two kinds of answers

The security questions a team faces aren't all the same kind of question, and they don't all have the same kind of answer.

Genuinely uncertain questions: Will this vulnerability be exploited in the next 30 days? Is this anomalous login a credential theft? Is this traffic pattern an attack? These depend on attacker behavior, which is unknowable in advance. The honest answer is a probability. Prediction is the right tool. EPSS belongs here.

Deducible questions: Does our configuration violate our own rules? Does a path exist from an unauthenticated entry point to sensitive data through our actual trust relationships? Does this IAM policy grant a combination of permissions that creates a privilege-escalation vector? These are fully determined by the configuration itself. The answer isn't a probability — it's a verdict. The path either exists or it doesn't. The policy either violates the rule or it doesn't. Determining which is a computation, not a forecast.
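The IAM question above can be sketched as a plain set computation. This is a minimal illustration under loud assumptions, not a real policy evaluator: it assumes the policy has already been flattened into a set of allowed action strings (ignoring wildcards, conditions, resource scoping, and explicit denies), and the entries in `RISKY_COMBINATIONS` are examples of rules a team might declare, not an authoritative catalog. The function names are hypothetical.

```python
from typing import FrozenSet, Iterable, List

# Illustrative rule set (an assumption for this sketch): each entry is a
# combination of actions the team has declared must never be granted together.
RISKY_COMBINATIONS: List[FrozenSet[str]] = [
    frozenset({"iam:PassRole", "lambda:CreateFunction", "lambda:InvokeFunction"}),
    frozenset({"iam:CreatePolicyVersion"}),
]


def violated_combinations(allowed_actions: Iterable[str]) -> List[FrozenSet[str]]:
    """Return every declared combination fully contained in the allowed actions."""
    allowed = set(allowed_actions)
    return [combo for combo in RISKY_COMBINATIONS if combo <= allowed]


def verdict(allowed_actions: Iterable[str]) -> str:
    """A verdict, not a probability: the policy violates a rule or it doesn't."""
    return "VIOLATION" if violated_combinations(allowed_actions) else "PASS"
```

Nothing here is trained or estimated. The same set of actions yields the same verdict every time, and the returned combinations say exactly which rule was broken.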

The article's framework sees only the first kind. Its entire pipeline — find vulnerabilities, predict which ones matter, fix those — is built for uncertain questions and gives useful answers to them. But it routes deducible questions through the same prediction pipeline, and that's where certainty gets thrown away.

A compound misconfiguration that creates an unauthenticated path to sensitive data is wrong regardless of what EPSS says about it. It's wrong by construction, provably, from the configuration alone. EPSS scores CVEs, so a misconfiguration with no CVE attached gets no score at all, and a CVE that sits on that path scores low when the model sees little sign of exploitation activity. Either way, the prediction pipeline says it's low priority. The configuration says it's a breach waiting to happen. The prediction is correct about attacker behavior and silent about system state, because system state isn't what prediction models measure.

This is the gap. Not a flaw in EPSS. A category error in routing every security question through a prediction framework, including the questions that have definite answers.

The reactive assumption

The article's entire flow is reactive: scan → find → prioritize → remediate. The improvement it offers is better prioritization — react to the right things, in the right order, based on exploitation probability rather than severity scores. That's a real improvement to a reactive pipeline.

But it's still a reactive pipeline. The vulnerabilities exist in production. The findings pile up. The team triages. The question is always "which of these existing problems should we fix first?" — never "does our configuration satisfy the rules we declared before these problems existed?"

The proactive alternative is not considered: verify configuration against declared invariants before deployment, so violations never reach production to become findings in the first place. The article improves how fast and accurately you react without ever asking whether the reaction loop is the right architecture.

For the genuinely uncertain questions, a reactive loop is the only option. You can't proactively prevent an exploitation you can't predict. But for the deducible questions, the reactive loop is a choice, not a necessity. You can verify, before deployment, that your configuration doesn't create the compound paths, the permission escalations, the policy violations. You can catch them when they're configuration errors, not when they're findings in a backlog.

A team that verifies deducibly-wrong configurations before deployment and uses EPSS to prioritize the uncertain remainder has a smaller backlog, a higher signal-to-noise ratio on the findings that remain, and fewer tokens spent on remediating things that should never have existed. A team that routes everything through prediction treats the deducible and the uncertain identically, and pays for it in volume, triage time, and remediation churn.
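One way to picture that separation is a triage step that pulls out the verdicts first and ranks only what is left. The sketch below is hypothetical throughout: the `Finding` fields, the `triage` function, and the identifiers in the usage example are made up for illustration, and the `epss` values are placeholders rather than real scores.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Finding:
    identifier: str               # made-up label for this sketch
    violates_invariant: bool      # verdict from deterministic verification
    epss: Optional[float] = None  # exploitation probability, if one exists


def triage(findings: List[Finding]) -> Tuple[List[Finding], List[Finding]]:
    """Split verdicts from forecasts; rank only the uncertain remainder."""
    violations = [f for f in findings if f.violates_invariant]
    uncertain = [f for f in findings if not f.violates_invariant]
    # Findings with no score sort last rather than being treated as zero risk.
    ranked = sorted(uncertain, key=lambda f: (f.epss is None, -(f.epss or 0.0)))
    return violations, ranked


findings = [
    Finding("open-path-to-db", violates_invariant=True, epss=0.01),
    Finding("cve-a", violates_invariant=False, epss=0.02),
    Finding("cve-b", violates_invariant=False, epss=0.90),
    Finding("cve-c", violates_invariant=False),
]
violations, ranked = triage(findings)
```

In this example the invariant violation is pulled out even though its placeholder probability is the lowest in the list, and the prediction signal is spent only on the three findings that actually need it.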

Local context is deducible, not predictable

The article correctly identifies that global prediction isn't enough — you need local context: asset inventory, topology, reachability, deployed controls. Then it proposes training a local model on that context to produce enterprise-specific probabilities.

But most of what it lists as "local context" isn't uncertain. It's your own configuration. Asset inventory is a fact. Topology is a fact. Reachability through your trust relationships is a graph computation. Whether a control is deployed is a boolean. These aren't inputs to a prediction model — they're inputs to a deterministic verification. You don't need a model to tell you whether a path exists from the internet to your database through your network configuration. You need a graph traversal. The answer is a fact, not a forecast.
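The graph traversal mentioned above can be as small as a breadth-first search. This is a sketch under stated assumptions: the configuration has already been reduced to a directed graph in which an edge from A to B means "A can reach B" through some network rule or trust relationship, and every node name here is invented. Building that graph from real infrastructure is the hard part and is not shown.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical configuration graph: an edge A -> B means "A can reach B".
Graph = Dict[str, List[str]]


def find_path(graph: Graph, source: str, target: str) -> Optional[List[str]]:
    """Breadth-first search; returns one shortest path, or None if none exists."""
    if source == target:
        return [source]
    parents: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in parents:
                continue  # already visited
            parents[neighbor] = node
            if neighbor == target:
                # Walk parent links back to the source to rebuild the path.
                path = [neighbor]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


config = {
    "internet": ["load_balancer"],
    "load_balancer": ["web_app"],
    "web_app": ["internal_api"],
    "internal_api": ["customer_db"],
    "batch_worker": ["customer_db"],
}
path = find_path(config, "internet", "customer_db")
```

The result is either a concrete path that can be shown to an engineer or `None`. There is no score to calibrate, and removing the offending edge changes the answer deterministically.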

Training a model on your own configuration to predict your own risk is routing a deducible question through a prediction framework. It produces a probability where you could have had a verdict. The probability might be correct. But it's less than what was available — and less than what an auditor, a regulator, or a board should accept when the definitive answer was computable.

The article says "a scanner can't tell apart" two organizations with the same CVE but different exposure. That's true of a scanner that checks one setting at a time. It's not true of a tool that evaluates the full configuration graph — reachability, trust relationships, permission chains — deterministically. The distinction isn't between global prediction and local prediction. It's between prediction (however localized) and verification (however comprehensive).

The volume argument proves too much

The article's strongest argument is volume: AI-driven discovery is producing orders of magnitude more findings, the count will grow, and human-scale triage can't keep up. Therefore: predict, prioritize, and fix what matters.

This is correct for the uncertain portion of the backlog. But the volume argument also proves the case for verification, not just prediction — and the article doesn't notice.

If the volume of findings is growing exponentially, then the cost of triaging, prioritizing, and remediating them is also growing exponentially. Every finding in the backlog costs triage time, prediction-model compute, and remediation effort. A finding that was deducibly preventable — a configuration violation that verification would have caught before deployment — is a finding that should never have entered the backlog. Every such finding that does enter the backlog is triage time, prediction compute, and remediation effort spent on something that had a definitive answer and didn't need a probability.

Verification reduces the input to the prediction pipeline. Fewer deducibly-wrong configurations reach production → fewer findings enter the backlog → the prediction model's job gets smaller and its signal-to-noise ratio improves. Prediction and verification aren't alternatives. Verification makes prediction tractable at the volumes the article is worried about.

The article frames the future as "more findings, better prediction." The sustainable version is "fewer preventable findings (because verification caught them) and better prediction for the remainder (because the backlog is smaller and cleaner)." The first version scales cost linearly with findings. The second bends the curve.

The Vassilev connection the article should have made

The article cites Jonathan Spring's work tying vulnerability enumeration to the halting problem — for any sufficiently complex system, there are always more undiscovered flaws. That's correct research, correctly cited. But the article draws a narrow conclusion: since you can't enumerate all flaws, predict which ones matter.

Vassilev's NIST proof (June 9, 2026), extending Gödel's incompleteness theorems to AI systems, says prediction via finite rules is also incomplete. No finite model — no matter how well-trained, how localized, how continuously updated — catches everything. The prediction model has a ceiling for the same mathematical reason the enumeration model has a ceiling: both are finite systems operating over unbounded spaces.

The conclusion that neither Spring's nor Vassilev's proofs support is "so we need a better prediction model." The conclusion they do support is this: since neither enumeration nor prediction can be complete, the system must assume its own incompleteness and compensate through continuous verification against declared properties, which is exactly Vassilev's prescribed "continuous-monitor-and-update" model.

Verification doesn't escape Gödel either — no finite set of invariants catches every misconfiguration. But each invariant it does check, it checks definitively. And the catalog grows. That's the architecture Vassilev prescribes: not universal robustness (formally unattainable) but continuous expansion that raises the cost of finding the next gap for the attackers. Vassilev's prescribed economic equilibrium is explicitly about making the cost of finding new exploits exceed the attacker's resources.

The direction that leads somewhere better

The article's recommendations are good operational advice for the reactive model. Here's what they look like when you add verification:

Rewrite the SLA — yes, by exploitation probability for the uncertain findings. But also add a verification tier: deducibly-wrong configurations that violate declared invariants get a different SLA entirely, because they don't need prediction — they have verdicts. A compound misconfiguration that creates a path to sensitive data is not "priority based on exploitation probability." It's a configuration violation, deterministically identified, with a known remediation.

Change what the board sees — yes, exploitability-weighted exposure for the prediction-managed portion. But also show verified posture: how much of our configuration is within declared invariants, where are the gaps, what percentage of the deducible risk surface is covered? That metric is deterministic, reproducible, and auditable — which is what boards and regulators want.

Invest in telemetry — yes, the feedback loop between prioritization and exploitation is essential for the prediction model. But also invest in the specification layer: the declared invariants that define what the configuration must satisfy. A prediction model without specifications triages findings. A verification engine with specifications prevents them.

Fix the compliance conversation — yes, move from severity to probability for the uncertain findings. But also offer the auditor something no probabilistic model can: a deterministic verdict that a specific configuration satisfies a specific rule, reproducible on demand, same input same output. Auditors don't want probabilities for questions that have answers. They want the answer.
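The verified-posture metric for the board and the reproducible verdict for the auditor can come from the same small computation. The sketch below is illustrative only: the invariant names, the flat configuration keys, and the `posture_report` function are all assumptions, and it reports how much of the configuration is within the declared invariants, not how much of the deducible risk surface those invariants cover.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, object]
Invariant = Tuple[str, Callable[[Config], bool]]

# Hypothetical invariants over a hypothetical flat config; keys are made up.
INVARIANTS: List[Invariant] = [
    ("storage is encrypted at rest", lambda c: c.get("storage_encrypted") is True),
    ("database is not publicly reachable", lambda c: c.get("db_public") is False),
    ("admin access requires MFA", lambda c: c.get("admin_mfa") is True),
]


def posture_report(config: Config) -> Dict[str, object]:
    """Evaluate every declared invariant and summarize the verdicts."""
    results = {name: check(config) for name, check in INVARIANTS}
    passed = sum(results.values())
    return {
        "results": results,
        "failed": sorted(name for name, ok in results.items() if not ok),
        "percent_within_invariants": round(100 * passed / len(results), 1),
    }


report = posture_report(
    {"storage_encrypted": True, "db_public": True, "admin_mfa": True}
)
```

Because each check is a pure function of the configuration, rerunning the report on the same input gives the same output, which is the property an auditor can rely on. A missing key counts as a failure here by design, so an unknown state is never reported as compliant.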

The future isn't "predict, don't enumerate." It's verify what's deducible, predict what's uncertain, and know the difference. Prediction is the right tool for attacker behavior, which is unknowable in advance. Verification is the right tool for configuration posture, which is fully determined by your own infrastructure. Routing deducible questions through a prediction framework doesn't make the answer better. It makes the answer worse — a probability where a verdict was available. The teams that separate the two will have smaller backlogs, clear signal, and a posture they can prove rather than estimate.

For visuals that make the concepts clear: https://gist.github.com/sufield/a55bb7d2ace9b267f8972eb2260c6446


References: Roytman, "Predict, Don't Enumerate" (O'Reilly Radar, June 4, 2026); Vassilev, NIST/IEEE Security and Privacy (June 9, 2026); Geer and Aitel, "Pointing Machines vs Knowing Machines" (Lawfare, 2025); Spring, vulnerability enumeration and the halting problem (CMU SEI). If you think prediction covers the deducible questions adequately — that a probability is sufficient where a verdict is available — that's the specific disagreement worth having.
