The Knowing Machine

Vulnerabilities are dense. For a long time this was not known, and the not-knowing caused a lot of strife, though the busy practitioners remediating millions of vulnerabilities on the front lines probably had less time for the philosophical questions of security than we like to think. They had a lot of work to do.

They are dense in the way Jonathan Spring proved from first principles at Carnegie Mellon, connecting vulnerability enumeration to the halting problem. Dense in the way frontier AI models have now confirmed empirically, surfacing thousands of novel, exploitable zero-days across the most audited codebases on the planet.

Dan Geer posed the right question at Black Hat in 2014: are vulnerabilities sparse or dense? If sparse, you can patch your way to safety. If dense, patching without prioritization is the myth of Sisyphus, with a boulder that keeps growing. Eleven years later, writing with Dave Aitel in Lawfare, he conceded the terms of his own question and moved on to the next problem. What the industry needs, he argued, is a "knowing machine" that converts hazards into risks, not a "pointing machine" that merely enumerates flaws and screams equally at all of them. They write:

“Even children learn early to point and name—but knowing the word “dog” doesn’t reveal whether the animal might bite. In cybersecurity, we’ve built systems that similarly point and name vulnerabilities without understanding whether they’re truly dangerous. By embracing AI solely for pattern recognition, we’ve created a powerful “pointing machine” that identifies possible threats but does not comprehend their actual impact. What we need instead is a “knowing machine,” capable of understanding how code functions within complex, real-world environments, recognizing not just hazards but the full context of how and whether those hazards might become genuine risks.”

A recent post from Anthropic’s security team on preparing for AI-accelerated offense recommended EPSS by name as the way to turn thousands of open CVEs into a manageable queue. It told defenders to plan for an order-of-magnitude increase in finding volume.

We built EPSS and maintain it. It is free, it covers every published CVE, and more than 120 vendors embed it today. Version 5 ships soon with expanded exploitation telemetry and improved calibration. It remains the best global baseline available for predicting which vulnerabilities will see exploitation activity in the next 30 days.

Yet EPSS is a global model. It knows what attackers are doing across the internet; it does not know what is running in a given environment, what is reachable, or what controls are in place. Closing that gap takes two further layers. The Empirical Global Model adds near-real-time internet exploitation telemetry across more than 17,700 exploited CVEs, roughly ten times the coverage of the CISA KEV catalog. The Local Model goes further still: it trains on an organization's own asset and application data, attack telemetry, environmental context and controls, and remediation history, then measures its performance against observed exploitation.
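One way to picture the layering is with a toy function that conditions a global exploitation probability on local facts. This is purely illustrative arithmetic under stated assumptions, not the Local Model's actual method, which is trained on organizational data; the factors and multipliers below are invented for the sketch.

```python
# Illustrative sketch only: how local context could condition a global
# exploitation probability. The factors and multipliers are assumptions,
# not the Local Model's real, trained behavior.
def local_priority(global_prob: float, reachable: bool, mitigated: bool) -> float:
    if not reachable:
        return 0.0          # not reachable in this environment: no local risk
    p = global_prob
    if mitigated:
        p *= 0.1            # assumed dampening from a compensating control
    return p

# Same global score, three different environments, three different answers.
print(local_priority(0.97, reachable=True, mitigated=False))   # 0.97
print(local_priority(0.97, reachable=True, mitigated=True))
print(local_priority(0.97, reachable=False, mitigated=False))  # 0.0
```

The point of the sketch is directional: local context can move a global baseline in either direction, which is exactly what a purely global model cannot do.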

A CVSS 9.8 carrying a 97% probability of exploitation and a CVSS 9.8 carrying a 0.1% probability are not the same animal. As Geer and Aitel’s Lawfare piece explains, density makes that failure of discrimination lethal, because the cost of treating every critical finding as urgent scales linearly with volume while the actual threat does not.
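The discrimination failure is easy to see in a toy triage queue. The CVE identifiers and probabilities below are illustrative placeholders, not real EPSS output:

```python
# Toy triage queue: two findings share a CVSS of 9.8, but their
# exploitation probabilities differ by three orders of magnitude.
# CVE IDs and scores are illustrative placeholders, not real data.
findings = [
    {"cve": "CVE-2025-0001", "cvss": 9.8, "epss": 0.001},
    {"cve": "CVE-2025-0002", "cvss": 9.8, "epss": 0.97},
    {"cve": "CVE-2025-0003", "cvss": 7.5, "epss": 0.42},
]

# Severity-only triage treats both criticals as equally urgent.
by_cvss = sorted(findings, key=lambda f: f["cvss"], reverse=True)

# Probability-first triage surfaces the finding attackers will actually use,
# and demotes the critical that almost certainly never will be.
by_epss = sorted(findings, key=lambda f: f["epss"], reverse=True)

print([f["cve"] for f in by_epss])
# -> ['CVE-2025-0002', 'CVE-2025-0003', 'CVE-2025-0001']
```

Under density, the severity-only queue grows linearly with finding volume; the probability-first queue grows only with the much smaller set of findings likely to be exploited.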

The pointing-machine era is over. The knowing machine is a probability model. The empirical data about density is here.
