Accurate Representation
Written with Max Voldman of Whistleblower Partners
For most of the last decade, the question we have spent our research time on at Empirical Security is some version of the same one. Given a vulnerability, what is the probability it will be exploited? Given an organization, how much risk does it carry? Given a remediation strategy, how much better does it perform than the alternatives? These are measurement questions. They have answers, even if the answers are imperfect.
The question we want to walk through in this post is different. It is a measurement question dressed up as a legal one, and nobody in either community has tried to answer it with data. Here is the question: when a government contractor self-assesses its cybersecurity posture against NIST SP 800-171, generates a score, and uploads that score to the Supplier Performance Risk System, how far is that score from the truth? And what does that gap cost?
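To make "how far is that score from the truth" concrete, here is a minimal sketch of the gap as a number. It assumes a scoring scheme in which each unimplemented requirement subtracts a weight from a maximum of 110. The requirement labels, the weights, and both sets of findings are hypothetical and exist only for illustration.

```python
# Minimal sketch: the gap between a self-assessed score and a verified one.
# Assumption: each unimplemented requirement subtracts its weight from a
# maximum score of 110. All labels and weights below are hypothetical.
MAX_SCORE = 110


def assessment_score(unimplemented):
    """Return the score given a {requirement_label: weight} dict of gaps."""
    return MAX_SCORE - sum(unimplemented.values())


# What the contractor reported as not yet implemented (hypothetical).
self_reported_gaps = {"REQ-A": 1, "REQ-B": 3}

# What an independent assessment found not implemented (hypothetical).
verified_gaps = {"REQ-A": 1, "REQ-B": 3, "REQ-C": 5, "REQ-D": 5, "REQ-E": 3}

self_score = assessment_score(self_reported_gaps)
verified_score = assessment_score(verified_gaps)
score_gap = self_score - verified_score

print(f"self-assessed: {self_score}, verified: {verified_score}, gap: {score_gap}")
```

Measuring that gap across many contractors, rather than one, is what would turn the legal question into a data question.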
Why this question, and why now
For years, cybersecurity in federal contracting was an operational discipline built on self-assessment, where the consequences of falling short were measured in audit findings. Now the government is focused on contractor fraud involving cybersecurity. In 2021, the Department of Justice launched an initiative to pursue such cases under the False Claims Act (FCA), a law that allows whistleblowers to sue on behalf of the United States and share in any recovery. The DOJ has reiterated these goals many times since, including after the change in administrations. The DOJ recovered roughly $52 million in cybersecurity-related FCA settlements in fiscal year 2025 alone, more than triple the prior year.
The point of these numbers is not the dollar figure; it is the mechanism. None of the cybersecurity FCA cases that have settled required proof of a data breach (and most did not even involve allegations of one). They turned on misrepresentations made to the government: a contractor said it was compliant with a set of measurable controls. It was not. The gap between the statement and the reality was the basis for liability.
If you have followed our work, you will recognize this as a familiar shape of problem. Self-reporting and measurable ground truth diverge, and that divergence is unevenly distributed.
The part that is actually new
At the intersection of the data and the law is the big question: what happens when a contractor adopts the kind of probabilistic, risk-based vulnerability management that we have spent a decade arguing for, and then gets sued under the FCA? The argument cuts both ways.
The defensive argument is straightforward. A contractor running EPSS or a similar probabilistic model can demonstrate that it triaged rationally. It directed a limited remediation budget toward the vulnerabilities most likely to be exploited. It achieved high coverage of exploited vulnerabilities at low effort. It can show, with auditable data, that its definition of "adequate security" was data-driven rather than checklist-driven. If a court evaluates the standard as a risk management standard, this could prove a strong defense.
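"High coverage at low effort" is something a contractor can compute and show. The sketch below assumes a simple strategy (remediate everything at or above a probability threshold) and three metrics: coverage, the share of exploited vulnerabilities that were remediated; efficiency, the share of remediated vulnerabilities that were exploited; and effort, the share of all vulnerabilities remediated. The identifiers, scores, and exploitation outcomes are hypothetical, and the 0.10 threshold is an arbitrary choice for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vuln:
    vuln_id: str      # hypothetical identifier
    score: float      # model-estimated exploitation probability, 0..1
    exploited: bool   # whether exploitation was later observed


def evaluate_strategy(vulns, threshold):
    """Score a 'remediate at or above threshold' strategy against outcomes."""
    remediated = [v for v in vulns if v.score >= threshold]
    exploited = [v for v in vulns if v.exploited]
    hits = [v for v in remediated if v.exploited]
    return {
        # share of exploited vulnerabilities that the strategy remediated
        "coverage": len(hits) / len(exploited) if exploited else 0.0,
        # share of remediated vulnerabilities that turned out to be exploited
        "efficiency": len(hits) / len(remediated) if remediated else 0.0,
        # share of the whole backlog that had to be remediated
        "effort": len(remediated) / len(vulns) if vulns else 0.0,
    }


# Hypothetical backlog of ten vulnerabilities.
SAMPLE = [
    Vuln("VULN-0001", 0.92, True),
    Vuln("VULN-0002", 0.40, True),
    Vuln("VULN-0003", 0.15, False),
    Vuln("VULN-0004", 0.08, True),
    Vuln("VULN-0005", 0.03, False),
    Vuln("VULN-0006", 0.02, False),
    Vuln("VULN-0007", 0.01, False),
    Vuln("VULN-0008", 0.01, False),
    Vuln("VULN-0009", 0.005, False),
    Vuln("VULN-0010", 0.001, False),
]

result = evaluate_strategy(SAMPLE, threshold=0.10)
print(result)
```

In this made-up backlog, the strategy remediates three of ten vulnerabilities and catches two of the three that were exploited. The one it misses, scored at 0.08, is exactly the kind of documented trade-off the next argument turns on.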
The offensive argument is harder to escape. The FCA has a knowledge requirement (sometimes called “scienter”). Mere mistakes or negligence are not punishable under the law, but knowing misrepresentations to the government are. A contractor that runs EPSS, sees a vulnerability on a CUI-handling system with a 15% exploitation probability, logs the score, decides not to remediate, and then signs a CMMC affirmation stating it is compliant is walking a fine line. It has produced a documented, quantified decision to accept a known risk on a government system. A whistleblower with access to the dashboard has an excellent knowledge argument if those risks were not honestly shared with the government.
So: does better data make you safer, or does it make your noncompliance clearer to prosecutors?
We do not know yet. The specific case law is not settled, though there are clear risks. The compliance regime was not designed with probabilistic risk models in mind. But adopting a threshold, sticking to it, and being clear and upfront about which model and which risk tolerance threshold an organization is using is far more defensible than a blanket “yes”.
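One way to picture "adopt a threshold, stick to it, and disclose it" is as a small piece of bookkeeping. The sketch below is an assumption about how an organization might record its declared model and threshold and then check its own remediation log against them before anyone signs an affirmation. The class names, field names, and findings are all hypothetical, and nothing here reflects what any regulation requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    model: str        # name of the disclosed risk model, e.g. "EPSS"
    threshold: float  # remediate every finding scored at or above this value


@dataclass(frozen=True)
class Finding:
    finding_id: str   # hypothetical identifier
    score: float      # model score logged when the decision was made
    remediated: bool


def policy_exceptions(policy, findings):
    """Findings at or above the declared threshold that remain open."""
    return [f for f in findings
            if f.score >= policy.threshold and not f.remediated]


def disclosure_summary(policy, findings):
    """Summarize the declared policy and any departures from it."""
    exceptions = policy_exceptions(policy, findings)
    return {
        "model": policy.model,
        "threshold": policy.threshold,
        "findings_reviewed": len(findings),
        "open_at_or_above_threshold": [f.finding_id for f in exceptions],
        "consistent_with_policy": not exceptions,
    }


policy = Policy(model="EPSS", threshold=0.10)
findings = [
    Finding("F-001", 0.15, remediated=False),  # above threshold, left open
    Finding("F-002", 0.40, remediated=True),
    Finding("F-003", 0.02, remediated=False),  # below threshold, accepted
]

summary = disclosure_summary(policy, findings)
print(summary)
```

Here the summary flags F-001 as open above the declared threshold. The design choice is that the exception list is produced by the same record that states the policy, so the disclosure and the data cannot quietly drift apart.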
What we are proposing
Define "adequate security" as a measurable, outcome-based standard rather than a binary control-implementation checklist. The CMMC affirmation should be backed by observable telemetry, not by a senior official's signature on a self-generated score. Compliance would then be measured against the same data the security team is already collecting.
This collapses the paradox. When the risk model is the compliance standard, running the model and demonstrating compliance are the same act. The dashboard that today might be evidence of deliberate ignorance becomes, instead, the evidence of compliance.
This is not a complete proposal. It does not address the personnel, physical, and policy controls that are harder to measure empirically. It will require coordination between DOJ, contracting agencies, and the contractor community. We do not think it solves everything. We do think it is a better starting point than the honor system that has produced the current enforcement curve.