The Flaw of Averages in Averaging Flaws
Dr. Sam Savage became well known when he coined a mainstream term for a concept stats nerds had long been familiar with: the Flaw of Averages. Here’s how he explains it:
“Consider the case of the statistician who drowns while fording a river that he calculates is, on average, three feet deep. If he were alive to tell the tale, he would expound on the “flaw of averages,” which states, simply, that plans based on assumptions about average conditions usually go wrong.”
Seems absurd, but if you know what to look for, I guarantee you’ll see the flaw of averages crop up all the time. As Dr. Savage states, “This basic but almost always unseen flaw shows up everywhere in business, distorting accounts, undermining forecasts, and dooming apparently well-considered projects to disappointing results.”
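If the river story feels contrived, a few lines of code make the point concrete. Here’s a minimal simulation of that crossing, with invented numbers that have nothing to do with the research data:

```python
import random

# A hypothetical river: depth varies from spot to spot, but the
# average works out to roughly three feet. (Invented numbers.)
random.seed(42)
depths = [random.uniform(0.5, 5.5) for _ in range(10_000)]  # mean ~3 ft

avg_depth = sum(depths) / len(depths)
# Say anything over 5 ft is over our statistician's head.
over_head = sum(d > 5.0 for d in depths) / len(depths)

print(f"Average depth: {avg_depth:.2f} ft")     # ~3 ft, just as he calculated
print(f"Spots over his head: {over_head:.1%}")  # decidedly not zero
```

The average is exactly as advertised, yet roughly one crossing point in ten is over his head. Plan for the average, and the tails will get you.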
So, does it show up in vulnerability management? You betcha. In fact, the flaw of averages quickly reared its head as we tried to lay out the average, or typical, sequence of events surrounding security flaws in the sixth volume of our Prioritization to Prediction research series. I’ll briefly walk through what we learned so that nobody drowns in a river of vulnerabilities that’s, on average, three feet deep.
First, some context. We collected dates for key milestones in the lifecycle of 473 vulnerabilities with known exploits in the wild. Those milestones include:
- CVE published: CVE officially added to the CVE List along with the relevant details.
- CVE reserved: Status assigned when the vuln’s validated but details aren’t yet public.
- Exploitation in the wild: Detected attempts to exploit vulns in organizational assets.
- Exploit code: Proof of concept or a working exploit is publicly released.
- Patch available: The vendor has fixed the vulnerability and released a patch.
- Vuln scanners: First detection of the vuln in organizational assets by a vulnerability scanner.
If the ordering of those milestones strikes you as odd, that’s because those bullets are listed in alphabetical order. But what’s the proper, or at least typical, sequence of these events? To answer that, we calculated the median (no, not the average) number of days between the first event and each subsequent event for all the vulnerabilities in our sample. The figure below reveals what we found.
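Before we get to the figure, here’s how that computation looks in code for anyone following along at home. A minimal sketch in Python; the records are hypothetical stand-ins for our dataset (the real study covered 473 CVEs), and the milestone names mirror the bullets above:

```python
from datetime import date
from statistics import median

# Hypothetical records: one dict of milestone dates per vulnerability.
vulns = [
    {"CVE reserved": date(2019, 1, 2), "Patch available": date(2019, 2, 1),
     "CVE published": date(2019, 2, 3), "Exploit code": date(2019, 2, 20),
     "Vuln scanners": date(2019, 2, 25), "Exploitation in the wild": date(2019, 4, 1)},
    # ... more vulnerabilities ...
]

# Days from each vuln's first milestone to every milestone (including itself).
offsets = {}
for v in vulns:
    start = min(v.values())
    for milestone, d in v.items():
        offsets.setdefault(milestone, []).append((d - start).days)

# Median (not mean!) days from the first event to each milestone.
for milestone, days in sorted(offsets.items(), key=lambda kv: median(kv[1])):
    print(f"{milestone:>25}: median day {median(days)}")
```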
The first public record of a vulnerability’s existence almost always occurs in the form of a CVE being reserved. That record doesn’t generally contain much information, but it does get things rolling. Exploitation of that vulnerability in the wild typically comes last in the sequence of events. The average ordering of what happens between those milestones is shown in the figure, but before you start quoting stats, know that it’s flawed.
In reality, the ordering of events in the vulnerability lifecycle isn’t nearly so…well, orderly. Below you’ll find the top 10 sequences we observed across all flaws in our sample. Only 16% of those we studied followed the most common sequence shown at the top. That by itself tells you a lot about the “average” vulnerability lifecycle: namely, that it doesn’t exist. By the time we get down to the last pattern shown, we’re left with barely over 2% of vulnerabilities, and there are still over 100 unique milestone sequences below that!
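Tallying those sequences is conceptually just counting orderings. A minimal sketch, reusing the same hypothetical record format as the earlier snippet:

```python
from collections import Counter
from datetime import date

# Hypothetical stand-in records, same shape as the earlier sketch.
vulns = [
    {"CVE reserved": date(2019, 1, 2), "Patch available": date(2019, 2, 1),
     "Exploit code": date(2019, 2, 20)},
    {"CVE reserved": date(2019, 3, 5), "Exploit code": date(2019, 3, 10),
     "Patch available": date(2019, 3, 20)},
]

# A vuln's lifecycle "sequence" is just its milestones sorted by date.
sequences = Counter(tuple(sorted(v, key=v.get)) for v in vulns)

total = sum(sequences.values())
for seq, count in sequences.most_common(10):
    print(f"{count / total:5.1%}  {' -> '.join(seq)}")
```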
At this point, it should be clear that there’s really no such thing as the average flaw lifecycle. Attempts like Figure 1 to boil vulnerability disclosure, remediation, and exploitation down to a simple sequence miss the intricacies of what’s really going on.
But so what?! Why are we nitpicking variations in the ordering of events surrounding vulnerabilities? Well, because getting it wrong may make the difference between your vulnerability management program successfully fording a thigh-deep river and drowning in an unexpectedly deep pit. I’ll demonstrate why with a quick example.
Let’s say you look at something like Figure 1 and conclude “Awesome; I have a month to deploy a patch once it’s available before code exploiting the vuln is released. I’ll have the VM team put that on their calendar to get it done.” It’s logical reasoning based on the average order and timeframe of events…but you guessed it…it’s flawed. Figure 3 shows the reality.
Figure 3 traces the timeline of exploit code releases relative to patch availability for all the vulns in our sample. These two milestones represent the basic building blocks for attacker and defender workflows. Attackers add exploit code to their arsenal, and defenders jumpstart remediation through patches.
Per the chart, a quarter of vulnerabilities have exploit code published before a patch is made available, and exploit production ramps up right around patch release. Furthermore, if you’re using patch releases to start the one-month-to-exploit clock ticking, you’re going to be late to the risk remediation party two-thirds of the time. And there’s no such thing as fashionably late in this line of work.
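If you want to run that same sanity check against your own data, the arithmetic is trivial. A minimal sketch, assuming you have exploit-code and patch dates per vulnerability (again, hypothetical pairs, not the study’s data):

```python
from datetime import date

# Hypothetical (exploit_code_date, patch_date) pairs per vulnerability.
pairs = [
    (date(2019, 2, 20), date(2019, 2, 1)),   # exploit code 19 days after patch
    (date(2019, 3, 1), date(2019, 3, 15)),   # exploit code before the patch
    (date(2019, 5, 10), date(2019, 4, 1)),   # exploit code 39 days after patch
]

# Days from patch availability to exploit code release (negative = code first).
deltas = [(exploit - patch).days for exploit, patch in pairs]

before_patch = sum(d < 0 for d in deltas) / len(deltas)
within_month = sum(d < 30 for d in deltas) / len(deltas)  # includes pre-patch code

print(f"Exploit code out before the patch:         {before_patch:.0%}")
print(f"Exploit code out within a month of patch:  {within_month:.0%}")
```

Any vulnerability whose delta is negative or under 30 days is one where the calendar-based plan above has already lost the race.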
If there’s one point the P2P series has hammered home over the last three years, it’s the absolute necessity of focusing remediation efforts on fixing the riskiest flaws first. We’ve seen how that assessment of risk greatly depends on where a vulnerability is in its lifecycle and what’s likely to happen next. And because we can’t count on vulnerabilities to follow the same sequence of events every time, we rely on intelligence to guide our risk assessments and future predictions.
I’ll leave you with a stat to close this out. If you told me that exploit code for a vulnerability just dropped but the patch wasn’t yet available, I’d move up my predicted timetable of exploitation in the wild by 47 days. How can I make such a claim on just one piece of intel like that? Well, this is already a long blog post, so you’ll have to read P2P Vol. 6 to get the answer!
Dr. Wade Baker is the co-founder of the Cyentia Institute. In addition to his role with Cyentia, Wade is a professor in Virginia Tech’s College of Business, teaching in the MBA and Master of IT programs. He’s also proud to serve on the Advisory Boards of the RSA Conference and the FAIR Institute.