By now you have no doubt read a lot of bad coverage about the CrowdStrike update that crashed some 8.5 million Windows systems around the world. We have been treated to all sorts of action photos of the “blue screen of death” showing up in airports, across the exterior of the Vegas Sphere, and other places with interactive displays that weren’t interacting with anything. All because of a bad virus definitions file that was out there for little more than an hour on Friday, a file that was placed in a very sensitive area of the Windows OS (I will get to that in a moment) and that was poorly written and inadequately tested.
I say virus definitions file because that is really what it is. Earlier today, CrowdStrike released this after-action report, which is filled with doublespeak, jargon, and a tremendous lack of clarity. What is interesting from a corporate comms perspective is that they explain things in detail that we don’t need and ignore things that we really want to know. What they don’t say can be found on Kevin Beaumont’s blog. Here is what I have gleaned from the sad affair:
- Almost every anti-malware, endpoint detection and response (EDR), SASE, whatever-you-wanna-call-’em vendor has a similar configuration of their products. They have to have this unfettered access to the Windows kernel because that is how they work, and how they have worked since the early days of Norton AntiVirus running on DOS. So just because you use some other vendor’s product doesn’t mean this can’t happen to you.
- These vendors have to update their software constantly to stay ahead of newly found exploits, which means they are pushing files out fairly frequently.
- This means there is a real chance that a badly crafted file gets created in the rush to get these updates out (a sketch of the kind of pre-release check that would catch one appears after this list). It was a small miracle that the file was only online for little more than an hour.
- Beaumont says that we are “handing the keys to the global economy to a small group of private cybersecurity companies with no external governance or assurance. It has always felt sketchy, and today feels very sketchy.” I would agree.
- The incident shows how badly managed CrowdStrike’s DevOps processes were, and how few checks and testing procedures they had in place.
- The incident has now shown all sorts of bad actors how they can crash machines by inserting a file in a similar place. Not a great situation, to be sure.
- If there ever was a time to seriously implement Zero Trust protocols, that time was last week.
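To make that rushed-update point concrete: the cheapest guardrail is a pre-release gate that parses a candidate definitions file exactly the way the shipping agent would, and refuses to publish anything that fails. Here is a minimal sketch in Python. The file format (a magic header plus length-prefixed records) is entirely hypothetical, since CrowdStrike has not documented its channel-file layout; only the gating idea matters.

```python
import struct

MAGIC = b"DEFS"  # hypothetical signature for a definitions file

def validate_definitions(blob: bytes) -> None:
    """Parse a candidate definitions file the way the deployed agent
    would, raising ValueError on the first structural problem."""
    if len(blob) < 8 or blob[:4] != MAGIC:
        raise ValueError("missing or wrong magic header")
    (count,) = struct.unpack_from("<I", blob, 4)
    offset = 8
    for i in range(count):
        if offset + 4 > len(blob):
            raise ValueError(f"record {i}: truncated length field")
        (length,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        if length == 0 or offset + length > len(blob):
            raise ValueError(f"record {i}: length {length} runs past end of file")
        offset += length
    if offset != len(blob):
        raise ValueError("trailing bytes after the last record")

if __name__ == "__main__":
    # A file of null bytes -- which is how some early reports described
    # the bad channel file -- never gets past the gate.
    try:
        validate_definitions(b"\x00" * 1024)
        print("publish")
    except ValueError as err:
        print(f"rejected, do not publish: {err}")
```

A check like this costs milliseconds per build, which is why its absence from a pipeline that feeds kernel-mode code is so hard to excuse.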
CrowdStrike’s blog makes some promises about how this won’t happen again. I wish I could believe them. I also wish I could believe that the vendors they compete with won’t have something similar happen, only next time it will be initiated by a bad actor and not just sloppy coders employed by the security vendors themselves.
Great expository writing by Beaumont. Thank you for the link.
Is it a coincidence that the same George Kurtz, now head of CrowdStrike, was CTO for McAfee when a McAfee AV update borked millions of XP computers in 2010? You’d think he would have learned from the McAfee disaster and put in place stringent test procedures and phased rollout of CrowdStrike updates. Apparently not.
Will there be any big lawsuits coming?
David – you omitted two very big issues.
First is the give-the-update-to-everyone-at-the-same-time approach. This is the work of rank amateurs who do not deserve the jobs they hold. Anything that can crash the OS needs to be released in dribs and drabs, not all at once. Blame for this falls on both CrowdStrike and its customers: even if CrowdStrike has no facility for delaying updates, a client that is on the ball will block the updates to most of its machines until the release has been allowed to run for a while on some small subset of them.
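For what it’s worth, here is a minimal sketch of what “dribs and drabs” looks like as code: a ring-based rollout that halts itself the moment the smallest ring shows trouble. The ring sizes, crash budget, and callbacks are all made up for illustration; this is the gating logic, not anyone’s actual pipeline.

```python
import random

# Hypothetical rollout rings: cumulative fraction of the fleet, smallest first.
RINGS = [("canary", 0.01), ("early", 0.10), ("broad", 1.00)]
CRASH_BUDGET = 0.001  # halt if more than 0.1% of a ring fails its health check

def deploy(hosts, push_update, check_health):
    """Push an update ring by ring, stopping at the first unhealthy ring."""
    random.shuffle(hosts)  # don't let one office be the eternal canary
    done = 0
    for name, fraction in RINGS:
        cutoff = min(len(hosts), max(done + 1, int(len(hosts) * fraction)))
        ring = hosts[done:cutoff]
        if not ring:  # already covered by an earlier ring
            continue
        for host in ring:
            push_update(host)
        crashes = sum(1 for host in ring if not check_health(host))
        if crashes / len(ring) > CRASH_BUDGET:
            print(f"halting rollout: {crashes}/{len(ring)} crashes in ring {name!r}")
            return False
        done = cutoff
    print(f"update reached all {done} hosts")
    return True

if __name__ == "__main__":
    fleet = [f"host-{i}" for i in range(1000)]
    # Simulate an update that blue-screens every machine it touches:
    # it reaches 10 canary hosts instead of the whole fleet.
    deploy(fleet, push_update=lambda h: None, check_health=lambda h: False)
```

Note the asymmetry: a healthy update is delayed by minutes, while a machine-killing one reaches ten hosts instead of 8.5 million.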
Second is the fact that the CrowdStrike software crashed at all. Like the school that little Bobby Tables went to, the software clearly did not validate its inputs.
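And on the Bobby Tables point: even when a malformed file does reach a machine, a parser that treats its input as hostile can fail safe, logging the bad file and keeping the last known-good definitions instead of taking the kernel down with it. Another hypothetical sketch, using a similar invented record format to the one above:

```python
import logging

log = logging.getLogger("agent")

def parse_definitions(blob: bytes) -> list[bytes]:
    """Parse a (hypothetical) definitions file, raising ValueError on
    anything out of bounds instead of blindly indexing into it."""
    if blob[:4] != b"DEFS":
        raise ValueError("bad magic header")
    records, offset = [], 4
    while offset < len(blob):
        length = int.from_bytes(blob[offset:offset + 4], "little")
        offset += 4
        if length == 0 or offset + length > len(blob):
            raise ValueError("record length out of bounds")
        records.append(blob[offset:offset + length])
        offset += length
    return records

def load_update(blob: bytes, last_known_good: list[bytes]) -> list[bytes]:
    """Fail safe: a malformed update is logged and ignored, never fatal."""
    try:
        return parse_definitions(blob)
    except ValueError as err:
        log.error("rejecting malformed definitions file: %s", err)
        return last_known_good

if __name__ == "__main__":
    logging.basicConfig()
    rules = load_update(b"\x00" * 512, last_known_good=[b"old-rule"])
    print(f"agent keeps running with {len(rules)} rule(s)")
```

In kernel-mode code the equivalent discipline is bounds-checking every offset before dereferencing it, which by all accounts is exactly what was missing here.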