AW and AK2 flew to LAX Thursday evening. When they arrived, they had trouble picking up the rental car they reserved, and their hotel couldn't program their room key. I hope they're able to get home on their scheduled flight in a few days.
maschinenbau said: How did you spend your Clownstrike Day? After twiddling my thumbs for two hours at the office, here's what I got up to.
I knew I should've left my work computer on overnight.
My work Windows PC was peacefully hibernating through the whole update mess, so I had no excuse to not work.
Crowdstrike was the title sponsor of the 24h of Spa and the CEO was there racing in the Mercedes.
Every time the Crowdstrike Mercedes was mentioned all I could think of was this.
GameboyRMH said: A thought I keep having:
Decades ago, people used to watch movies like Star Wars Ep. 1 where robot armies or other huge systems are taken out by blowing up a single facility and say things like "that's ridiculous, why would anyone make a huge system like that with a single central point of failure?"
Apparently the answer is "because it's a very profitable way to half-ass something."
Software and hardware on the mothership is how we took down the aliens in Independence Day.
I thought I was not going to be affected by this Crowdstrike mess but I was wrong:
I'm never going to recover from this!
/s
I spent most of my Friday remediating machines affected by this. Unfortunately, most of my users were off Friday, so I expect Monday to suck as well.
porschenut said: Should we all be changing our PC OS to a Linux-type system?
Yes, although I'm not sure how related that is to this incident 🐧
z31maniac said:
GameboyRMH said:
Gary said: I have such negative thoughts about Crowdstrike that I can't even express them right now without shutting down this thread. All I can say is emails and Seth Rich.
All I have to say about that is Rolling Stone, September 2019.
Don't be obtuse, just tell us.
I want to know.
This is tinfoil-hat stuff directly tied to politics, so if, like me, you had roughly no idea what any of this is about, consider yourself fortunate for not having your brain cells exposed to this nonsense and safely ignore it...but if you really want to risk them, you can DM me and see how deep this brain-damaged rabbit hole goes.
I was in Ft Myers for business all week, and got a front row seat to the mess Friday morning.
United did pretty well with it, honestly. They were very open and clear about what was going on and why we were not leaving on time.
My flight out of Ft Myers was *only* about an hour late, and in Houston we had to wait quite a while for a flight crew that was on another late flight. We sat a bit longer waiting for people whose connections were tight. No fun for those who missed their Denver connections, though.
All in all, I was a solid 5 hours late getting home and genuinely relieved to have actually made it.
Turns out this software was also causing a few Linux kernel panics before the disaster with the Windows kernel hit. If that had been a widespread problem it could've taken out more important servers, although unlike Windows, Linux systems generally keep multiple kernels installed that can be selected at boot independently of any full-disk encryption, so recovery would've been relatively quick and simple:
https://www.theregister.com/2024/07/21/crowdstrike_linux_crashes_restoration_tools/
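For anyone curious what that fallback looks like in practice: on a typical Debian/Ubuntu-style install the older kernels just sit in /boot and show up in GRUB's boot menu. Here's a rough little Python sketch (the paths and file naming are an assumption about that layout, nothing CrowdStrike-specific) that just lists what you'd have available to roll back to:

```python
# Rough sketch only: list the kernel images a typical Linux install keeps in /boot.
# Any of these can be picked from GRUB's boot menu, which is why backing out of a
# bad kernel module is usually just "reboot and pick the previous kernel".
# Assumes a Debian/Ubuntu-style layout (vmlinuz-<version> files in /boot).
from pathlib import Path

kernels = sorted(Path("/boot").glob("vmlinuz-*"), reverse=True)
if not kernels:
    print("No kernel images found (different distro layout, or not Linux).")
for k in kernels:
    print(k.name)  # e.g. vmlinuz-6.8.0-39-generic, vmlinuz-6.8.0-38-generic
```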
In reply to johndej:
Crazy that because of all this I found out someone I know had to work directly with the CEO for maybe six months a few years ago. His opinion is that the CEO won't care at all even if he has to leave; his ego is supposedly coated in Teflon when it comes to professional critiques. So if he goes, it's just more time to play.
Also, Microsoft has released a special-purpose boot drive image to fix this issue: https://techcommunity.microsoft.com/t5/intune-customer-success/new-recovery-tool-to-help-with-crowdstrike-issue-impacting/ba-p/4196959
Ran across news confirming that the kernel module/driver files weren't updated and the crash was caused by just an update to the definition files, although they're more complicated than typical AV definitions:
https://www.theregister.com/2024/07/23/crowdstrike_failure_shows_need_for/
Interesting how this affected my workplace. The initial report was that many things weren't working properly, but computers were not crashing.
It seems that only some employees who use public-facing software had crashes. Not sure why? Maybe a different computer setup?
Then a mass notification went out that no further action was needed unless contacted by IT Support.
But then I did get a popup asking me to confirm deletion of some files.
This company's put together an estimate of the damage, $5.4B among Fortune 500 companies:
https://www.parametrixinsurance.com/in-the-news/crowdstrike-to-cost-fortune-500-5-4-billion
That's $10.8M per company on average, which is on the lower end of the ransomware payment ballpark for a big company, although ransomware attacks cause their own downtime as well.
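Quick sanity check on that per-company number, just the headline estimate spread evenly across 500 companies (the real distribution is obviously much lumpier):

```python
# Back-of-the-envelope: Parametrix's $5.4B estimate spread evenly over the Fortune 500.
total_estimate_usd = 5.4e9
companies = 500
per_company = total_estimate_usd / companies
print(f"${per_company / 1e6:.1f}M per company on average")  # ~$10.8M
```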
Ran across news from the class-action shareholder lawsuit against Crowdstrike: they did have some kind of testing in place, and the ill-fated update was run through it, but that system somehow validated the update as safe even though any computer that tried to load the new files would have failed to complete Windows startup:
More details have come out. It sounds like they had a lot of automated unit testing in place, but no full, live testing of the software before it was sent out to production. Each component passed its unit tests, but the unit tests didn't account for how two different components could interact with the updated template instance set (which plays a role similar to the definition file in most AV software) in a way that would cause a crash. So the first time the complete latest version of the software got a practical test - on a huge number of customer PCs, limited only by the capacity of their update servers and the timing of client update checks - the disaster was already underway.
https://www.theregister.com/2024/08/07/crowdstrike_full_incident_root_cause_analysis/
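To make the unit-test-vs-live-test point concrete, here's a toy Python sketch of that failure mode. To be clear, this is not CrowdStrike's actual code; the names, the field count, and the 20-slot table are all made up. It just shows how two components can each pass their own unit tests and still crash the first time they meet a new template instance together:

```python
# Toy illustration (NOT CrowdStrike's code) of "every unit test passed, the
# integration still blew up". Two components are each fine in isolation, but were
# never run together against the new template instance before it shipped.

# Hypothetical new template instance pushed out in a content update.
NEW_TEMPLATE_INSTANCE = {"name": "channel-291", "field_index": 20}

def validate(instance: dict) -> bool:
    """Component A: the validator. Its unit tests only check the shape of the
    data, so this instance passes."""
    return isinstance(instance.get("name"), str) and isinstance(instance.get("field_index"), int)

# Component B: the consumer. Its unit tests only ever fed it instances with
# field_index 0-19, so indexing this 20-slot table never failed in testing.
FIELDS = [f"field-{i}" for i in range(20)]

def consume(instance: dict) -> str:
    return FIELDS[instance["field_index"]]

# The "integration test" happens for the first time in production:
assert validate(NEW_TEMPLATE_INSTANCE)  # the validator waves it through
try:
    consume(NEW_TEMPLATE_INSTANCE)
except IndexError as err:               # the crash no pre-release test ever hit
    print("boom:", err)
```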
So testing was done, but it was all theoretical rather than practical, with no complete VM or sandbox built. Am I reading that right?
I can't believe they didn't use a full testing suite and only unit-tested individual components, unless it was intentional. I don't want to go down that rabbit hole, but skipping end-to-end testing before a release shouldn't happen at a company of that size, given the level their software operates at within the system.