
CrowdStrike: The Latest High-Impact Scenario Breaking Down the Silo between IT and TPRM

July 20, 2024

What happened and why?

CrowdStrike is a global cybersecurity provider that supplies endpoint security for the Windows PCs and servers of thousands of firms across all industries and sectors. On the evening of Thursday 18th July (UTC), running into the following Friday morning, it issued a faulty content update to its ‘Falcon’ endpoint security software, which forced affected PCs and servers offline and into a Blue Screen of Death (BSOD) recovery boot loop.

CrowdStrike appears to have identified the issue and reverted the faulty update, which should help to stop the spread, but this does not resolve the issue for machines already affected. IT admins have been discussing workaround steps: essentially, they need to boot each affected machine in safe mode, navigate to the CrowdStrike directory, and delete the faulty system file.
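As a rough illustration of the clean-up step, the sketch below lists (and optionally deletes) files in a directory that match a pattern. The directory and the `C-00000291*.sys` pattern reflect workaround guidance that circulated publicly at the time; neither is stated in this post, so treat both as assumptions and check the vendor's current instructions. The function name `remove_matching_files` is hypothetical, and the function defaults to a dry run.

```python
from pathlib import Path

# Assumed values from publicly circulated workaround guidance, not from this
# post; verify against the vendor's own instructions before use.
ASSUMED_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
ASSUMED_PATTERN = "C-00000291*.sys"


def remove_matching_files(directory, pattern, dry_run=True):
    """Return the sorted names of files in `directory` matching `pattern`.

    Files are deleted only when dry_run is False.
    """
    directory = Path(directory)
    if not directory.is_dir():
        # Nothing to do if the directory is absent on this machine.
        return []
    matches = sorted(p for p in directory.glob(pattern) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return [p.name for p in matches]


if __name__ == "__main__":
    # Dry run: report what would be removed without touching anything.
    for name in remove_matching_files(ASSUMED_DIR, ASSUMED_PATTERN):
        print("would delete:", name)
```

Even in this simplified form, the step has to run locally on each affected machine after it has been booted into safe mode.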

This sounds simple, but for cloud-based servers and teams of remote workers it is going to be quite time-consuming, as it appears to require hands on keyboards, and deleting the system file may leave security gaps. Coincidentally, Microsoft is also working on issues with its Office 365 apps and services due to a “configuration change in a portion of [their] backend workloads.” [1]

When you throw in extra factors like a heatwave and the busiest day for UK flight departures since October 2019 (according to Cirium), you have the perfect storm. This one looks set to have cost CrowdStrike greatly, with approximately $12.5bn (around 15%) wiped off its market value on the Nasdaq stock exchange overnight.

Who was impacted?

Virtually everyone.  

Airlines and airports, banks and financial services firms (including payroll providers), retail firms, transport infrastructure, broadcasters, emergency services, governments, and telecoms and technology providers have all been impacted.

This wasn’t like a cyberattack where only certain firms or industries are targeted, or a reputational crisis hitting a particular industry; this outage ran the whole spectrum of sectors, with no organisation too big or too small. In addition, even those not running on Microsoft or using CrowdStrike are feeling knock-on impacts because their service providers are down.

What can we learn from this incident?

Push the boundaries of what you consider severe yet plausible. Large-scale disruptions with as much of an impact as the CrowdStrike outage are generally dismissed by business teams as too severe or implausible, and too far out of their control to do anything about should they arise. After all, a major third-party provider that underpins all your services having an incident at the same time as the global platform provider that its software runs on sounds a bit implausible. But surely, after the last five years especially, we should have learnt to stop using the word “implausible”, for almost everything is plausible. It’s becoming more and more clear that in many cases, it’s a matter of ‘when’, not ‘if’.

It’s more than just an IT issue. There has been a strong response online reporting this as an IT issue, with the implication that affected organisations are simply waiting to be brought back online, yet the reality of the disruption goes far beyond that. Whilst that attitude has been prevalent over the past decade or so, there is now stronger recognition that swift action is required to mitigate impacts to customers and stakeholders. With a data-driven scenario testing programme in place that scrutinises third-party supplier risks, organisations can be better prepared to navigate disruptions to their important business services and minimise customer impacts.

Regulatory best practices make every organisation stronger, not just the regulated ones. If you’re not regulated, it’s all too easy to do the bare minimum to meet industry standards or your C-suite’s expectations, especially when budgets are tight and resources are limited. Don’t dismiss the requirements listed in regulations, though: treat them as best practice guidelines and try to align to them as much as possible. In doing so, your resilience and response capabilities can be markedly improved. If you’re unsure how Operational Resilience would look for you, do reach out to our advisory team, who can translate it into a methodology that works for you.

And, finally, bring your resilience disciplines together. Outages such as this one are often associated with cyberattacks rather than third-party failures, but this is a supply chain issue as much as it is an IT issue and should be exercised as such. You need the third-party expert in the room to understand a vendor’s vulnerabilities, the IT expert to understand just how exposed your systems are, and the business user to understand the services those systems support and what impact a disruption could have. All too often, these three business functions do not communicate, but you can use Fusion to help bring this crucial collaboration about.


How much has regulation moved the dial in preparing for situations like this?

A lot, but arguably not in enough industries. 

The financial regulators in the UK, Ireland, Europe, Australia, Singapore, Canada, and the US have all been active in mandating that firms must scenario test or exercise against these kinds of crises (whether through Operational Resilience or Third-Party Risk mandates), but it’s a long process to remediate against the vulnerabilities uncovered.  

Some firms may be aware of these single points of failure in their supply chain but be unable or unwilling to introduce redundancy measures. The remediating activity may have been seen as disproportionate to the threat posed (e.g., if they uncovered that their reliance on Microsoft was a single point of failure, it would be disproportionate to have Linux or Apple systems running in the background just in case). Some instead focus their efforts on mitigating the impacts to customers; as we saw today, some firms immediately put manual workarounds in place, but others simply had to cease operations altogether.

That’s not to say that organisations should dismiss redundancy measures out of hand; where viable, they should consider having other systems running in the background or working with multiple providers to help mitigate this type of outage. With increasing numbers of cyberattacks, heavy reliance on a handful of key third-party infrastructure providers, and the speed of impacts ever increasing, investment in alternatives is all the more important.
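One way to make single points of failure visible is to map each important business service to the third-party providers it depends on, then count how many services share each provider. The sketch below is a minimal illustration of that idea, not a description of how any particular product works; the service names, the provider names, and the `shared_providers` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping of business services to the third parties they rely on.
SERVICE_DEPENDENCIES = {
    "online payments": {"endpoint security vendor", "cloud host A"},
    "customer call centre": {"endpoint security vendor", "telecoms provider"},
    "payroll": {"endpoint security vendor", "payroll bureau"},
    "branch tills": {"cloud host A"},
}


def shared_providers(dependencies, min_services=2):
    """Map each provider used by at least `min_services` services to those services."""
    services_by_provider = defaultdict(set)
    for service, providers in dependencies.items():
        for provider in providers:
            services_by_provider[provider].add(service)
    return {
        provider: sorted(services)
        for provider, services in services_by_provider.items()
        if len(services) >= min_services
    }


if __name__ == "__main__":
    concentration = shared_providers(SERVICE_DEPENDENCIES)
    # List the most widely shared providers first.
    ranked = sorted(concentration.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for provider, services in ranked:
        print(f"{provider}: {len(services)} services -> {', '.join(services)}")
```

A provider that appears under most or all services is a candidate for scenario testing, even where full redundancy would be disproportionate.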

What regulations are coming that could help move the dial?

To date, regulators have mostly focused on the financial sector as providing services which, if disrupted, would have the most potential to cause intolerable harm. However, today may have been an apt reminder to other industries that although they don’t have to implement operational resilience programmes, they may wish to.  

We are seeing an increased focus on supply chain resilience and the need for critical third parties to up their resilience programmes. The Digital Operational Resilience Act (DORA) was introduced over the past couple of years in Europe, with the compliance deadline of 17 January 2025 rapidly approaching. Earlier this year, the UK regulators closed the consultation period for their paper on ‘Operational Resilience: Critical third parties to the UK financial sector,’ so we can expect some movement in response to this over the next year.

Cybersecurity authorities in Switzerland and Australia commented today on the large-scale technical outage and may feel inclined to take further action off the back of this disruption. Hopefully, this might be a key moment in time where senior leadership teams begin to recognise the interconnected nature of third-party risk management, cyber resilience, and operational resilience.  

As organisations across all industries react to the CrowdStrike outage and adjust their processes accordingly, they’ll need to ensure that their recovery plans are adequate for the next disruption and that lessons learnt are tracked to completion. They will also need to have the software necessary to streamline their information and provide visibility into their data to identify outstanding vulnerabilities before an actual crisis occurs. It is a defining moment for Boards and executives to recognise that investing in software like the Fusion Framework System and incorporating backup solutions and other workarounds will enable companies to truly continue operating in the face of disruption.  

To get ready for when regulators do bring in extra measures, or simply to implement some best practices, do book a demo or reach out to our team who can advise you on the journey to resilience, underpinned by Fusion. 

Looking for more information on CrowdStrike?

Visit Fusion’s CrowdStrike Content Hub for a complete list of our resources.

Relevant sources / links to learn more
