STEPHEN OSLER: CrowdStrike outage — a wake-up call for the cybersecurity industry
The scale of the outage has sparked talks about cybersecurity and vendor accountability, and the incident could be a turning point for the industry
26 July 2024 - 05:00
by Stephen Osler
Friday’s global IT outage, triggered by a faulty CrowdStrike update, sent shock waves through the tech world. As the dust settles, we in the cybersecurity industry are taking stock of the incident’s far-reaching implications.
July 19 was one of the busiest days I’ve had in the last 25 years. My first thought was that there were targeted attacks against SA businesses, but ultimately we found it was the global outage caused by the CrowdStrike update.
This incident, described as the largest IT outage in history, affected an estimated 8.5-million Windows devices worldwide. Its impact was felt across multiple sectors, grounding flights, disrupting banking and healthcare services, and causing widespread business interruptions. Early estimates suggest the costs could run into billions of dollars.
As we grappled with the fallout, my team and I were on the front lines, helping clients mitigate, remediate and recover quickly. Most of them had recovered by midmorning.
Clarifying misconceptions
However, a week after the incident, confusion still lingers. The biggest challenge we have seen since Friday is widespread misunderstanding about exactly what went wrong and who was responsible for the outage. Some are still pointing fingers at Microsoft, and that confusion does not help the cause.
As an industry we need a clear understanding of the event’s root causes. This could have happened to anyone. Most major cybersecurity and software vendors have released faulty updates at some stage. This was so significant purely because of the scale of the software deployment and the fact that CrowdStrike has a Microsoft Kernel-Mode Code Signing Certificate.
Having such a certificate shows Microsoft considers the software to be genuine and secure. It allows CrowdStrike to quickly deploy applications into the core of the operating system to address cyber risks. While all IT vendors have encountered problematic files affecting users, the severity of this case was unprecedented. Usually, you simply roll back the deployment, but because this one was running in the kernel it was a tough recovery.
A catalyst for change
The unprecedented scale of the outage has sparked intense discussions about cybersecurity practices, vendor accountability and the risks associated with centralised IT services. I believe this incident could be a turning point for our industry.
Vendor accountability, testing and third-party risk management all come into play. The incident has opened a can of worms, and only in the coming weeks will we be able to answer the questions it raises.
One of the most promising developments emerging from this crisis is the possibility of a new collaborative approach to software testing and deployment. I envision a global testing alliance that could revolutionise the validation of updates before release.
A global outage affecting Microsoft services hit airlines, banks and health systems. Picture: MAILEE OSTEN-TAN/REUTERS
There is the potential for a deployment alliance, where member vendors subscribe to best practice methodologies for testing software updates before deployment. A signing authority could also validate certain procedures. This would show vendor alignment with global best practice, and give assurances to customers.
This concept aligns with our long-standing advocacy for a collaborative defence model in cybersecurity. Such an alliance could greatly reduce the risk of similar incidents in the future while fostering greater trust between vendors and their clients.
The road ahead
The incident has highlighted the delicate balance between responding rapidly to cyberthreats and ensuring system stability. We are so focused on staying ahead of cyber risks that some controls may have gone out of the window.
As the industry moves forward, the lessons learnt from this incident will shape cybersecurity practices for years to come. CrowdStrike has already announced plans to improve its testing procedures and implement a staggered deployment strategy for updates.
The incident is likely to cause some post-traumatic stress disorder in the industry and drive all vendors to be more rigorous about testing. While the full ramifications of the outage are still unfolding, one thing is clear: it has irreversibly altered the cybersecurity landscape.
As organisations worldwide re-evaluate their IT strategies and vendors revamp their processes, our industry is ready for a new era of collaboration, accountability and resilience.
• Osler is cofounder and business development director at Nclose.
Published by Arena Holdings and distributed with the Financial Mail on the last Thursday of every month except December and January.