Simulating the Adversary
Achieving Data Center Operational Resilience With Intelligence-Led Stress Testing
December 19, 2025
Cyber attacks are an overlooked risk to data centers, and “as operational technology becomes increasingly network-connected, reliability depends as much on cybersecurity as on mechanical design.”1 Because attacks can be launched remotely and covertly, exploitation of the networks and systems that form the foundation of a data center may go undetected until it is too late, especially as threat actor tactics continue to grow in sophistication.
Keeping pace with evolving threats to data centers requires thorough protection strategies that extend beyond traditional compliance. In modern artificial intelligence (“AI”) data centers, a cyber attack is not just a matter of data theft; it can be a kinetic event. Infiltrating the cooling systems of an AI cluster operating at 90% capacity can cause physical fires and catastrophic hardware failure in a matter of minutes.
In addition, a growing portion of the risk sits within operational technology (“OT”) systems such as Building Management Systems (“BMS”), Electrical Power Monitoring Systems (“EPMS”), power controls, water-cooling infrastructures, and generator automation. These systems are increasingly network-connected and often managed by third parties, making them a high-value target for adversaries. A compromise of OT can be far more consequential than a traditional IT attack, because it enables threat actors to induce physical disruption, not just data exposure.
Combating these risks and building resilience demands continuous security planning across three domains: physical, cyber, and supply chain. To achieve true operational resilience, data centers must shift their focus from theoretical security compliance to proving they can withstand real-world disasters. This requires a move toward intelligence-led, adversarial stress testing, an offensive security method that goes beyond standard security checks by actively simulating a determined attacker trying to disrupt operations.
Instead of testing components in isolation, this approach intentionally stresses the facility’s physical and digital systems to identify breaking points before a crisis occurs. By verifying that a failure in one area, such as a data center’s cooling sensor or power controller, remains isolated and does not cause a cascading, facility-wide outage, operators can prove they have effectively limited the blast radius of an attack. Ultimately, these tests measure an ability to defend the bottom line of a business, ensuring that even under severe pressure, the absolute minimum service levels required to prevent catastrophic financial and reputational loss are maintained.
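The idea of limiting the blast radius can be sketched as a simple graph traversal. The example below is a minimal illustration, not a description of any particular testing tool: the component names, the dependency map, and the `isolated` parameter are all hypothetical, and a real facility model would be far richer.

```python
from collections import deque

# Hypothetical dependency map (illustrative only): each key feeds the
# components listed as its value, so a failure can propagate downstream.
FEEDS = {
    "cooling_sensor": ["chiller_controller"],
    "chiller_controller": ["cooling_loop_a"],
    "cooling_loop_a": ["gpu_hall_1"],
    "power_controller": ["pdu_a"],
    "pdu_a": ["gpu_hall_1", "gpu_hall_2"],
    "gpu_hall_1": [],
    "gpu_hall_2": [],
}


def blast_radius(graph, failed, isolated=frozenset()):
    """Return every component affected when `failed` goes down.

    Components in `isolated` are assumed to have an independent fallback,
    so a failure does not propagate into or through them.
    """
    affected = {failed}
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for downstream in graph.get(node, []):
            if downstream in isolated or downstream in affected:
                continue
            affected.add(downstream)
            queue.append(downstream)
    return affected


# Without isolation, a single sensor failure reaches a GPU hall.
unprotected = blast_radius(FEEDS, "cooling_sensor")
# With the chiller controller isolated, the failure stays contained.
contained = blast_radius(FEEDS, "cooling_sensor", isolated={"chiller_controller"})
```

Comparing the two results shows, under these assumptions, how one isolation point changes the outcome from a failure that reaches compute to one that stays with the sensor.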
Emerging Risks and Heightened Expectations
Threat actors often look for the easiest way to gain access to their target, and third-party vulnerabilities are an obvious place to begin. For data centers, these vulnerabilities often sit with connected vendors in construction and operations (e.g., providers of HVAC, cooling systems, BMS, and generators) that have access to data centers but may have weaker security controls of their own.
Beyond this low-hanging fruit, supply chain concerns can extend to foreign-manufactured components. These components can be designed with backdoors that give manufacturers direct access to sensitive and valuable U.S. intellectual property. Such backdoors can persist even after multiple system resets, providing threat actors with unfettered access.
Another overlooked exposure occurs during the construction and commissioning phases. Vendor firmware, Programmable Logic Controllers (“PLCs”), installer configurations, and subcontractor laptops can introduce vulnerabilities months before the data center becomes operational.
The construction supply chain is also a target for threat actors seeking an easier entry point. Depending on their desired outcome, they may want to establish access to monitor activity, steal information, or delay builds by sabotaging construction efforts. For example, disrupting the operations of a concrete plant can delay the concrete pours a data center build depends on, while taking a materials shipping company offline can halt deliveries. Both scenarios create significant downstream impacts.
Compounding the threat to data centers is concentration risk across suppliers and skilled labor pools. Once a threat actor has gained access to one vendor, it becomes easier to leverage that access to gain entry elsewhere, moving laterally through systems and escalating privileges along the way. Separately, because provider options are limited to a select few, a disruption affecting these concentrated resources could hit multiple data centers simultaneously, magnifying security risks and underscoring the importance of cyber resilience and contingency plans.
As a result of these myriad risks, regulatory scrutiny is increasing, both for the security of systems and for personnel screening. Regulators want to ensure that proper protections are in place, that organizations can demonstrate the effectiveness of cybersecurity controls, and that the people responsible for executing corresponding tasks and overseeing programs are legitimate. Heightened expectations carry the potential for enforcement actions and penalties, presenting additional risks that data center operators must consider.
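Concentration risk of this kind can be made visible with a basic inventory check. The following is a minimal sketch, assuming an operator keeps a mapping of facilities to their vendors; the facility names, vendor names, and the `threshold` parameter are hypothetical.

```python
from collections import defaultdict


def shared_vendors(facility_vendors, threshold=2):
    """Return vendors that serve at least `threshold` facilities.

    `facility_vendors` maps a facility name to the set of vendors it relies on.
    The result maps each concentrated vendor to the sorted list of facilities
    that would be exposed if that vendor were compromised or disrupted.
    """
    exposure = defaultdict(set)
    for facility, vendors in facility_vendors.items():
        for vendor in vendors:
            exposure[vendor].add(facility)
    return {
        vendor: sorted(facilities)
        for vendor, facilities in exposure.items()
        if len(facilities) >= threshold
    }


# Hypothetical inventory, for illustration only.
inventory = {
    "site_east": {"hvac_co", "bms_vendor", "genset_inc"},
    "site_west": {"hvac_co", "bms_vendor"},
    "site_north": {"hvac_co", "other_bms"},
}
concentrated = shared_vendors(inventory)
```

In this illustrative inventory, one HVAC vendor touches all three sites, which is the kind of single point of exposure a stress test scenario would be built around.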
Benefits of Enhanced Stress Testing
Threat intelligence-led testing, which involves simulating a cyber attack to reveal vulnerabilities, allows for the early identification of gaps and determines which areas should be the focus of improvement efforts. This approach simulates real-world threats by leveraging the same tools and tactics used by actual adversaries. Rather than testing systems or networks in isolation, a threat-led approach conducts an end-to-end simulation that evaluates people, processes, and technology against attacks observed in the wild. This methodology identifies gaps and refines crisis response capabilities, so that real crises can be managed in a timely manner and the organization continues to operate within its determined risk appetite. Testing can be augmented with AI tools that simulate sophisticated, evolving attack patterns human red teams might not conceive, increasing the intensity and inventiveness of the test.
Integrated threat intelligence-led penetration testing and business response testing build organizational muscle memory for crisis response by applying the organization-specific results of the stress test. Testing single components in isolation provides a false sense of security. Simulating multi-point attacks (simultaneous strikes designed to overwhelm defenders and expose operational blind spots) forces teams to break from siloed thinking and naïve crisis response habits, leaving data centers better prepared for a crisis, reducing downtime, and supporting business continuity. Disrupting the status quo lets unconventional strategies emerge, helping ensure elite performance when real-world crises occur.
Additionally, conducting these tests during the design phase of a data center, rather than at the operational stage, allows proactive risk management to be implemented. Stress testing helps identify risks before they can be exploited and supports a secure-by-design approach, integrating security measures from the outset, when impactful change is still possible, rather than adding them piecemeal after construction is complete. This methodology can provide significant cost efficiencies while also mitigating vulnerabilities and helping meet regulatory expectations.
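Why single-component tests can mislead is easy to show with a toy model of redundancy. The sketch below assumes a hypothetical facility in which each service stays up as long as any one of its redundant feeds survives; the service and feed names are illustrative, not drawn from a real design.

```python
from itertools import combinations

# Hypothetical redundancy model: a service stays up if ANY listed feed is up.
REDUNDANT_FEEDS = {
    "cooling": {"chiller_a", "chiller_b"},
    "power": {"utility_feed", "generator"},
}


def services_down(failed):
    """Return the services that lose every one of their feeds."""
    failed = set(failed)
    return sorted(s for s, feeds in REDUNDANT_FEEDS.items() if feeds <= failed)


def outage_scenarios(k):
    """Enumerate every combination of k simultaneous failures that causes an outage."""
    components = sorted(set().union(*REDUNDANT_FEEDS.values()))
    scenarios = {}
    for combo in combinations(components, k):
        down = services_down(combo)
        if down:
            scenarios[combo] = down
    return scenarios


single_point = outage_scenarios(1)  # every single-failure test passes
multi_point = outage_scenarios(2)   # paired failures expose the outages
```

In this toy model no single failure causes an outage, so component-by-component testing reports a clean result, while two simultaneous failures take down cooling or power. Real multi-point exercises also test people and processes, which a model like this does not capture.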
It is worth noting that data centers will need to overcome certain operational challenges while strategizing for better offensive security testing. For systems that are too critical to break in real life, the use of digital twins is recommended.2 Testing should happen in these high-fidelity simulations to predict how failures propagate, without disrupting critical dependencies.
Strengthening Security Standards
While a secure-by-design approach is not currently mandated for data centers, establishing baseline security standards during the design phase will help significantly enhance data center protections across the board.
In a similar vein, security standards for data centers should be elevated to match the more robust requirements imposed upon certain critical infrastructure providers. Data centers support essential services such as financial systems, telecommunications, and healthcare, yet do not face the same stringent obligations as other entities on which society similarly depends.
For instance, U.S. nuclear facilities have rigorous security requirements and as a result are “considered among the most secure of the nation’s critical infrastructure.”3 Given the importance of data centers to everyday life, and the relative lack of attention paid to fixable vulnerabilities, these same baseline requirements should be extended to data centers to ensure their security as well.
Although stricter data center obligations may slow construction speeds, often a key priority for owners, the industry must embrace this shift to effectively address and mitigate growing risks. Operational security is vital for data centers, but addressing threats before they have a chance to disrupt operations must also be a priority, specifically regarding supply chain security, which includes the labor force.
Power distribution, BMS, and liquid cooling infrastructures are all managed by interconnected OT, and previous attacks aimed at physical disruption rather than data theft have shown how damaging the outcomes can be, e.g., the Stuxnet attack on nuclear centrifuges and the Triton malware at a petrochemical plant. In a data center, a similar intrusion into cooling systems or power controls could override safety protocols, cut generators, or manipulate chiller setpoints. For healthcare, this means surgeries disrupted; for finance, trades frozen; for AI, millions of dollars in GPUs fried in an instant.
Protecting the full digital ecosystem is essential to ensuring data center resilience, and regulators can aid these efforts by developing requirements for comprehensive third-party risk management programs.
Building Resilience and Elevating Data Center Security
Subjecting data centers to stringent cybersecurity protection requirements will raise the collective bar across the industry and allow risks to be mitigated before they become crises. In the absence of baseline standards, data centers must move beyond check-the-box compliance to build true resilience, which can be achieved through threat intelligence-led stress testing.
Blending proactive security and design principles with robust response capabilities will improve data center resiliency, helping operators develop the ability to withstand a cyber attack while limiting the impact on business continuity. That combination is a legitimate game-changer and value-add for data centers.
Footnotes:
1: Ferrante, Anthony J. and Greg Parker, “The Hidden Risk for Data Centers That No One is Talking About,” FTI Consulting (September 1, 2025).
2: “Digital Twins,” National Institute of Standards and Technology (accessed December 3, 2025).
3: “Backgrounder on Nuclear Security,” United States Nuclear Regulatory Commission (November 2024).