
AI Data Centres: The Hidden Economics of Maintenance, Cyber Resilience, and Capital Protection

  • SEMNET TEAM
  • Jan 9

A White Paper


By Jacob Ukelson, CTO, Netformx and Melvin Chan, CEO, SEMNet
Published 5 January 2026

Executive Summary


AI data centres are among the most capital-intensive infrastructure assets ever built, yet most investment and operating models dangerously underestimate the impact of maintenance failure, lifecycle drift, and cyber exposure on long-term returns. An unprecedented influx of new data centres is also imminent, which will intensify competition and put pressure on the operating costs of existing facilities. Unlike traditional data centres, AI facilities fail catastrophically: a single unsupported firmware version, end-of-life network device, or lateral cyber breach can invalidate training runs, strand GPUs, expose proprietary data, and permanently impair ROI.


This white paper argues that maintenance intelligence and cryptographic control are no longer operational afterthoughts but core capital-protection mechanisms. By combining Netformx’s continuous infrastructure lifecycle assurance with Vaultrex’s assume-breach cryptographic data governance, AI data centres can materially reduce capex obsolescence, lower ongoing opex, and preserve availability, security, and economic value across the full lifecycle of AI infrastructure investments. These assurances also allow a data centre to command premium pricing over competitors that cannot offer them.



The global race to build AI data centres has been framed almost entirely around scale: more GPUs, denser racks, faster interconnects, and cheaper power. Governments speak of sovereign AI. Hyperscalers speak of AI factories. Investors speak of capacity, yield, and time-to-market.


Yet beneath this dominant narrative lies a critical omission. AI data centres are not software platforms that fail gracefully. They are capital-intensive cyber-physical systems with highly specialised components, tight operational tolerances, and irreversible failure modes. When things go wrong, the consequences are not measured in minutes of downtime but in revenue lost through wasted training cycles, stranded GPUs, breached data assets, and permanent erosion of competitive advantage.


This is where the AI data-centre story is incomplete. Everyone talks about building AI infrastructure. Almost no one talks about how it must be maintained, governed, and defended continuously to preserve its economic value and ROI.


AI data centres represent one of the largest concentrations of capital expenditure ever deployed into operational infrastructure. GPUs are supply-constrained. Cooling and power systems are finely balanced. East-west traffic dominates. Firmware compatibility, lifecycle alignment, and configuration integrity matter in ways that traditional data centres never experienced. In this environment, maintenance failures and cyber exposures do not merely coexist; they compound each other.


A single unsupported switch firmware, an end-of-life aggregation device, or a misconfigured management plane can silently undermine both availability and security. Worse, such weaknesses often remain invisible until they are exploited by attackers, by cascading faults, or by routine operational changes that exceed hidden tolerances.


Traditional approaches to data-centre infrastructure management are structurally unfit for this reality. Periodic audits, static Configuration Management Databases (CMDBs), annual penetration tests, and siloed operational tools were designed for environments where infrastructure evolved slowly and failures were recoverable. AI data centres evolve continuously, and failures are often irreversible. Once a training run is invalidated or sensitive training data is exposed, there is no rollback.


What is required is a new operating model—one that treats maintenance posture, lifecycle integrity, and cyber exposure as a single, continuously governed system rather than as separate disciplines.


This is the context in which Netformx and Vaultrex must be understood.


Netformx is not simply a network assessment or lifecycle management tool. In AI data centres, it functions as an infrastructure assurance layer. It builds a continuously updated, deeply contextual model of the physical and virtual network fabric: switches, routers, firewalls, load balancers, virtual overlays, firmware versions, module dependencies, support status, and end-of-life timelines. Crucially, it does not merely inventory assets; it exposes where lifecycle drift, unsupported components, and configuration weaknesses intersect with critical data paths.


In AI environments, these intersections matter disproportionately. A top-of-rack switch serving GPU clusters is not equivalent to a peripheral device, even if both appear similar in a CMDB. An aggregation switch that sits on dozens of east-west paths is not just another network node; it is a single point through which training data, model checkpoints, and orchestration traffic flow. Netformx reveals these relationships and ties them directly to maintenance risk, obsolescence risk, security risk and operational fragility.
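The idea of tying lifecycle status to path criticality can be illustrated with a minimal sketch. This is not Netformx's data model or API; the `Device` fields, the device names, the dates, and the 365-day horizon are all hypothetical, chosen only to show why two devices that look alike in a CMDB can carry very different risk.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Device:
    name: str
    role: str             # e.g. "tor", "aggregation", "peripheral" (hypothetical labels)
    end_of_support: date  # vendor end-of-support date (hypothetical field)
    gpu_paths: int        # number of east-west GPU paths crossing this device


def lifecycle_risk(devices, today, horizon_days=365):
    """Return devices that carry GPU traffic and lose vendor support within
    the horizon, most heavily traversed first."""
    flagged = [
        d for d in devices
        if d.gpu_paths > 0 and (d.end_of_support - today).days <= horizon_days
    ]
    return sorted(flagged, key=lambda d: d.gpu_paths, reverse=True)


# Illustrative inventory; every entry is made up for the example.
inventory = [
    Device("tor-17", "tor", date(2026, 6, 30), gpu_paths=8),
    Device("agg-02", "aggregation", date(2026, 3, 31), gpu_paths=48),
    Device("edge-05", "peripheral", date(2026, 2, 28), gpu_paths=0),
    Device("agg-03", "aggregation", date(2029, 1, 31), gpu_paths=48),
]

at_risk = lifecycle_risk(inventory, today=date(2026, 1, 5))
for d in at_risk:
    print(d.name, d.role, d.end_of_support.isoformat(), d.gpu_paths)
```

In this toy inventory the peripheral device reaches end of support first, yet it is the aggregation switch that ranks highest, because it sits on far more GPU paths.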


However, maintenance intelligence alone is insufficient. AI data centres operate under an “assume breach” reality. Once an attacker gains an initial foothold—through a compromised workload, a partner connection, or a misconfigured interface—the real risk lies in silent lateral movement. Traditional perimeter defences offer little protection in environments dominated by east-west traffic and software-defined segmentation.


This is where Vaultrex fundamentally changes the security model.


Vaultrex is not positioned as another encryption feature or security add-on. It represents a cryptographic control plane for AI data centres. Data is encrypted at rest by default and decrypted only under tightly controlled, auditable conditions. Plaintext exposure is minimised by design. Access is governed cryptographically rather than implicitly trusted based on network location.


In AI data centres, this matters because training data and models are not merely sensitive—they are economically irreplaceable. A silent exfiltration of training datasets or model artefacts does not trigger alarms in traditional security systems, yet it permanently erodes competitive advantage. Vaultrex ensures that even if infrastructure is traversed, data remains cryptographically governed rather than opportunistically exposed.


The true power of Netformx and Vaultrex emerges when they are combined into a single operating discipline aligned with Continuous Threat Exposure Management (CTEM). This is not CTEM as a generic security framework, but CTEM adapted for the economics of AI infrastructure.


In this unified model, discovery is continuous and comprehensive, covering not just devices but lifecycle status, supportability, configuration hygiene, and management plane exposure. Diagnosis goes beyond listing vulnerabilities and instead models realistic attack paths through the actual AI data-centre topology, identifying where lifecycle weaknesses create exploitable routes to high-value assets. Decision-making is driven by economic impact: which interventions collapse the most dangerous attack paths while preserving uptime and minimising disruption to AI workloads.


Remediation becomes targeted and capital-aware. Upgrading or replacing a single high-impact device can eliminate dozens of attack paths and materially reduce both cyber and operational risk. Vaultrex ensures that even during periods of change, data remains protected by cryptographic controls rather than procedural assurances. Validation then closes the loop, proving that changes have reduced exposure and that lifecycle posture has improved measurably.
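A minimal sketch of the prioritisation step, under strong simplifying assumptions: attack paths are given as plain lists of device names, a remediated device is treated as removing every path that crosses it, and all names below are hypothetical. It is meant only to show how one device can account for most of the exposure, not how any product computes it.

```python
from collections import Counter

# Hypothetical attack paths: each is the ordered list of hops from an initial
# foothold to a high-value asset.
attack_paths = [
    ["fw-edge", "agg-02", "tor-17", "gpu-pod-a"],
    ["vpn-partner", "agg-02", "tor-18", "gpu-pod-b"],
    ["mgmt-jump", "agg-02", "storage-ckpt"],
    ["fw-edge", "agg-03", "tor-19", "gpu-pod-c"],
]

# Devices with a known lifecycle weakness (unsupported firmware, EoL, ...).
weak_devices = {"agg-02", "tor-19", "mgmt-jump"}


def rank_remediations(paths, weak):
    """Count how many attack paths cross each weak device, highest first.
    Ties are broken alphabetically so the ordering is deterministic."""
    counts = Counter()
    for path in paths:
        for device in set(path) & weak:
            counts[device] += 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def remaining_paths(paths, fixed_device):
    """Paths still open after one device is upgraded or replaced
    (simplification: fixing a device closes every path through it)."""
    return [p for p in paths if fixed_device not in p]


ranking = rank_remediations(attack_paths, weak_devices)
best_device, paths_closed = ranking[0]
print(best_device, paths_closed, len(remaining_paths(attack_paths, best_device)))
```

A real decision would also weigh the cost and downtime of each intervention, which is the "capital-aware" part this sketch leaves out.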


This continuous loop transforms maintenance from a reactive cost centre into a form of capital insurance. It allows AI data-centre operators to demonstrate, quantitatively, that their infrastructure is not only powerful but resilient, governable, and economically protected.


Quantifying the Problem: AI Data-Centre Capex Obsolescence


The figures below are conservative, order-of-magnitude estimates grounded in hyperscale and sovereign AI deployments.


Typical AI Data Centre (Illustrative Baseline)


Assume a 50–100 MW AI data centre:

| Component | Approx. Share of Capex |
| --- | --- |
| GPUs & AI servers | 45–55% |
| Networking (DC fabric, ToR, spine, optical) | 10–15% |
| Power & cooling | 20–25% |
| Facilities & others | 10–15% |

Total Capex: USD $500M – $1.5B (depending on density and geography)



Capex Obsolescence Without Continuous Maintenance Intelligence


Industry evidence (including hyperscalers, HPC clusters, and AI pilots) shows:


  1. Early Obsolescence Rate

    1. 5–10% of network and infrastructure assets become functionally obsolete within 24–36 months, not because they are broken, but because of:

      1. firmware incompatibility

      2. unsupported software

      3. vendor EoL/EoS

      4. inability to meet evolving AI traffic patterns

  2. Hidden Capex Loss

    1. For a $1B AI data centre:

      1. Networking and control-plane capex at 10–15% of total ≈ $120M

      2. Premature obsolescence of 5–10% of that base = $6–12M per year

      3. Over 5 years = $30–60M in avoidable capex erosion

  3. GPU Stranding Effect (Often Ignored)

    1. When network or management-layer failures occur:

      1. GPUs sit idle

      2. training jobs are aborted

    2. A conservative estimate:

      1. 2–4% GPU underutilisation due to infrastructure fragility

      2. On $500M of GPUs → $10–20M/year in wasted capital productivity


Combined capex impact (obsolescence + stranding):

Approximately USD $15–30M per year for a $1B AI data centre
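The arithmetic behind these figures can be reproduced in a few lines. The sketch below uses only the numbers stated above (about $120M of networking and control-plane capex, $500M of GPUs, and the two percentage ranges); the helper name `money_range` is illustrative.

```python
def money_range(base_musd, low_pct, high_pct):
    """Apply a percentage range to a base amount (USD millions)."""
    return (base_musd * low_pct / 100, base_musd * high_pct / 100)


NETWORK_CAPEX_MUSD = 120  # networking + control-plane capex for a $1B site
GPU_CAPEX_MUSD = 500      # GPU capex used in the stranding estimate

obsolescence = money_range(NETWORK_CAPEX_MUSD, 5, 10)  # per year
stranding = money_range(GPU_CAPEX_MUSD, 2, 4)          # per year
combined = (obsolescence[0] + stranding[0], obsolescence[1] + stranding[1])
five_year_obsolescence = (obsolescence[0] * 5, obsolescence[1] * 5)

print("obsolescence per year:", obsolescence)
print("GPU stranding per year:", stranding)
print("combined per year:", combined)
print("obsolescence over 5 years:", five_year_obsolescence)
```

Summing the two ranges gives $16–32M per year, in line with the rounded $15–30M headline figure.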


Quantifying Opex Leakage Without Netformx


Opex Inefficiencies Observed

Without continuous lifecycle and exposure intelligence:

  • Reactive maintenance

  • Emergency patching

  • Unplanned downtime

  • Extended mean-time-to-repair (MTTR)

  • Manual audits and compliance work


Conservative opex leakage estimates:

| Opex Area | Typical Impact |
| --- | --- |
| Emergency remediation & firefighting | +10–15% |
| Downtime & aborted training runs | +5–10% |
| Manual audits, compliance, tooling sprawl | +5% |

For a data centre with $50–100M annual opex:

$10–20M/year in avoidable opex inefficiency
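A short sketch of how the leakage percentages translate into dollars. It assumes, since the base is not stated explicitly, that the three impacts are additive shares of total annual opex; the dictionary and function names are illustrative.

```python
# Leakage ranges from the estimates above, as (low %, high %) of annual opex.
# Assumption: the three areas are additive and apply to total annual opex.
leakage_pct = {
    "Emergency remediation & firefighting": (10, 15),
    "Downtime & aborted training runs": (5, 10),
    "Manual audits, compliance, tooling sprawl": (5, 5),
}

total_low = sum(low for low, _ in leakage_pct.values())
total_high = sum(high for _, high in leakage_pct.values())


def leakage(annual_opex_musd, pct):
    """Avoidable opex in USD millions for a given leakage percentage."""
    return annual_opex_musd * pct / 100


for opex in (50, 100):
    print(f"opex ${opex}M:", leakage(opex, total_low), "to", leakage(opex, total_high))
```

Under that assumption the low-end total of 20% yields $10M on $50M of opex and $20M on $100M, which matches the $10–20M range quoted; the high-end total would put the figure higher still.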

What Netformx (and Vaultrex) Change Quantitatively


Capex Preservation Impact

Netformx delivers:

  • early identification of lifecycle drift

  • targeted upgrades instead of blanket refreshes

  • avoidance of “forced obsolescence”


Conservative impact:

  • Reduce premature infrastructure replacement by 30–50%

  • Preserve $10–30M in capex over 5 years for a $1B AI data centre
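As a rough check, and assuming the 30–50% reduction applies to the $30–60M of five-year capex erosion estimated earlier, the bounds work out as:

```latex
\[
0.30 \times \$30\text{M} = \$9\text{M},
\qquad
0.50 \times \$60\text{M} = \$30\text{M}
\]
```

That is, roughly $10–30M preserved over five years.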


GPU Productivity Uplift

By stabilising:

  • east-west paths

  • firmware compatibility

  • management-plane integrity


Netformx reduces GPU idle time caused by infrastructure issues.


Conservative impact:

  • Recover 1–2% GPU utilisation

  • Equivalent to $5–10M/year in capital productivity


Opex Reduction Impact

With continuous visibility and prioritised remediation:

  • Fewer emergency changes

  • Shorter MTTR

  • Less manual audit effort


Conservative impact:

  • 15–25% reduction in infrastructure-related opex inefficiencies

  • Equivalent to $5–10M/year for a large AI data centre


Vaultrex’s Quantifiable Contribution (Risk-Adjusted)


Vaultrex primarily protects against downside risk, which is harder to price but critical. As the models created in AI data centres become more valuable, criminals will turn to more advanced methods to obtain them, including rogue physical taps into the infrastructure. Vaultrex counters this risk by ensuring that high-value model data is encrypted and inaccessible to criminals and competitors. It also ensures that even if a tap is discovered, the data itself remains secure.


Vaultrex’s contribution:

  • Reduces probability of:

    • catastrophic data leakage

    • model theft

    • regulatory shutdowns

  • Converts existential cyber risk into quantifiable residual risk


A single major breach in an AI data centre can conservatively cost:

$50–200M (data loss, retraining, legal, reputational damage)


Vaultrex does not eliminate risk, but materially lowers the tail risk, which directly improves:

  • insurability

  • financing terms

  • sovereign/regulatory confidence
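One common way to express tail risk in terms that insurers and lenders use is expected annual loss: breach probability multiplied by breach cost. The sketch below applies that to the $50–200M cost range above. The two probabilities are purely hypothetical placeholders, not measured values and not claims about Vaultrex's effectiveness; they only show the shape of the calculation.

```python
def expected_annual_loss(breach_probability_pct, loss_low_musd, loss_high_musd):
    """Expected annual loss range (USD millions) for a given annual breach probability."""
    return (
        loss_low_musd * breach_probability_pct / 100,
        loss_high_musd * breach_probability_pct / 100,
    )


LOSS_LOW_MUSD, LOSS_HIGH_MUSD = 50, 200  # major-breach cost range from the text

# HYPOTHETICAL annual breach probabilities, for illustration only.
P_BASELINE_PCT = 4
P_WITH_CONTROLS_PCT = 1

baseline = expected_annual_loss(P_BASELINE_PCT, LOSS_LOW_MUSD, LOSS_HIGH_MUSD)
controlled = expected_annual_loss(P_WITH_CONTROLS_PCT, LOSS_LOW_MUSD, LOSS_HIGH_MUSD)
reduction = (baseline[0] - controlled[0], baseline[1] - controlled[1])

print("baseline expected loss:", baseline)
print("with cryptographic controls:", controlled)
print("risk-adjusted reduction:", reduction)
```

With real probabilities from an insurer or risk assessment substituted in, the same calculation turns an "existential" risk into a residual figure that can be priced.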


As AI data centres proliferate, they will increasingly resemble industrial assets rather than IT environments. Investors, regulators, and sovereign stakeholders will demand proof that these assets are not becoming stranded through neglect, obsolescence, or silent compromise.


Netformx and Vaultrex together define the future operating model required to meet that expectation. They shift the conversation from “how big is your AI cluster” to “how well is your AI capital protected.” They transform AI data centres from collections of expensive machines into continuously governed systems where maintenance intelligence, cyber resilience, and ROI preservation are inseparable.


In the coming phase of AI infrastructure development, the winners will not be those who deploy the most GPUs, but those who ensure their AI data centres remain operationally sound, economically defensible, and strategically secure over time. Netformx and Vaultrex are not peripheral to that future. They are foundational to it.
