The Infrastructure Paradox
What a $600 Billion Bet on Data Centers Means for Educators
This post follows my standard early access schedule: paid subscribers today, free for everyone on February 24.
I’ve spent much of the past few months watching two AI trends diverge. On one hand, technology providers are breaking ground on massive new data centers. These are facilities that could consume as much electricity as a mid-sized city, enough that some regions are discussing restarting nuclear plants to meet demand. On the other hand, my students’ laptops are quietly gaining the ability to run AI assistants locally, without ever pinging a server.
The scale of the first trend is staggering. The five largest hyperscalers—Amazon, Microsoft, Google, Meta, and Oracle—are projected to spend over $600 billion in 2026, with roughly three-quarters allocated to AI infrastructure. Some analysts have compared this investment, as a percentage of GDP, to the Apollo program or the Interstate Highway System.
But here’s what keeps nagging at me: all of this construction assumes demand for cloud-based AI computation will continue expanding. If a substantial share of AI workloads is instead migrating to the devices people already own, then the infrastructure being financed today may systematically overshoot what the cloud actually needs.
This isn’t merely an investor’s concern. For educators, the question of where AI computation happens has real implications for access, privacy, equity, and how we design learning experiences. If intelligence increasingly lives inside the devices students carry, our policies, budgets, and pedagogical strategies will need to shift accordingly.
The following is an attempt to work through what we know, what remains uncertain, and what educators should watch for.
The Scale of the Bet and the Story Justifying It
What distinguishes this wave of construction is that these aren’t conventional data centers designed for diverse cloud workloads. Industry observers have called them “AI factories”: facilities purpose-built around dense GPU clusters, precision liquid-cooling systems, and the massive power-delivery infrastructure required to run them. The logic driving this construction assumes that artificial general intelligence is imminent, that computing capacity is the new strategic resource, and that whoever accumulates the most compute will dominate the next economic era.
To many, the financing structure underlying this build-out seems problematic. Historically, technology giants funded expansion through their prodigious free cash flow. But the velocity of current spending has outstripped even the most robust cash engines. Hyperscalers raised over $108 billion in debt in 2025 alone, with projections suggesting the sector may issue up to $1.5 trillion in new debt over the coming years. A growing portion of this liability is being channeled through special-purpose vehicles and joint ventures that keep capital expenditure off primary balance sheets while still committing the companies to long-term lease obligations.
Another concerning trend in this build-out is “circular financing.” A substantial portion of the revenue reported by AI cloud providers comes from AI startups that are themselves heavily funded by the hyperscalers. The investment flows in a loop: a hyperscaler invests in an AI company, which uses those funds to purchase compute from the hyperscaler.
The risk is that this loop makes demand look healthier than it actually is, because the revenue is partly funded by the same players booking it. And unlike the telecom bubble, where buyers and suppliers were largely distinct entities, the current cycle involves a closed loop of vendor-financed consumption that can obscure how much of the demand is organic.
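A toy calculation makes the mechanics concrete. The figures below are entirely hypothetical; the point is the loop itself, not the magnitudes. When a provider’s investment in a customer comes back as compute purchases, part of the reported revenue is effectively the provider’s own money making a round trip.

```python
# Toy illustration of circular financing (all figures are hypothetical).
# A hyperscaler invests in an AI startup, and the startup spends a large
# share of that investment buying compute back from its investor.

hyperscaler_investment = 10_000_000_000  # $10B stake in the startup (assumed)
share_spent_on_compute = 0.60            # fraction flowing back as compute purchases (assumed)
other_cloud_revenue = 40_000_000_000     # revenue from unrelated customers (assumed)

vendor_financed_revenue = hyperscaler_investment * share_spent_on_compute
reported_revenue = other_cloud_revenue + vendor_financed_revenue

print(f"Reported cloud revenue:      ${reported_revenue / 1e9:.1f}B")
print(f"Vendor-financed portion:     ${vendor_financed_revenue / 1e9:.1f}B "
      f"({vendor_financed_revenue / reported_revenue:.0%} of the total)")
print(f"Organic (non-circular) part: ${other_cloud_revenue / 1e9:.1f}B")
```

Under these made-up numbers, more than a tenth of the reported revenue is money the provider itself supplied, which is exactly the kind of figure that is hard to see from the outside.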
The Workload Shift That Complicates the Picture
To determine whether these facilities represent a sound investment or an overreach, we need to examine what workloads they’re designed to support. The prevailing narrative so far has been straightforward: training giant AI models requires massive, centralized clusters, while inference—actually running the models—is computationally lighter and can easily be distributed. But the technical basis for this assumption might be changing.
Throughout 2023 and 2024, the bulk of AI compute demand stemmed from training foundation models. These workloads require massive parallelism and ultra-fast interconnects to synchronize gradient updates across thousands of GPUs. However, market analyses indicate that inference is rapidly overtaking training as the primary cost driver, with some projections suggesting that inference will account for the large majority of compute costs by the end of 2026.
This distinction is important because training and inference require different infrastructure profiles. Training demands high-bandwidth memory and ultra-fast networking; inference traditionally prioritizes memory throughput and availability over cluster-wide synchronization. If hyperscalers are constructing “training-class” facilities—which are significantly more expensive because of specialized networking equipment—for workloads that will predominantly be inference, they may be systematically misallocating capital.
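A rough bandwidth estimate shows why the two profiles diverge. In data-parallel training, every GPU must reconcile its gradients with every other GPU on every step, while serving a single inference request involves no such cluster-wide exchange. The model size, precision, and GPU count in the sketch below are illustrative assumptions, not measurements of any real deployment.

```python
# Rough, illustrative estimate of gradient traffic in data-parallel training.
# Model size, precision, and GPU count are assumptions for the sake of the example.

params = 70e9          # a hypothetical 70B-parameter model
bytes_per_grad = 2     # fp16/bf16 gradients: 2 bytes each
num_gpus = 1024        # GPUs that must stay in sync every training step

grad_bytes = params * bytes_per_grad       # gradient volume to reconcile each step
per_gpu_gb = 2 * grad_bytes / 1e9          # a ring all-reduce moves roughly 2x the volume per GPU
cluster_tb = per_gpu_gb * num_gpus / 1e3   # aggregate traffic across the whole cluster

print(f"Gradients per step:            {grad_bytes / 1e9:.0f} GB")
print(f"Traffic per GPU per step:      {per_gpu_gb:.0f} GB")
print(f"Cluster-wide traffic per step: {cluster_tb:.0f} TB")
# An inference request has no equivalent cluster-wide exchange, which is why
# serving can tolerate far cheaper networking than training.
```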
Reasoning-oriented models complicate this picture further. Starting with OpenAI’s o1 and o3 series, AI systems introduced what researchers call “test-time compute.” Unlike standard language models, which generate a response directly without extended deliberation, reasoning models can spend substantially more compute at inference time: generating intermediate work, exploring alternatives, and refining outputs before responding. The practical consequence is that “inference” for some applications can look much more like high-intensity computation than a lightweight afterthought.
A complex reasoning query might consume a hundred times the compute of a standard prompt. This appears to validate the need for high-performance clusters even for inference. And if the future of AI is indeed “agentic,” where autonomous systems perform multi-step workflows requiring continuous reasoning, demand for inference compute could theoretically absorb every watt of capacity currently under construction.
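It helps to put rough numbers on that multiplier. In the sketch below, a standard prompt produces a few hundred visible output tokens, while a reasoning query also generates tens of thousands of intermediate “thinking” tokens before it answers; every generated token costs compute whether or not the user ever sees it. The token counts and the per-token cost are assumptions chosen for illustration, not measured figures.

```python
# Back-of-envelope comparison of standard vs. reasoning-style inference.
# Token counts and the per-token cost are illustrative assumptions only.

COST_PER_1K_TOKENS = 0.01  # hypothetical blended cost per 1,000 generated tokens

def query_cost(output_tokens: int, reasoning_tokens: int = 0) -> float:
    """Every generated token costs compute, whether the user sees it or not."""
    total_generated = output_tokens + reasoning_tokens
    return total_generated / 1000 * COST_PER_1K_TOKENS

standard = query_cost(output_tokens=500)                            # short direct answer
reasoning = query_cost(output_tokens=500, reasoning_tokens=50_000)  # long hidden deliberation

print(f"Standard prompt:  ${standard:.4f}")
print(f"Reasoning prompt: ${reasoning:.4f}")
print(f"Multiplier:       {reasoning / standard:.0f}x")
```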
Why Inference Keeps Drifting Toward the Edge
The scenario I’ve just described, where reasoning and agentic workloads absorb every available data center watt, represents one future, and it is the one the hyperscalers are betting on. But current market trends suggest another. Projections indicate that by 2026, a substantial majority of inference could happen locally on devices—smartphones, laptops, and enterprise edge servers—rather than in the cloud.
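To make that future concrete, the sketch below sends a prompt to a model running entirely on the user’s own machine via Ollama, one of several tools for running open-weight models locally. It assumes Ollama is installed and a model such as llama3 has already been pulled; the request never leaves localhost, which is precisely why the privacy and access implications for classrooms are so different.

```python
# Minimal sketch of on-device inference against a locally running Ollama server.
# Assumes Ollama is installed and `ollama pull llama3` has already been run;
# the request goes to localhost, so nothing leaves the machine.

import json
import urllib.request

def ask_local_model(prompt: str, model: str = "llama3") -> str:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",  # Ollama's default local endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask_local_model("Explain photosynthesis to a ninth grader in three sentences."))
```

For a student, the interaction looks much like a cloud chatbot, but without an account, a network connection, or any data leaving the device.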




