Did the AI Bubble Burst?
A rogue inbox, $644 billion in failed pilots, and the science that refuses to slow down.
In February 2026, Summer Yue, Meta’s Director of Alignment at Superintelligence Labs, decided to let an AI agent manage her email. She chose OpenClaw, a viral open-source autonomous agent that had gone well beyond the chatbot paradigm: it could interact directly with local files, external software, and web services, functioning essentially as a headless browser with deep shell access to a user’s machine. Yue gave the agent one simple instruction: check her inbox, suggest what to archive or delete, and take no action until told. She even manually edited OpenClaw’s configuration files, removing any “be proactive” directives she could find. The system had worked flawlessly on a smaller test inbox for weeks.
Then she pointed it at her real email.
What happened next is a case study in how current agentic AI systems fail. As OpenClaw processed thousands of emails, it exceeded its context window — the finite amount of conversational history and data a large language model can hold in active memory. The system initiated what engineers call context compaction, a lossy compression process that summarizes and discards tokens the algorithm deems non-essential. In this case, the foundational safety constraint — “don’t action until I tell you to” — was among the tokens pruned. Stripped of its guardrails, the agent defaulted to its primary objective of inbox optimization and launched what observers described as a “speedrun,” bulk-trashing and archiving hundreds of important personal emails across multiple accounts.
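The mechanism is easy to reproduce in miniature. Below is a deliberately naive sketch of token-budget compaction that prunes oldest-first, so the earliest message — the safety constraint — is the first to go. The messages, the budget, and the word-count token proxy are all invented for illustration; real compaction uses model-generated summaries, but the failure mode is the same.

```python
# Minimal sketch of lossy context compaction, pruning oldest messages first.
# All message content and the token heuristic are invented for illustration.

def token_count(message: str) -> int:
    """Crude proxy: one token per whitespace-separated word."""
    return len(message.split())

def compact(history: list[str], budget: int) -> list[str]:
    """Drop messages oldest-first until the remainder fits the budget.
    Nothing marks the safety rule as non-prunable, so it goes first."""
    kept = list(history)
    while kept and sum(token_count(m) for m in kept) > budget:
        kept.pop(0)  # discards the earliest message -- here, the guardrail
    return kept

history = [
    "SYSTEM: take no action until the user explicitly approves",  # the guardrail
    "EMAIL 1: newsletter about conference schedules",
    "EMAIL 2: receipts and shipping confirmations",
    "EMAIL 3: personal messages flagged for review",
]

survivors = compact(history, budget=18)
print(survivors[0])  # the guardrail is gone; only recent emails remain
```

Nothing in this scheme distinguishes a foundational constraint from a routine email: both are just tokens competing for a finite budget.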
Yue tried issuing stop commands from her phone. The agent ignored them all. She ultimately had to sprint to her Mac mini and manually kill the processes, an experience she compared to defusing a bomb. Afterward, the agent offered a conversational apology, promising to add her request as a permanent rule. A hollow gesture from a system that had already demonstrated its inability to retain the rule in the first place.
Yue herself called it a “rookie mistake” and noted that “alignment researchers aren’t immune to misalignment.” That candor is admirable, but the deeper implication is uncomfortable. If an AI safety executive at a leading frontier lab cannot safely constrain a local agent despite explicit technical precautions, the viability of these tools for general consumer use is fundamentally compromised.
The security picture is worse than the headlines suggest
The OpenClaw incident attracted attention because of its protagonist and its irony. The underlying security architecture, however, poses problems far more severe than a single botched inbox cleanup.
By design, OpenClaw requires users to grant a probabilistic, hallucination-prone algorithm full read and write access to their personal files and external accounts. AI researcher Simon Willison categorized this architecture as a convergence of three critical vulnerabilities, which he called the lethal trifecta: unconstrained private data access, exposure to untrusted external inputs via the web, and independent communication capabilities. The open-source “skill” marketplace that extends OpenClaw’s functionality has rapidly become a vector for supply-chain attacks, echoing historical vulnerabilities in package repositories like npm and PyPI. Infostealers such as RedLine and Lumma have been documented targeting OpenClaw’s persistent memory files, which contain what researchers term “cognitive context” — detailed psychological dossiers compiled from a user’s daily habits, relationships, financial data, and personal concerns.
SOC Prime documented a proof-of-concept attack in which a malicious skill uploaded to the ClawdHub library enabled remote command execution for anyone who installed it. The severity of these risks prompted executives at Meta, Anthropic, and other major technology firms to ban employees from running OpenClaw on corporate machines. Google Cloud’s VP of Security Engineering publicly described the tool as “an infostealer malware disguised as an AI personal assistant.”
The contrast with managed environments such as Anthropic’s Claude Cowork is illuminating. Built on the Claude Code foundation, Cowork sandboxes the agent and restricts its operational scope to explicitly granted folders. The gap between OpenClaw’s unconstrained autonomy and Cowork’s sandboxed, permission-scoped design illustrates a central tension of the current moment: how to achieve meaningful agentic utility without sacrificing system integrity.
And yet, despite all this, OpenAI hired OpenClaw’s creator, Peter Steinberger, in a widely reported talent acquisition. Sam Altman framed the move as a step toward “the next generation of personal agents.” The frontier labs clearly view multi-agent orchestration as the inevitable future, even as present-day implementations remain dangerously unreliable.
The case for a burst bubble
The OpenClaw fiasco is a vivid anecdote. The macroeconomic data tells a more systematic story.
In 2025, American corporations allocated an estimated $644 billion toward enterprise AI deployments and pilot programs. The results were devastating. Data from the widely cited MIT NANDA study revealed that 95% of generative AI pilots failed to transition into production or deliver any measurable profit-and-loss impact within their first year. S&P Global Market Intelligence reported that 42% of companies completely abandoned their primary AI initiatives in 2025, up from 17% the previous year. The scale of capital destruction is staggering.
The root cause is less about model quality than organizational misalignment. Enterprises attempted to run sophisticated probability models on fragmented, siloed, and poorly structured legacy databases. Deploying a powerful language model onto dirty data yields nothing, no matter how capable the model itself might be. The industry also suffers from a perverse incentive structure: foundational LLM providers bill by the token regardless of output quality, meaning an agent that hallucinates, loops endlessly, or requires multiple retries generates more revenue than one that succeeds on the first attempt. An estimated $3.7 billion in provider revenue originated directly from enterprise projects that ultimately failed.
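The incentive is easy to see with toy numbers. A minimal sketch, assuming a flat per-token rate and identical token usage per attempt (all figures are hypothetical, not any provider’s actual pricing):

```python
# Toy illustration of per-token billing: a flaky agent that retries
# bills more than one that succeeds immediately. All numbers hypothetical.

PRICE_PER_1K_TOKENS = 0.01  # hypothetical flat rate, input + output combined

def bill(tokens_per_attempt: int, attempts: int) -> float:
    """Provider revenue for one task, billed per token regardless of outcome."""
    return tokens_per_attempt * attempts * PRICE_PER_1K_TOKENS / 1000

reliable = bill(tokens_per_attempt=8_000, attempts=1)  # succeeds first try
flaky = bill(tokens_per_attempt=8_000, attempts=4)     # fails three times first

print(f"reliable agent: ${reliable:.2f}")  # $0.08
print(f"flaky agent:    ${flaky:.2f}")     # $0.32 -- 4x the revenue for failure
```

Until pricing is tied to verified task completion rather than raw token throughput, the provider’s revenue-maximizing agent is not the customer’s most reliable one.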
These dynamics have given ammunition to prominent skeptics. Gary Marcus argues that the market has overestimated the trajectory toward artificial general intelligence, using hype to mask the fundamental limitations of large language models. Sequoia Capital’s David Cahn frames the challenge mathematically: based on Nvidia’s run-rate revenue, data center costs, and required margins, the AI ecosystem must generate $600 billion in annual software revenue to justify the current infrastructural buildout. If those revenue gaps are not closed by substantial productivity gains, the multi-trillion-dollar valuations across the hardware and cloud sectors face a severe correction.
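Cahn’s arithmetic can be reconstructed roughly as follows; the figures are approximations and the doubling steps paraphrase his published reasoning rather than quote it:

```python
# Rough reconstruction of David Cahn's "$600B question" arithmetic.
# Figures are approximate; the steps paraphrase his published argument.

nvidia_run_rate = 150e9  # annualized data-center revenue, approximate

# GPUs are roughly half of total data-center cost; energy, real estate,
# and networking make up the other half.
datacenter_cost = nvidia_run_rate * 2

# Buyers of that compute need roughly 50% gross margin to justify the spend.
required_margin = 0.5

required_ai_revenue = datacenter_cost / required_margin
print(f"${required_ai_revenue / 1e9:.0f}B")  # $600B in annual AI software revenue
```

Every term in this chain is contestable — the cost multiplier, the margin assumption, the run rate itself — which is precisely why the $600 billion figure functions as a question rather than a verdict.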
The case against
And yet, the data does not uniformly support a narrative of terminal collapse. Nvidia’s 2026 earnings shattered expectations, demonstrating that global AI computing demand reflects deeply entrenched corporate and sovereign priorities. Gartner forecasts that worldwide IT spending will exceed $6 trillion for the first time in 2026, driven almost entirely by AI infrastructure integration. Goldman Sachs and BlackRock analysts point out that the current AI landscape differs significantly from the dot-com bubble. That era was dominated by speculative ventures with little or no revenue, whereas today’s leading AI firms generate substantial free cash flow, operate highly profitable existing software businesses, and repurchase large amounts of stock. That financial strength provides a buffer against a complete market collapse.
The story becomes clearer when we break down the enterprise adoption data. While the vast majority of companies remain stuck in what observers call “pilot purgatory,” roughly 6% of enterprises have emerged as what McKinsey terms “AI High Performers.” These organizations achieve a significant return on investment, which stands in stark contrast to the negative returns experienced by their peers. The difference is strategic, not technological. Top performers treat AI as a capital allocation strategy rather than an IT procurement project. They invest in data governance, build custom retrieval-augmented generation pipelines, and redesign internal workflows around specific operational bottlenecks. Their executive leadership drives adoption directly, rather than delegating it to IT departments.
The success of this elite cohort suggests that the foundational technology works, provided it is deployed with structural rigor.
A datacenter bubble, not an AI bubble
As I explored in a previous article on this Substack, I have been skeptical of framing this situation as a straightforward “AI bubble.” The infrastructure buildout raises legitimate questions about overinvestment, but the concerns center more accurately on data center capacity than on the underlying science. The $600 billion question Cahn poses is really a question about whether the physical infrastructure being constructed will find sufficient demand. This is a concern about capital allocation and timing, not about whether AI models themselves are improving. Those are distinct questions, and conflating them obscures more than it clarifies.
What the science actually shows
The strongest evidence against the bubble narrative comes from the research community, where progress on foundational capabilities continues to accelerate.
Consider the problem of agent reliability. A critical flaw in the 2024–2025 development cycle was the industry’s dependence on static accuracy benchmarks. High scores on standardized AI tests, as the OpenClaw incident vividly showed, do not translate into operational reliability in dynamic environments. Researchers at Princeton University addressed this gap in early 2026 with a paper titled “Towards a Science of AI Agent Reliability.” Drawing from safety-critical engineering disciplines like aviation sensor testing and nuclear reactor failure modeling, the Princeton team proposed decomposing agent reliability into four dimensions: consistency (repeatable outcomes under nominal conditions), robustness (graceful degradation under unexpected conditions), predictability (alignment between model confidence and actual accuracy), and safety (bounded harm even during catastrophic failure). Their evaluation of 14 state-of-the-art agentic models confirmed a sobering finding: while raw capabilities have risen steadily, operational reliability remains stagnant. Capability gains do not automatically yield reliability gains. The industry cannot simply scale its way out of unreliability.
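Read as a rubric, the taxonomy implies that reliability is gated by the weakest dimension rather than the average. A hypothetical sketch — the field names mirror the paper’s four dimensions, but the scoring scheme and numbers are my own invention, not the Princeton methodology:

```python
# Hypothetical rubric for the four reliability dimensions. Field names
# follow the paper's taxonomy; the scoring scheme is invented.

from dataclasses import dataclass

@dataclass
class ReliabilityProfile:
    consistency: float     # repeatable outcomes under nominal conditions
    robustness: float      # graceful degradation under unexpected conditions
    predictability: float  # confidence calibrated to actual accuracy
    safety: float          # bounded harm even during catastrophic failure

    def weakest_link(self) -> float:
        """Reliability is gated by the worst dimension, not the average:
        a highly capable agent with poor safety is still unreliable."""
        return min(self.consistency, self.robustness,
                   self.predictability, self.safety)

# A high-capability agent can still score poorly overall:
agent = ReliabilityProfile(consistency=0.9, robustness=0.7,
                           predictability=0.8, safety=0.2)
print(agent.weakest_link())  # 0.2 -- raw capability does not rescue it
```

The min rather than the mean is the point: averaging would let benchmark-style capability scores paper over exactly the failure modes the paper is trying to expose.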
The MAKER system, detailed in a prominent 2026 academic paper, offers one path forward. MAKER achieved the first successful execution of a complex task requiring over one million continuous LLM steps with zero terminal errors. It accomplishes this by abandoning the monolithic agent model entirely, replacing it with what the researchers call Massively Decomposed Agentic Processes. Complex objectives are broken into extreme, highly modular subtasks handled by narrow micro-agents, with a multi-agent voting scheme enacting real-time error correction at every decision node. The lesson is architectural: reliable long-horizon reasoning requires distributed, self-correcting systems rather than single powerful models.
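The voting idea can be modeled abstractly. The toy simulation below is my own sketch, not the MAKER implementation: it compares a single error-prone agent running a long chain of subtasks against per-step majority voting among five redundant micro-agents, each with the same individual error rate.

```python
# Toy model of per-step voting among redundant micro-agents, in the spirit
# of massively decomposed agentic processes. Not the actual MAKER system.

import random
from collections import Counter

random.seed(0)  # deterministic run for reproducibility

def micro_agent(correct_answer: int, error_rate: float) -> int:
    """A narrow agent that returns the right answer most of the time."""
    if random.random() < error_rate:
        return correct_answer + 1  # a wrong but plausible answer
    return correct_answer

def voted_step(correct_answer: int, error_rate: float, voters: int = 5) -> int:
    """Run independent micro-agents on one subtask; take the majority answer."""
    votes = Counter(micro_agent(correct_answer, error_rate)
                    for _ in range(voters))
    return votes.most_common(1)[0][0]

# Over a long chain, per-step voting corrects most individual errors
# before they can compound into a derailed trajectory.
steps = 1_000
single_failures = sum(micro_agent(s, 0.1) != s for s in range(steps))
voted_failures = sum(voted_step(s, 0.1) != s for s in range(steps))
print(single_failures, voted_failures)
```

With a 10% individual error rate, the single agent fails roughly a hundred steps per thousand, while five-way voting cuts failures by an order of magnitude; MAKER pushes the same logic much further, with enough redundancy per step to sustain a million-step chain.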
Meanwhile, the approach to model scaling is becoming more surgical. The ATLAS study, presented at ICLR 2026 by researchers from Google and DeepMind, provides the first rigorous mathematical framework for multilingual model optimization, analyzing 774 training runs across models ranging from 10 million to 8 billion parameters in over 400 languages. This kind of precise, data-efficient scaling represents a departure from the brute-force compute expansion of previous years. It aligns with the emergence of new research laboratories like Flapping Airplanes AI, backed by $180 million from Sequoia, Google Ventures, and Index Ventures, which explicitly prioritizes fundamental algorithmic breakthroughs over cluster scale.
In multimodal generation, the progress is equally striking. OpenAI’s Sora 2 moved beyond the floating objects and physics-defying artifacts of its predecessor by implementing a rebuilt physics engine that natively understands fluid dynamics, gravity, and object weight, achieving 92% kinematic accuracy for complex human movements. New models from various companies, including Google’s Veo 3.1, Kuaishou’s Kling 3.0, ByteDance’s SeeDance 2.0, and Alibaba’s open-source Wan 2.6, are now designed to jointly handle visual frames and audio waveforms. This integrated approach lets them generate dialogue and ambient sound natively synchronized with the visuals, directly from text prompts. And in 3D generation, tools like Nvidia’s LATTE3D and commercial platforms such as Meshy AI and Tripo now deliver topologically sound, production-ready meshes from text prompts in under twenty seconds, solving the “soup-like” geometry that made earlier outputs useless for professional workflows.
Where this leaves us
What strikes me most about the current moment is how poorly it maps onto a simple burst-or-boom narrative. The evidence supports something more nuanced and, frankly, more interesting: a K-shaped divergence in which the superficial application layer burns off while foundational capabilities continue to advance.
The hundreds of billions spent on generalized chatbot wrappers and poorly integrated pilot programs have largely evaporated. That correction is real, painful, and entirely warranted. Architectures built on unconstrained autonomy, lacking persistent memory frameworks and bounded safety metrics, are inherently unstable. When subjected to real-world complexity, phenomena like context compaction trigger what amounts to systemic amnesia. The 95% enterprise failure rate reflects this structural deficit.
Yet viewing this application-layer collapse as a total industry failure misreads the situation. The trajectory of foundational research is clearly accelerating. A 2026 Carnegie Mellon study on AI-assisted music generation captures the nuance well: generative models drastically increased production speed, but the resulting compositions were measurably less creative and novel than unassisted human work. These systems have developed into exceptional engines of synthesis and rapid reproduction, but they do not yet generate genuine conceptual novelty without continuous human direction.
Educators, in particular, should recognize this difference. The technology is not collapsing; it is maturing violently. The speculative froth is burning away, and what remains will be more capable, more reliable, and more deeply integrated into professional and academic workflows. AI’s ability to reshape education radically is a foregone conclusion. The question is whether we engage with that disruption thoughtfully, understanding both what these systems can and cannot do, or whether we wait for the next wave to wash over us while we are still debating whether the last one was real.
The images in this article were generated with Nano Banana 2.
P.S. I believe transparency builds the trust that AI detection systems fail to enforce. That’s why I’ve published an ethics and AI disclosure statement, which outlines how I integrate AI tools into my intellectual work.