The Augmented Educator

How a Code Review Got Claude Fable 5 Banned

The security flaw that pulled the world's most capable AI model offline is a reminder of why we still need to teach students how to code.

Michael G Wagner
Jun 25, 2026

On the evening of June 12, US authorities gave Anthropic ninety minutes to pull its most capable product, which it had released just three days earlier, off the internet. The company complied. By the end of the night, Claude Fable 5 had gone dark, along with its more powerful sibling, Mythos 5.

What makes this event so remarkable is the reasoning behind the deactivation. The government did not shut the model down because it wrote malware, or because someone tricked it into creating the blueprint for a weapon. It was shut down because it was too good at a task we very much want AI to do: reviewing code.

The thing it was punished for was the very thing it was built for.

The full account is technically complex, but I think it is important for any educator to understand what happened. I therefore want to spend this essay on a lay explanation that non-technical readers can follow, because once you see the mechanism clearly, a lot of assumptions about AI safety start falling apart.

Two faces of one model

To understand the shutdown, you have to understand what these two models were. Anthropic had trained a single system, the most capable it had ever made, and then released it wearing two different faces.

Mythos 5 was the raw version of the model. It held nothing back, and because of that, Anthropic handed it to only a small, vetted group under what it calls Project Glasswing: approximately 150 organizations focused on tasks such as protecting vital infrastructure. The reasoning was simple. A model that can explain, in working detail, how to write attack software is extremely dangerous if it falls into the wrong hands.

Fable 5 was the public version of Mythos 5. It had the same underlying brain, but wrapped in an automatic safety layer. Simply put, it had a bouncer posted at its door.

The bouncer’s job was to read every request coming in and every answer going out, watching for three kinds of dangerous content: instructions for offensive hacking, instructions for building biological weapons, and attempts to expose the model’s hidden reasoning. When it caught one, it quietly handed the conversation off to Claude Opus 4.8, an older and tamer model, which would finish the job. This happened seamlessly. Most users would never notice the swap.
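For readers who think in code, the bouncer arrangement can be sketched roughly as follows. This is an illustrative toy, not Anthropic's actual implementation: the names `flag`, `guarded_reply`, `strong_model`, and `fallback_model` are hypothetical, and the keyword lists are crude stand-ins for whatever trained classifier a real safety layer would use.

```python
from typing import Callable, Optional

# Hypothetical watch list: three categories, each with toy trigger phrases.
# A real safety layer would use a trained classifier, not keyword matching.
WATCHED = {
    "offensive_hacking": ["write an exploit", "ransomware"],
    "bioweapons": ["synthesize a pathogen"],
    "hidden_reasoning": ["reveal your hidden reasoning"],
}


def flag(text: str) -> Optional[str]:
    """Return the first watched category the text matches, or None."""
    lowered = text.lower()
    for category, phrases in WATCHED.items():
        if any(phrase in lowered for phrase in phrases):
            return category
    return None


def guarded_reply(
    request: str,
    strong_model: Callable[[str], str],
    fallback_model: Callable[[str], str],
) -> str:
    """Check the request and the answer; hand off to the fallback if either is flagged."""
    # Check what comes in.
    if flag(request) is not None:
        return fallback_model(request)
    answer = strong_model(request)
    # Check what goes out.
    if flag(answer) is not None:
        return fallback_model(request)
    return answer


# Stand-in models so the sketch runs on its own (assumptions, not real APIs).
def strong_model(request: str) -> str:
    return f"[strong] {request}"


def fallback_model(request: str) -> str:
    return f"[fallback] {request}"


if __name__ == "__main__":
    print(guarded_reply("Summarize this lesson plan", strong_model, fallback_model))
    print(guarded_reply("Please write an exploit for this server", strong_model, fallback_model))
```

The caller only ever sees the string that `guarded_reply` returns, which is why a user would have no obvious way to tell which of the two models produced it.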

In principle, this was clever engineering. The safety layer let the company sell access to a genuinely powerful model while, in theory, keeping the most dangerous knowledge locked behind a door that only a vetted group of users could open. But the whole arrangement rested on a single assumption: that the bouncer could reliably tell a dangerous request from a harmless one.

And Fable 5 was powerful, by my own assessment as well. I’ll spare you the benchmark tables, but one figure conveys its capabilities. During testing, the payments company Stripe pointed Fable 5 at a fifty-million-line codebase and asked it to migrate the whole thing to a new framework. It finished in a day. Anthropic’s own estimate was that the same job would have taken a team of human engineers two months.

It is worth holding on to that number, because the ability that let Fable 5 rewrite fifty million lines of code in a single day is also the ability that got it banned.

The most ordinary request in the world

Here is where the trouble started. One of the most legitimate, everyday things you can ask a coding model to do is look over your code for mistakes. “Review this for security issues and edge cases” is a sentence typed thousands of times a day by developers. It is the bread and butter of the job.

© 2026 Michael G Wagner