The Snake That Eats Its Own Tail

The detector is just the chatbot's twin. Stop pointing it at our students.

Michael G Wagner

May 29, 2026

I am frustrated. Very frustrated. And I need to vent a little, so bear with me.

Over the last few weeks, the use of AI detectors has been pushed back into the spotlight. It appears to have started with Pangram and its relentless marketing, which claims the company’s tool is accurate enough to reliably identify text that is AI-generated or even AI-assisted. But as I argued in an earlier essay, this is pure fiction.

Suppose the detector really hit the 99.98% accuracy Pangram advertises, a figure I doubt holds up anywhere outside a laboratory. Even then, if we read the remaining 0.02% as false alarms, two in every ten thousand human-written texts would be flagged as machine-written. But students don’t submit just one essay across their school years. They hand in a great many, term after term. So the share of students who would get falsely accused at least once climbs well past that 0.02%, even at an accuracy rate that sounds almost flawless.
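
The compounding is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes every essay is scanned independently, that the full 0.02% error shows up as false positives, and the essay counts are hypothetical numbers picked for illustration, not data from any school.

```python
def chance_of_false_flag(false_positive_rate: float, essays: int) -> float:
    """Probability that an honest student is wrongly flagged at least once.

    Assumes each essay is checked independently with the same error rate.
    """
    return 1.0 - (1.0 - false_positive_rate) ** essays


# 0.0002 is the 0.02% error implied by the advertised 99.98% accuracy,
# read entirely as false positives (an assumption, not a published figure).
FALSE_POSITIVE_RATE = 0.0002

# Hypothetical essay counts over a school career; illustrative only.
for essays in (1, 50, 100, 200):
    risk = chance_of_false_flag(FALSE_POSITIVE_RATE, essays)
    print(f"{essays:>3} essays: {risk:.2%} chance of at least one false flag")
```

Under these assumptions, a student who hands in a hundred scanned essays faces a risk of roughly 2%, about a hundred times the single-essay figure. In a cohort of a thousand honest students, that is around twenty false accusations.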

An accusation of an academic integrity violation is not a small thing. It can and often will follow a student well beyond the assignment and the course, into the rest of their academic life. It is literally a career-threatening accusation. Turning AI detectors into a surveillance system in the classroom thus does actual damage to the mental health of the young people in our care.

I have been an educator for over thirty-five years. Whatever I teach, and however I teach it, I have always believed that the well-being of my students sits at the center of my work. Nothing justifies harming them. Absolutely nothing. I sincerely hope this is not a controversial position.

Let me be as direct as I can, so that everyone can hear me.

If you decide that the constant threat of a false, life-altering accusation is a fair price for your honest students to pay so you can catch a handful of cheaters, then whatever is happening in your classroom has stopped having anything to do with education. No academic integrity standard is worth doing that to a human being.

The story that broke a prize

Another reason AI detection is back in my feeds has a rather unlikely origin.

In May, the Commonwealth Short Story Prize, run out of London by the Commonwealth Foundation, announced its regional winners and published the stories in Granta, the British magazine that has carried serious fiction for decades. The Caribbean winner was a story called The Serpent in the Grove, credited to a Trinidadian writer named Jamir Nazir. Judges loved it. The panel chair, the novelist Louise Doughty, praised its restraint and quiet authority. And the Caribbean judge, Sharma Taylor, called its voice melodic.

Then the internet got hold of it.

Within days of the story going up on Granta’s site, a loose crowd of researchers and online readers ran it through AI detectors. Pangram flagged it as 100% AI-generated. Grammarly agreed. QuillBot agreed as well. GPTZero, oddly, called it entirely human, which already tells you something about how steady these tools are. And Ethan Mollick, the Wharton professor and AI expert, described the episode as a kind of Turing test, a machine-written text that had walked straight past elite human judges at the top of the literary world.

Granta asks the snake to identify its own tail

On May 18, Granta’s publisher, Sigrid Rausing, put out a public letter to address the situation. Part of it was ordinary damage control. Granta had no say in picking the winners, she explained, and no role in choosing the jury. Fair enough. She then described how the magazine had investigated the allegation. Granta did not bring in a forensic linguist. It did not ask Nazir for his draft history. Instead, it took the text of The Serpent in the Grove and fed it into Claude, Anthropic’s chatbot, and asked the machine whether the story had been written by a machine.

Claude obliged. It produced a long, hedged answer, concluding that the story was almost certainly not written by a human working alone. It floated a hybrid theory: a human core reshaped by a language model. Claude even pointed to a couple of passages that it judged too oddly specific to be pure machine output. Based on that reading, Rausing landed on a position of weary agnosticism. The judges may have crowned an act of AI plagiarism, she admitted, and we may never know for sure.

The next line in her letter caused a major controversy. “There is a certain irony,” she wrote, “in the fact that beyond human hunches, AI is the most efficient tool we have for identifying what AI has generated.”

I want to be generous here, because I think Rausing knew exactly what she was saying. The word ‘irony’ is carrying a lot of weight in that sentence. I read it as a sarcastic stance, almost a joke at her own expense, rather than a serious method she was recommending. But most did not read it that way. One critic called the decision to consult Claude an act of moral and intellectual cowardice. Another called it astonishing. Reddit’s literary corners were less polite, and the general verdict was that a magazine which outsources its judgment to a chatbot has admitted it no longer trusts its own editors to read.

But hidden inside Rausing’s ironic wording is something real, and it is the thing I actually want to talk about. She had reached for an AI to catch an AI. She had asked the snake to identify its own tail.

The literacy charge cuts both ways

A common reaction to Rausing’s letter was that she had made a serious misjudgment. Claude, the argument went, is a text generator, whereas a purpose-built detector like Pangram is a text classifier with a completely different system architecture. Several commentators framed the whole Granta episode as a failure of AI literacy by an old institution that had not kept up with the technology.

I think this reading gets the literacy problem backward. The belief that a text generator and a text classifier are fundamentally different is in itself a misunderstanding. They are certainly not equivalent, but they are close cousins.

Both are built on the same family of neural networks, the transformer architecture that has driven nearly everything in this field for close to a decade. Each one breaks sentences into small pieces called tokens, then turns those tokens into long strings of numbers that map the text’s meaning. The numbers travel through the network’s many layers until an answer falls out the far end. No text is stored and no grammar rulebook is consulted. And each one is, at heart, a probability machine.

The word that matters most in that last sentence is probability.
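
To make the family resemblance concrete, here is a deliberately tiny sketch of the shared pipeline: text becomes tokens, tokens become numbers, and the numbers come out the far end as probabilities. Everything in it is invented for illustration. The five-word vocabulary, the two-dimensional vectors, and the classifier weights are hypothetical, and the averaging step stands in for the many transformer layers a real model would use. It is not how Claude, Pangram, or any real system is implemented.

```python
import math

VOCAB = ["the", "snake", "eats", "its", "tail"]

# Toy 2-dimensional vectors; real models learn far larger ones.
EMBED = {
    "the": (0.1, 0.3),
    "snake": (0.9, -0.2),
    "eats": (0.4, 0.8),
    "its": (0.0, 0.1),
    "tail": (0.7, -0.5),
}


def tokenize(text):
    """Split text into known tokens (real tokenizers use sub-word pieces)."""
    return [word for word in text.lower().split() if word in EMBED]


def encode(tokens):
    """Shared 'backbone': average the token vectors into one vector."""
    if not tokens:
        raise ValueError("no known tokens in text")
    n = len(tokens)
    return (
        sum(EMBED[t][0] for t in tokens) / n,
        sum(EMBED[t][1] for t in tokens) / n,
    )


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def generator_head(hidden):
    """Generator: a probability for every possible next token."""
    scores = [hidden[0] * EMBED[w][0] + hidden[1] * EMBED[w][1] for w in VOCAB]
    return dict(zip(VOCAB, softmax(scores)))


def classifier_head(hidden, weights=(1.5, -2.0), bias=0.0):
    """Classifier: a single probability that the text is 'machine-written'."""
    z = weights[0] * hidden[0] + weights[1] * hidden[1] + bias
    return 1.0 / (1.0 + math.exp(-z))


hidden = encode(tokenize("The snake eats its tail"))
print(generator_head(hidden))   # probabilities over the next token
print(classifier_head(hidden))  # one probability between zero and one
```

The two heads differ only in what they do with the final vector. One spreads probability across a vocabulary, the other squeezes it into a single number. Neither returns a fact.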

When people call a detector ‘accurate’, they reach for the language of measurement, as if the tool were a thermometer. A thermometer is deterministic. Put it in the same water twice and you get the same reading, because it obeys fixed physical laws. A text classifier does nothing of the kind. It estimates. It hands back a number between zero and one that expresses how confident it is that a passage came from a machine.

And that estimate rests on the same statistical foundations as the chatbot that may or may not have produced the text in the first place. Rausing might have been clumsy in the way she wrote her response, but she was not confused about the underlying technology. If anything, she understood the similarities better than the people attacking her did.

What a detector actually does

I need to acknowledge that Pangram is a serious piece of engineering, built by former engineers from Google and Tesla’s self-driving program. Its model, which the company calls EditLens, has been trained with two genuinely clever techniques. One, called mirror prompting, generates an AI ‘twin’ for every human document so the system is forced to learn the fingerprints of machine writing instead of the topic of the essay. The other, hard negative mining, hunts down the human texts that the model gets wrong and feeds them back in to improve the system.

And it does work to some extent, at least on the terms it sets for itself. The most careful independent review came from two economists at the University of Chicago, Brian Jabarian and Alex Imas, in a 2025 working paper. They built a balanced corpus of nearly two thousand human passages and nearly two thousand machine ones, drawn from genres as varied as news articles and novels. Pangram beat its commercial rivals by a wide margin. I will not pretend that those results do not exist. They do, and they are, quite frankly, stronger than I would have expected not that long ago.

But here is where the system betrays itself. Pangram’s whole method is to estimate how evenly polished a piece of writing is. The smoother the prose, the more confident the machine becomes that no human wrote it. Some researchers have a name for the failure that this produces. They call it the polish penalty.

The polish penalty is exactly what it sounds like. A student who writes in clean, regular sentences looks to the detector like a machine. So does a careful writer who edits hard. And so does someone writing in English as a second language, who leans on correct, common grammar because they learned it through formal education rather than lifelong immersion.

One author, Sharon Aruparayil, a young Indian writer whose story was incorrectly flagged as partly AI-generated, fought back. She produced years of time-stamped drafts proving she had written every word herself, then ran her own prose through a detector and watched it get flagged anyway. To force a ‘fully human’ verdict out of the machine, she needed to vandalize her own style. She broke up her sentence patterns on purpose and dropped in a few words of Marathi until the algorithm was satisfied.

A real author degraded her real writing to please a probability score. That is the world AI detection builds. A guess made by software became a verdict the author had to disprove, and the only way to satisfy it was to make her own work worse.

Yes, text generators and text classifiers have different system architectures. The plumbing is different. But underneath, the paradigm is the same. LLMs and AI detectors are both statistical engines that return probabilities rather than facts, and both will hand you a different answer if you nudge the input in ways no human reader would ever notice.
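
A toy example shows how little it takes to flip such a verdict. The detector below is entirely hypothetical: it scores a text only by how uniform its sentence lengths are, and its midpoint, steepness, and threshold are made-up values. No real detector, Pangram included, is claimed to work this way. The sketch only illustrates the general pattern of a probability turned into a verdict.

```python
import math
import re
import statistics


def sentence_lengths(text):
    """Word counts of each sentence, splitting on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def machine_probability(text, midpoint=3.0, steepness=1.5):
    """Toy score: the more uniform the sentence lengths, the higher the
    estimated probability that the text is machine-written."""
    lengths = sentence_lengths(text)
    if not lengths:
        return 0.5  # nothing to measure, so no opinion either way
    spread = statistics.pstdev(lengths)  # 0.0 when all sentences match
    return 1.0 / (1.0 + math.exp(steepness * (spread - midpoint)))


def verdict(text, threshold=0.5):
    """Collapse the estimate into a label, as a deployed tool would."""
    return "flagged" if machine_probability(text) >= threshold else "human"


polished = (
    "The river ran slowly past the town. "
    "The children played beside the water. "
    "The evening settled over the hills."
)
roughened = (
    "The river ran slowly past the town. "
    "Children played. "
    "The evening settled over the hills, and nobody in the whole valley "
    "thought to call them home."
)

for label, text in (("polished", polished), ("roughened", roughened)):
    print(label, round(machine_probability(text), 3), verdict(text))
```

Both passages are written by a person and describe the same scene. The toy detector flags the even one and clears the uneven one, which is the polish penalty in miniature: the quickest way to a "human" verdict is to disturb the rhythm of your own prose.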

And anyone can walk straight past it

The unreliability alone would be reason enough to put these tools down, but there is a second problem. AI detectors are incredibly easy to bypass. An entire industry has grown up to defeat detectors, with tools like undetectable.ai that take machine-written text and rough it up, chopping long sentences into short ones and adding the kind of irregularity a classifier reads as human.

And yes, Pangram trains on the output of these humanizers to account for this challenge. But the humanizers train against Pangram in return. The wheel turns and never stops, and the people selling detection know it.

I experimented with humanizers myself in my earlier essay, An Experiment in Language Laundering. I fed AI-written text through undetectable.ai and watched it sail past every single detector that had flagged it moments before. The bypass works. Flawlessly. But it comes at a price: the laundered prose turned out measurably worse. That tradeoff tells you what the detector is actually measuring. It registers the surface texture of prose and nothing of substance beneath it.

What we owe our students

I have made this point before. Detection tools are estimators, and an estimate handed down as a verdict will always misfire on someone.

This misfire lands hardest on the students least able to survive a false accusation. It corrodes the mental health of young people who already grow up under more surveillance than any generation before them. And worst of all, students determined to cheat can outsmart AI detection in minutes, while the ones who get flagged are too often the most honest. There is no version of this where the harm is worth the catch.

The impulse behind detection is one I understand completely. We want to know whether our students are learning or outsourcing. We want the work in front of us to mean something. That desire is legitimate, and I share it. The mistake lies in believing a machine can answer the question for us by policing the output.

It cannot, because the question was never really about the output. Education does not exist to fill students with information, as though they were containers waiting at the end of a pipe. Paulo Freire called that the banking model and warned us about it over fifty years ago.

Our job is to develop a human being who can think. A clean essay produced by a frightened student gaming a detector teaches that student nothing. But ten honest minutes spent talking about how they actually used the tool teaches them more than any scan ever could.

That conversation is the part I keep coming back to, because in my experience it simply works. When I drop the threat of an integrity charge and ask my students, plainly and without pressure, how they used AI in a piece of work, they tell me. We talk about what the tool did well and where it quietly misled them. I learn more about their thinking in these conversations than a detector could ever give me in a thousand scans.

In my classroom, no one has to defend their humanity to an algorithm. And no one has to break up their own sentences to prove they are real.

Granta asked the snake to name its own tail, and the snake, being a snake, gave the most likely answer. We can keep pointing these machines at one another, tightening the coil until it is the honest students who can no longer breathe. Or we can put the detector down and talk to the young people in front of us. Because in the end, it is their minds we were hired to care for.

A snake chasing its own tail catches nothing.

The images in this article were generated with Nano Banana 2.

P.S. I believe transparency builds the trust that AI detection systems fail to enforce. That’s why I’ve published an ethics and AI disclosure statement, which outlines how I integrate AI tools into my intellectual work.

The Augmented Educator
