The Paradox of AI Detection
Why We Cannot Use Probabilistic Tools to Police Probabilistic Systems
This post follows my standard early access schedule: paid subscribers today, free for everyone on December 30.
Since I began arguing that AI detection is a technological escalation of a problem that requires pedagogical solutions, I have received a steady stream of messages. Some come from educators genuinely wrestling with how to identify AI-generated work in their classrooms. Others arrive as thinly disguised marketing pitches: “Have you tried [Product X]? It’s 99% accurate.” Each message, whether sincere or commercial, rests on the same premise: that somewhere out there is a detection tool capable of reliably distinguishing human from machine text.
I understand the appeal of this hope. The emergence of large language models has upended familiar assessment practices, and the desire for a technological fix is natural. But my resistance to AI detection tools is not simply pragmatic skepticism about current products. It stems from something much more fundamental: a theoretical impossibility rooted in the very nature of the systems we are trying to detect.
Large language models are probabilistic systems. They generate text through controlled randomness, selecting each word based on statistical likelihoods rather than fixed rules. This non-deterministic nature is precisely what allows them to produce fluid, varied, human-like writing. Yet this same characteristic creates an insurmountable problem for detection. Any tool designed to identify the output of a probabilistic system must itself operate probabilistically. And here lies the fatal contradiction: in high-stakes educational contexts, where a false accusation can permanently damage a student’s academic career, probabilistic detection is fundamentally inadequate.
It is important to emphasize that this is not a problem waiting for a technological solution. It is a theoretical limit as binding as the laws of mathematics themselves.
When Systems Roll the Dice: Understanding Non-Determinism
To grasp why detection fails, we must first understand what makes large language models distinct from earlier text-processing technologies. When you run a traditional plagiarism checker such as Turnitin’s original text-matching tool, you engage a deterministic system. Give it the same document twice, and it produces identical results. It compares strings of text against a database, looking for exact or near-exact matches. The process is mechanical, predictable, and reproducible.
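To see this determinism in miniature, consider a toy version of the matching step. The function below is deliberately simplified (real checkers handle near-matches with fuzzier techniques), but it behaves the same way on every run:

```python
def plagiarism_check(document: str, database: list[str]) -> list[str]:
    """Deterministic matching: flag every database passage that appears
    verbatim in the document. Same input, same output, every time."""
    return [passage for passage in database if passage in document]

sources = ["the mitochondria is the powerhouse of the cell"]
essay = "As we all know, the mitochondria is the powerhouse of the cell."
# Two runs, two identical results: the hallmark of a deterministic system.
print(plagiarism_check(essay, sources))
print(plagiarism_check(essay, sources))
```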
Large language models operate on entirely different principles. They do not retrieve pre-written answers from a database. Instead, they predict the next word in a sequence based on statistical patterns learned from vast training data. At each step in generating text, the model calculates a probability distribution across its entire vocabulary. After the word “sky,” the model might assign “is” a probability of 40%, “above” 25%, “blue” 15%, and thousands of other possibilities with decreasing likelihood.
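Here is a minimal sketch of that prediction-and-selection step, using the toy distribution above; the probabilities are illustrative, not drawn from any real model:

```python
import random

# Toy distribution for the word after "sky" -- illustrative numbers only.
# "<other>" stands in for the thousands of low-probability alternatives.
next_word_probs = {"is": 0.40, "above": 0.25, "blue": 0.15, "<other>": 0.20}

def sample_next_word(probs: dict[str, float]) -> str:
    """Draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[word] for word in words]
    return random.choices(words, weights=weights, k=1)[0]

# Five draws from the same distribution can yield five different words.
print([sample_next_word(next_word_probs) for _ in range(5)])
```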
If the model simply chose the highest-probability option at each step, its output would be repetitive and mechanical. To produce varied, creative text, it introduces controlled randomness through parameters like temperature and nucleus sampling. Temperature flattens or sharpens the probability distribution: low temperature produces conservative, predictable text, while high temperature yields more creative variation. Nucleus sampling restricts the model’s choices at each step to the smallest set of words whose combined probability crosses a threshold, cutting off the unlikely tail while preserving meaningful diversity.
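The sketch below shows one simplified way temperature and nucleus sampling can be implemented; the vocabulary, logits, and parameter values are invented for illustration:

```python
import math
import random

def softmax(logits: list[float]) -> list[float]:
    """Convert raw scores (logits) into probabilities that sum to 1."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample(logits: list[float], vocab: list[str],
           temperature: float = 1.0, top_p: float = 1.0) -> str:
    """Sample one word using temperature scaling and nucleus (top-p) filtering."""
    # Temperature: dividing the logits sharpens (<1) or flattens (>1)
    # the resulting probability distribution.
    probs = softmax([x / temperature for x in logits])
    # Nucleus sampling: keep the smallest set of words whose cumulative
    # probability reaches top_p, then sample from that set alone.
    ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
    nucleus, cumulative = [], 0.0
    for word, p in ranked:
        nucleus.append((word, p))
        cumulative += p
        if cumulative >= top_p:
            break
    words, weights = zip(*nucleus)
    return random.choices(words, weights=weights, k=1)[0]

vocab = ["is", "above", "blue", "falls"]
logits = [2.0, 1.5, 1.0, 0.2]
print(sample(logits, vocab, temperature=0.3, top_p=0.9))  # conservative
print(sample(logits, vocab, temperature=1.5, top_p=0.9))  # more varied
```

Run the last two lines repeatedly and the results differ from call to call, which is precisely the behavior the next paragraph describes.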
The consequences for detection are profound. A student can generate ten different essays from the same prompt simply by adjusting these parameters or clicking “regenerate.” There is no single “AI signature” to detect. The statistical fingerprint of the text shifts with each generation, creating a moving target that no fixed detection algorithm can reliably track.
Traditional plagiarism detection worked because it sought exact matches in a deterministic system. By contrast, AI detection seeks patterns in a system designed specifically to vary its patterns.
The Mechanics of Detection: Reading Statistical Tea Leaves
AI detectors typically rely on two primary metrics: perplexity and burstiness. Understanding these measures reveals both how detection works and why it fails.
Perplexity measures how surprising a piece of text is to a language model. Low perplexity means the text is predictable. Each word follows naturally from what came before according to the model’s training. High perplexity indicates unpredictability: unusual word choices, creative leaps, or contextual knowledge the model lacks. Detectors operate on the assumption that AI-generated text exhibits low perplexity (because the AI chose statistically probable words) while human text shows higher perplexity (because humans don’t always select the most likely next word).
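In back-of-the-envelope form, assuming we already know the probability the model assigned to each word that actually appears in the text (the numbers below are invented), the calculation looks like this:

```python
import math

def perplexity(token_probs: list[float]) -> float:
    """Perplexity = exp of the average negative log-probability per token."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# A model rating its own likely output: high probabilities, low perplexity.
print(perplexity([0.6, 0.5, 0.7, 0.4]))    # ~1.9
# Surprising human word choices: low probabilities, high perplexity.
print(perplexity([0.05, 0.2, 0.01, 0.1]))  # ~17.8
```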
Burstiness measures the variation in sentence structure and complexity throughout a document. AI-generated text often displays uniform sentence patterns, such as similar lengths, consistent grammatical structures, or predictable rhythm. Human writers naturally vary their cadence, mixing short declarative sentences with longer complex constructions. Detectors flag low burstiness as potentially artificial.
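One crude proxy a detector might use is the variation in sentence length; real systems combine many richer features, but this sketch captures the intuition:

```python
import re
import statistics

def burstiness(text: str) -> float:
    """A simple burstiness proxy: the coefficient of variation of
    sentence lengths (standard deviation divided by mean)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.stdev(lengths) / statistics.mean(lengths)

uniform = ("The model writes a sentence. The model writes a sentence. "
           "The model writes a sentence.")
varied = ("Short. Then a much longer sentence that wanders through several "
          "clauses before stopping. Done.")
print(burstiness(uniform))  # 0.0 -- no variation, flagged as machine-like
print(burstiness(varied))   # well above zero -- human-like variation
```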
Some detection systems also use watermarking, a more sophisticated approach that embeds hidden statistical signals during text generation. The model partitions its vocabulary into “green list” and “red list” words, typically using a cryptographic hash of the preceding tokens, then preferentially selects green-list words. To a human reader, the text appears normal. To a detector that holds the watermarking key, an improbably high frequency of green-list words reveals the watermark’s presence.
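The sketch below follows the general shape of that scheme, heavily simplified: it hashes only the previous word, omits the secret key a real system would use, and shows only the detection side (a real generator would also bias its sampling toward green-list words during generation):

```python
import hashlib

GREEN_FRACTION = 0.5  # fraction of the vocabulary placed on the green list

def is_green(prev_word: str, word: str) -> bool:
    """Pseudo-randomly assign a word to the green list, seeded by the
    previous word. Real schemes hash more context with a secret key."""
    digest = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION

def green_ratio(words: list[str]) -> float:
    """Detector side: the fraction of words that landed on the green list.
    Unwatermarked text hovers near GREEN_FRACTION; watermarked text,
    whose generator favored green words, sits improbably far above it."""
    hits = sum(is_green(prev, word) for prev, word in zip(words, words[1:]))
    return hits / (len(words) - 1)

print(green_ratio("the sky above the city was the color of television".split()))
```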
This brief summary understates the technical complexity involved, but the core principle remains: all detectors search for statistical patterns they associate with machine generation. Whether analyzing perplexity, burstiness, watermark frequencies, or something else, they make probabilistic judgments about authorship based on textual features.
The Theoretical Trap: Why Detection Must Fail
Here we reach the heart of the problem. The theoretical limits constraining AI detection are not temporary obstacles that better engineering will overcome. They are fundamental, insurmountable properties of the systems involved.