An Experiment in Language Laundering
What Humanizers Do to the Text We Ask Students to Submit
There is a small, unlikely confession I need to make: I have recently started to appreciate AI detectors.
This is not, I should clarify, because I think they are reliable tools for identifying AI-generated text. I have written at length in previous essays about why they are not, and nothing in my recent experience has changed that assessment. GPTZero, the detector I currently use, is considered one of the more accurate options on the market, but that is a relative distinction in a field defined by inaccuracy. GPTZero is still highly inaccurate, merely less so than the rest. It still cannot do what most of its users believe it can: reliably distinguish machine-written text from human prose.
But to my surprise, I found that detectors are genuinely useful for understanding why certain text reads as machine-generated. Not as instruments of surveillance, but as a diagnostic mirror that reflects back the stylistic fingerprints of AI-produced prose. Used this way, a detector lets writers see what the machine default actually looks like and measure their own writing against it. And that, I believe, is pedagogically valuable.
The conversation that started this
The catalyst for my newfound appreciation for AI detectors was an exchange on LinkedIn. An educator shared her approach to such tools in the classroom. She did not use them for grading, she explained, but to show students why their writing appeared AI-generated. The pedagogical instinct was sound. The implementation, however, was not.
She checked students’ work only when she already “suspected” AI use.
This is where the logic collapses. The belief that experienced readers can reliably distinguish AI-generated text from human writing is one of the most persistent and most thoroughly debunked assumptions in the current academic integrity landscape. I have seen polished, stylistically distinctive human prose flagged as AI-generated, and I have seen AI output that reads with enough idiosyncrasy to pass as human. The idea that a teacher’s intuition can serve as a reliable preliminary filter is not just unsupported by evidence. It is contradicted by it.
The stakes compound the problem. An academic integrity accusation is not a minor inconvenience. It can result in course failure or even expulsion. When the decision to investigate rests on a subjective “sense” that something feels machine-written, we are subjecting individual students to severe consequences based on nothing more than an instructor’s pattern-matching instinct. The pressure this places on the accused is immense, even if the accusation is not factored into the grade. And it falls disproportionately on students whose natural writing style is formal, structured, or non-idiomatic, including, as research has repeatedly shown, non-native English speakers.
The implication is straightforward. If AI detectors are going to be used at all, they must be used indiscriminately. Every student’s work gets checked, or nobody’s does. Selective application based on suspicion is not a pedagogical strategy. It is profiling.
But this LinkedIn exchange did more than sharpen my objections to selective detection. It prompted me to ask a question I had so far neglected: what actually happens when someone tries to make AI-generated text undetectable?
The experiment
I am always open about my own AI use. I rely on Gemini and Claude for research and writing assistance, and I disclose this publicly, because I strongly believe that transparency about AI use is essential. But I rarely bother running my work through detectors, because I know what the result would be. My standard drafting workflow produces text GPTZero would almost certainly flag as entirely AI-generated. This does not concern me, because I am not forced to submit my work under an ill-advised academic integrity policy. And in my opinion, the origin of the text, whether AI-assisted or not, is secondary to the originality of the ideas and the quality of the editorial judgment applied.
Still, I was curious. If a student wanted to take AI-generated text and convert it into something that passed as human-written, how difficult would that be? And what would be lost in the process?
I chose an unpublished draft essay of mine about the Axios supply chain hack as my test case. The essay is roughly 2,500 words of primarily technical analysis, the kind of evidence-heavy prose that AI detectors flag most aggressively. I ran it through GPTZero to confirm my expectations: 100% AI-generated, as predicted.
Then I fed it into Undetectable.ai, which is considered one of the best AI humanizers currently available. The premise of these tools is simple: they take a text that reads like AI output and rework it until detectors classify it as human. They are, in the most literal sense, language-laundering services. You feed in text stained with signs of machine origin, and the tool scrubs it until the provenance is no longer detectable.
The humanizer worked. The output came back from GPTZero rated as 100% human-written.
It also destroyed my essay.
I want to show you what this looks like in practice, because the scale of the damage is difficult to convey in the abstract. Here is a single paragraph from my original Claude-assisted draft, followed by what Undetectable.ai did to it:
The attackers, a subgroup of the Lazarus Group tracked by Google’s Threat Intelligence Group as UNC1069, did not exploit a flaw in the code itself. They targeted the human maintainer. How the attackers gained access to his account remains, as of this writing, undetermined; the maintainer himself has stated publicly that he had multi-factor authentication enabled on virtually everything. UNC1069 is known to employ AI-generated deepfake video and voice synthesis in its social engineering operations, but the specific vector used in this case has not been confirmed. What is confirmed is the result: the attackers seized control of the account and published two backdoored versions of Axios to the NPM package registry, injecting a hidden malicious dependency that silently installed a cross-platform Remote Access Trojan called WAVESHAPER.V2 on every system that downloaded or updated the library during a three-hour window. The malware granted the attackers full remote control of infected machines: the ability to execute commands, steal credentials, exfiltrate files, and map internal networks.
According to GPTZero, this text is fully generated by AI. This is not really accurate, as I did heavy manual editing, but that is a different story. And this is the result after running the entire essay through Undetectable.ai:
The people who attacked Axios didn’t hack the code. They hacked the human maintaining the code. We don’t know how the attackers gained access to the maintainer’s account, although he says he has multi-factor authentication on almost everything. What we do know is that the attackers used his account to release two versions of Axios to the NPM package registry, using a hidden dependency to install a cross-platform Remote Access Trojan in the packages released over the course of about three hours. The malware, dubbed WAVESHAPER.V2 by Google’s Threat Intelligence Group (and attributed to the Lazarus Group, which Google calls UNC1069), allows the attackers to issue commands, steal credentials, steal data, and map out the attacker’s internal network.
GPTZero determined this version to be 100% human.
But read that last line again. The humanizer changed “map internal networks” to “map out the attacker’s internal network.” A Remote Access Trojan’s sole purpose is to gain access to and survey the victim’s network. The humanized version says the malware maps the attacker’s own network, which reverses the meaning entirely. In a 2,500-word essay about a cybersecurity incident, this is not a stylistic issue. It is the type of error that would destroy the author’s credibility with any technically literate reader.
That was not the only damage to this single paragraph. The humanizer also dropped a critical detail about UNC1069’s known use of AI-generated deepfake video and voice synthesis, information the original included to foreshadow a key argument later in the essay. And it collapsed the distinction between UNC1069, the subgroup tracked by Google’s Threat Intelligence Group, and the Lazarus Group itself, while simultaneously misattributing the WAVESHAPER.V2 designation. Three errors in a single paragraph, each one sufficient to undermine the essay’s authority on its own.
Twenty-six ways to ruin an essay
I sat down and catalogued every meaningful error the humanizer introduced across the full 2,500-word essay. The final count was twenty-six distinct changes that altered, distorted, or reversed the meaning of the original text. They fell into patterns that reveal exactly what humanizers do to prose.
The factual fabrications were the most alarming. The humanizer invented a source called “Enterprise Information Security Services (EISS)” that does not exist, fabricating an attribution for a claim the original had sourced more carefully. It referred to “the attack on OpenRedirectory,” a name that appears nowhere in my draft because it is not real. It attributed an observation to “Tom Wills of Microsoft,” a person who does not appear in my original text and who, as far as I can determine, does not exist. And it introduced a completely unrelated software tool into a technical discussion where it made no sense.
And then there was my personal favorite: the humanizer expanded the acronym LLM to “Language Loops Model Language.” Every educator reading this Substack has encountered the term “Large Language Model” thousands of times in the past two years. The humanizer got it wrong.
These are not minor inconsistencies. They are substantial hallucinations. The humanizer, in its effort to make text sound more human, invented facts.
The audience shifts were subtler but no less corrosive. My essay was addressed to educators. The humanizer systematically replaced this audience with information security professionals in several key passages, undermining the entire rhetorical framework. And it narrowed “all educational institutions” to “Universities and Colleges,” even though the surrounding paragraphs discuss K-12 districts extensively.
The humanizer also rewrote my intended tone. My original text described teachers who “did not enter the profession to teach cybersecurity” and “feel underqualified to address these topics,” a sympathetic framing. The humanizer rewrote this as teachers who “don’t want to teach cybersecurity,” implying active resistance rather than a reasonable mismatch between training and expectation. And my carefully constructed closing argument was replaced with “We are at war,” a militaristic escalation entirely absent from the original.
The cumulative effect was devastating. The essay that emerged from the humanizer bore my argument’s general shape but almost none of its precision. It was, in every measurable sense, worse: less accurate, less coherent, less persuasive, and less trustworthy. The laundering had removed the traces of AI origin, but it had also stripped away the substance.
How to pass anyway
I was not finished. I wanted to know whether the damage could be undone without re-triggering the detector. So I took the humanized text and my catalogue of its twenty-six errors, fed them both back into Claude, and asked it to correct the errors while making the minimum changes necessary to preserve the humanized text’s “human” character.
Here is the result for that same paragraph:
The people who attacked Axios didn’t hack the code. They hacked the human maintaining the code. We don’t know how the attackers gained access to the maintainer’s account, although he says he has multi-factor authentication on almost everything. The group is known for using AI-generated deepfake video and synthetic voices in their social engineering playbook, but the specific method they used against the Axios maintainer has not been confirmed. What we do know is that the attackers used his account to release two backdoored versions of Axios to the NPM package registry, injecting a hidden malicious dependency that silently installed a cross-platform Remote Access Trojan called WAVESHAPER.V2 on every system that downloaded or updated the library during a three-hour window. The malware, attributed to a Lazarus Group subunit that Google’s Threat Intelligence Group tracks as UNC1069, allows the attackers to issue commands, steal credentials, steal data, and map out the victim’s internal network.
GPTZero now identified the essay as 7% AI, 5% mixed, and 88% human. Not perfectly “human,” but human enough. Eighty-eight percent would sail through any institutional check. A student submitting this version would face no scrutiny, no accusation, and no integrity hearing. The laundering was complete.
The entire process, from original draft to humanized version to error-corrected final, took me less than an hour. A student with moderate technical literacy could manage it in about the same time, perhaps less, once they had practiced the workflow. The barrier to bypassing detection is not high. It is trivially low.
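For the technically curious, here is a rough sketch of that workflow as a script. Treat it as an illustration, not a recipe: the GPTZero endpoint and response fields are assumptions drawn from its public API documentation, the humanizer step is a placeholder because I used Undetectable.ai’s web interface by hand, and the Claude model name is a stand-in for whichever model you have access to.

```python
# A sketch of the three-step pipeline: detect, humanize, repair.
# Assumptions: GPTZero's endpoint and response fields match its public API
# docs; the Anthropic Python SDK is installed and ANTHROPIC_API_KEY is set.
import requests
import anthropic

GPTZERO_URL = "https://api.gptzero.me/v2/predict/text"  # assumed endpoint


def detector_score(text: str, api_key: str) -> float:
    """Return GPTZero's document-level AI probability (0.0 to 1.0)."""
    resp = requests.post(
        GPTZERO_URL,
        headers={"x-api-key": api_key},
        json={"document": text},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["documents"][0]["completely_generated_prob"]


def humanize(text: str) -> str:
    """Placeholder: in my experiment this step was Undetectable.ai's web UI."""
    raise NotImplementedError("paste the text into a humanizer by hand")


def repair(humanized: str, error_catalogue: str) -> str:
    """Ask Claude to fix the catalogued errors with minimal stylistic change."""
    client = anthropic.Anthropic()
    msg = client.messages.create(
        model="claude-sonnet-4-5",  # stand-in; use whatever model you have
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": (
                "Correct the errors listed below in the essay that follows, "
                "changing as little of the surrounding wording as possible.\n\n"
                f"ERRORS:\n{error_catalogue}\n\nESSAY:\n{humanized}"
            ),
        }],
    )
    return msg.content[0].text
```

The point of spelling this out is to make the scale of the problem concrete: every step is a single function call or a paste into a web form.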
The incentive trap
I chose the title metaphor for a reason. Money laundering does not create value; it obscures provenance, at a cost. Language laundering operates on the same principle. The humanizer does not improve text. It degrades it, systematically, in order to remove the statistical signatures that detectors rely on. What emerges is not better writing. It is writing that has been deliberately made less precise, less coherent, and less reliable so that it can appear more human.
Think about what this means for the students we claim to be educating. When institutions deploy AI detectors as enforcement mechanisms, they create a straightforward incentive. Students who use AI will seek ways to avoid detection. The most accessible method is a humanizer. But the humanizer, as my experiment shows, introduces errors, fabricates sources, reverses meanings, and strips nuance. The student submits the humanized text. If it passes the detector, the student receives credit for work that is objectively worse than what the AI originally produced, and dramatically worse than what the student might have written through genuine engagement with the material.
We are not catching cheaters. We are incentivizing them to cheat less competently.
To be fair, I do need to acknowledge the counterargument: the mere presence of a detector might deter some students from using AI at all, and deterrence has value even when imperfect. I do not dismiss this reasoning. But the cost of the deterrence approach, measured in false accusations and the active degradation of submitted work, is not hypothetical. My experiment makes those costs visible. The question is whether the deterrent effect justifies them. I do not believe it does.
A better use for the mirror
The educator on LinkedIn had the right instinct buried within the wrong method. AI detectors are genuinely useful as pedagogical instruments, not as policing tools. The distinction is structural, and it changes everything about how these tools interact with student learning.
Imagine an assignment where each student uses a tool such as GPTZero on their own writing, not for a pass/fail grade on integrity, but to understand the reasons behind the detector’s scoring. What patterns does the detector associate with AI-generated text? Where does the student’s own prose converge with those patterns, and where does it diverge? And what does it mean, stylistically and rhetorically, that a passage “reads like AI”?
This approach transforms the detector from a policing mechanism into a tool for rhetorical awareness. Students learn to recognize the default patterns of machine-generated prose. They can then make conscious decisions about their own style. They might choose to diverge from those patterns, cultivating a more distinctive voice. Or they might recognize that certain structural features of AI-generated text, such as clear topic sentences, logical progression, and explicit signposting, are genuinely useful and worth incorporating deliberately. None of this requires an integrity hearing. It requires a well-designed assignment.
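As a concrete starting point, the assignment can be built around a short script that surfaces the sentences a detector scores as most machine-like, so the class discussion focuses on why those passages read that way. A minimal sketch, assuming GPTZero’s public API and its documented response fields (verify both against the current docs):

```python
# The mirror exercise: show a writer which of their own sentences the
# detector scores as most machine-like. Endpoint and field names are
# assumptions based on GPTZero's public API; check the current docs.
import requests


def sentence_report(text: str, api_key: str, top_n: int = 5) -> None:
    resp = requests.post(
        "https://api.gptzero.me/v2/predict/text",  # assumed endpoint
        headers={"x-api-key": api_key},
        json={"document": text},
        timeout=30,
    )
    resp.raise_for_status()
    doc = resp.json()["documents"][0]
    print(f"Overall AI probability: {doc['completely_generated_prob']:.0%}\n")
    # Rank sentences by per-sentence probability so discussion starts with
    # the passages that most resemble the machine default.
    ranked = sorted(doc["sentences"], key=lambda s: s["generated_prob"], reverse=True)
    for s in ranked[:top_n]:
        print(f"{s['generated_prob']:.0%}  {s['sentence']}")
```

The numbers themselves are not the lesson. The lesson is the conversation that follows: what, stylistically, do the highest-scoring sentences have in common?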
When we use detectors for policing, students learn to launder. When we use them as mirrors, students learn to write.
The images in this article were generated with Nano Banana 2.
P.S. I believe transparency builds the trust that AI detection systems fail to enforce. That’s why I’ve published an ethics and AI disclosure statement, which outlines how I integrate AI tools into my intellectual work.