Students Performing Human
Keystroke Surveillance and the Rise of Algorithmic Theater in Education
This post follows my standard early access schedule: paid subscribers today, free for everyone on May 19.
The first generation of AI detection tools promised schools a clean solution to a messy problem. Feed in a student essay, receive a verdict: human or machine. The technology relied on statistical proxies, measuring how predictable a text’s word choices were (perplexity) and how much its sentence length and structure varied (burstiness). From these metrics, the algorithms attempted to draw a line between human cognition and synthetic generation. It was an attractive proposition. It was also catastrophically unreliable.
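To make those two proxies concrete, here is a minimal Python sketch. The burstiness measure follows the common working definition, variation in sentence length relative to the mean; the perplexity function is a toy unigram version, since real detectors score text against a large language model rather than against the text itself. Function names and thresholds are mine for illustration, not any vendor's.

```python
import math
import statistics
from collections import Counter

def burstiness(text):
    """Variation in sentence length: sample std dev / mean of words per sentence.
    Detectors treat low burstiness (uniform sentence lengths) as machine-like."""
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

def unigram_perplexity(text):
    """Toy perplexity under a unigram model fit on the text itself.
    Real detectors use a large language model; this only illustrates
    the 'how predictable is each word' intuition."""
    words = text.lower().split()
    counts = Counter(words)
    n = len(words)
    log_prob = sum(math.log(counts[w] / n) for w in words)
    return math.exp(-log_prob / n)
```

The fragility is visible even at this scale: a writer whose sentences are uniformly structured, as structured exam prose often is, scores low on burstiness through no fault of their own.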
I have traced this failure across several chapters of my book The Detection Deception, serialized here on this Substack, but the core findings bear restating because the problems plaguing these tools were, and remain, substantial. Standard AI detectors flagged genuine essays by non-native English speakers as AI-generated at rates exceeding 61%. In extreme cases involving structured writing for standardized assessments like the TOEFL, human-authored texts were falsely flagged nearly 98% of the time. MIT, Yale, and UC Berkeley banned or deactivated these tools. OpenAI quietly shut down its own AI classifier. The clear algorithmic boundary between human and machine writing turned out to be a fiction, and institutions that had staked their integrity policies on that fiction found themselves exposed.
The industry’s response to this collapse was predictable. Rather than reconsider whether automated authorship verification belongs in a classroom at all, the educational technology sector moved the surveillance upstream. If you cannot reliably determine who wrote the final product, then simply watch the writer produce it. Monitor the keystrokes. Track the pauses. Record the paste events. Then replay the entire drafting process. But watching someone type does not prove that they are thinking. The problem has simply migrated from the essay to the essayist.
This shift is happening quickly and with remarkably little critical scrutiny. Tools like Turnitin Clarity, GPTZero’s Chrome Extension, Grammarly Authorship, and Copyleaks have moved from analyzing what students submit to monitoring how they compose. The marketing language is “transparency” and “visibility into the writing process.” The operational reality, however, is continuous behavioral surveillance of the act of thinking itself.
Watching writers
The market for writing process trackers is broader and more technically varied than many educators realize. A reasonably complete list of tools, with detailed descriptions of their capabilities and limitations, appears in the appendix. These tools are worth mapping before evaluating them, because while they differ in important ways, they converge on the same underlying logic once deployed in schools.
Turnitin Clarity is the dominant institutional product. Named to TIME’s Best Inventions list in 2025 and rolled out as a paid add-on to Turnitin Feedback Studio, Clarity provides what Turnitin calls a “one-stop composition space” integrated directly into a school’s learning management system. Students draft their assignments within Clarity’s environment, and the platform records everything: pasted text volume, total writing time, construction time, AI chat logs, and a full video playback of the drafting process from the first keystroke to the final submission. Instructors can scrub through this recording, watching text appear and disappear, flagging paste events and moments of rapid generation.
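Stripped of the video layer, what these platforms capture reduces to a timestamped edit log. The sketch below uses a hypothetical schema and a hypothetical paste-ratio heuristic (no vendor publishes its actual format or thresholds); it only shows the shape of the data and the kind of automated judgment built on top of it.

```python
from dataclasses import dataclass

@dataclass
class EditEvent:
    """One recorded edit. Hypothetical schema; vendors do not publish theirs."""
    timestamp: float  # seconds since the drafting session started
    kind: str         # "type", "delete", or "paste"
    chars: int        # number of characters affected

def flag_paste_heavy(events, threshold=0.3):
    """Flag a session when pasted characters exceed `threshold` of all
    inserted text -- the kind of heuristic a process tracker might apply.
    Note what it cannot tell you: whether the pasted text was the student's
    own earlier draft, a quotation, or machine output."""
    typed = sum(e.chars for e in events if e.kind == "type")
    pasted = sum(e.chars for e in events if e.kind == "paste")
    total = typed + pasted
    return total > 0 and pasted / total > threshold
```

A session that pastes in a block of previously drafted text trips the same flag as one that pastes in chatbot output, which is precisely the interpretive gap the surveillance framing papers over.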
Turnitin’s own documentation describes this as giving educators “full visibility into the student writing process.” The framing is pedagogical. The mechanism, however, is surveillance.
Other vendors have followed suit, and the market is booming. GPTZero’s Chrome Extension takes a similar approach inside the browser, overlaying Google Docs with process tracking. It logs copy-paste events, revision duration, and collaborative edits, then packages the results as shareable PDF reports. Copyleaks blends static AI detection with real-time monitoring, displaying heat maps of phrases it considers characteristically AI-generated as the student types. Draftback, the simplest of the group, offers chronological playback of Google Docs edit histories. And Grammarly Authorship, which positions itself more as a provenance tool than a disciplinary one, labels text by source, distinguishes typing from pasting, and provides color-coded version histories.
Then there are the professional tools, designed for an entirely different context. OKhuman, built for newsrooms and freelancers, uses microphone inputs to capture the acoustic signature of physical keystrokes, cross-referencing that audio with operating system activity to generate a cryptographic “Made by Human” stamp. And Chain of Creation, developed by Human Intelligence, analyzes the metadata “plume” of creative activity while refusing to ingest the actual text, protecting intellectual property from being scraped into training datasets.
These tools differ from their educational counterparts in one decisive respect. The writer chooses to use them.
The pedagogical roots are real
Process tracking did not emerge out of nowhere. Writing studies has been interested in composition processes since at least the 1960s, when scholars like Janet Emig, Peter Elbow, and Donald Murray argued that educators should teach the process of writing rather than evaluate the final product alone.