Students Performing Human
Keystroke Surveillance and the Rise of Algorithmic Theater in Education
The first generation of AI detection tools promised schools a clean solution to a messy problem. Feed in a student essay, receive a verdict: human or machine. The technology relied on statistical proxies, measuring how predictable a text’s word choices were (perplexity) and how much its sentence length and structure varied (burstiness). From these metrics, the algorithms attempted to draw a line between human cognition and synthetic generation. It was an attractive proposition. It was also catastrophically unreliable.
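To make those two proxies concrete, here is a toy sketch of how they can be computed. It is an illustration under loud assumptions, not any vendor's actual method: commercial detectors score text against large neural language models, whereas this version uses a smoothed unigram word-frequency model, and it treats burstiness as the coefficient of variation of sentence length. The function names `unigram_perplexity` and `burstiness` are hypothetical.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def unigram_perplexity(text, reference):
    """Perplexity of `text` under an add-one-smoothed unigram model of `reference`.

    Assumption: a unigram model stands in for the neural language models
    that real detectors use. Lower values mean more predictable word choices.
    """
    ref_counts = Counter(tokenize(reference))
    total = sum(ref_counts.values())
    vocab_size = len(ref_counts) + 1  # one extra slot shared by all unseen words
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no word tokens")
    log_prob = sum(
        math.log((ref_counts[t] + 1) / (total + vocab_size)) for t in tokens
    )
    return math.exp(-log_prob / len(tokens))


def burstiness(text):
    """Coefficient of variation of sentence lengths, measured in words.

    Assumption: this is one simple way to quantify variation in sentence
    length. A value of 0.0 means every sentence has the same length.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(tokenize(s)) for s in sentences]
    lengths = [n for n in lengths if n > 0]
    if len(lengths) < 2:
        return 0.0
    return pstdev(lengths) / mean(lengths)
```

Even in this toy form the weakness is visible. A writer who uses common vocabulary and evenly sized sentences, as many language learners are taught to do, scores as predictable and uniform, which is exactly the profile these metrics associate with machine text.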
I have traced this failure across several chapters of my book The Detection Deception, serialized here on this Substack, but the core findings bear restating because the problems plaguing these tools were, and remain, substantial. Standard AI detectors flagged genuine essays by non-native English speakers as AI-generated at rates exceeding 61%. In extreme cases involving structured writing for standardized assessments like the TOEFL, human-authored texts were falsely flagged nearly 98% of the time. MIT, Yale, and UC Berkeley banned or deactivated these tools. OpenAI quietly shut down its own AI classifier. The clear algorithmic boundary between human and machine writing turned out to be a fiction, and institutions that had staked their integrity policies on that fiction found themselves exposed.
The industry’s response to this collapse was predictable. Rather than reconsider whether automated authorship verification belongs in a classroom at all, the educational technology sector moved the surveillance upstream. If you cannot reliably determine who wrote the final product, then simply watch the writer produce it. Monitor the keystrokes. Track the pauses. Record the paste events. Then replay the entire drafting process. But watching someone type does not prove that they are thinking. The problem has simply migrated from the essay to the essayist.
This shift is happening quickly and with remarkably little critical scrutiny. Tools like Turnitin Clarity, GPTZero’s Chrome Extension, Grammarly Authorship, and Copyleaks have moved from analyzing what students submit to monitoring how they compose. The marketed language is “transparency” and “visibility into the writing process.” The operational reality, however, is continuous behavioral surveillance of the act of thinking itself.
Watching writers
The market for writing process trackers is broader and more technically varied than many educators realize. A reasonably complete list of tools with detailed descriptions of their capabilities and limitations appears in the appendix. These tools are worth mapping before evaluating, because while they differ in important ways, they converge on the same underlying logic when deployed in schools.
Turnitin Clarity is the dominant institutional product. Named to TIME’s Best Inventions list in 2025 and rolled out as a paid add-on to Turnitin Feedback Studio, Clarity provides what Turnitin calls a “one-stop composition space” integrated directly into a school’s learning management system. Students draft their assignments within Clarity’s environment, and the platform records everything: pasted text volume, total writing time, construction time, AI chat logs, and a full video playback of the drafting process from the first keystroke to the final submission. Instructors can scrub through this recording, watching text appear and disappear, flagging paste events and moments of rapid generation.
Turnitin’s own documentation describes this as giving educators “full visibility into the student writing process.” The framing is pedagogical. The mechanism, however, is surveillance.
Other vendors have followed suit, and the market is booming. GPTZero’s Chrome Extension takes a similar approach, overlaying Google Docs with process tracking. It logs copy-paste events, revision duration, and collaborative edits, then packages the results as shareable PDF reports. Copyleaks blends static AI detection with real-time monitoring, displaying heat maps of phrases it considers characteristically AI-generated as the student types. Draftback, the simplest of the group, offers chronological playback of Google Docs edit histories. And Grammarly Authorship, which positions itself more as a provenance tool than a disciplinary one, labels text by source, distinguishes typing from pasting, and provides color-coded version histories.
Then there are the professional tools, designed for an entirely different context. OKhuman, built for newsrooms and freelancers, uses microphone inputs to capture the acoustic signature of physical keystrokes, cross-referencing that audio with operating system activity to generate a cryptographic “Made by Human” stamp. And Chain of Creation, developed by Human Intelligence, analyzes the metadata “plume” of creative activity while refusing to ingest the actual text, protecting intellectual property from being scraped into training datasets.
These tools differ from their educational counterparts in one decisive respect. The writer chooses to use them.
The pedagogical roots are real
Process tracking did not emerge out of nowhere. Writing studies has been interested in composition processes since at least the 1960s, when scholars like Janet Emig, Peter Elbow, and Donald Murray argued educators should teach the process of writing rather than evaluate the final product alone.
Their methods for observing that process were deliberately low tech and student-centered: think-aloud protocols, peer workshops, physical journals, and reflective portfolios. The writer remained in control of the disclosure. Proponents of today’s digital tools argue, with some justification, that keystroke logging and version history playback represent the technological realization of that same philosophy. If an instructor can see where a student pauses, struggles, and abandons ideas, they can offer formative feedback targeted to the student’s developmental needs. But the distance between a reflective portfolio and a keystroke log is vast.
One invites a student to narrate their own thinking. The other records it without asking.
Keystroke logging has generated genuine insights in writing research, particularly around how second-language writers navigate drafting differently from native speakers. But as I argued in my earlier essay “Proof of Work: The Radical Act of Showing Your Mess,” the pedagogical value of making the writing process visible depends entirely on whether students document their own thinking voluntarily.
Why surveillance is the wrong model for education
A student choosing to document their process and an institution requiring that every keystroke be logged are fundamentally different situations. When that difference is ignored, the consequences for learning are severe.
Writing becomes performance. Once students know that their typing speed, pause duration, and revision patterns will be scrutinized by an algorithm, the cognitive task splits in two. They are no longer engaged solely in synthesizing ideas. They are simultaneously performing “authentic human writing behavior” for an observer. This is the phenomenon I call algorithmic theater: composition reduced to the appearance of composition.
A student who prefers to draft by hand or think for long stretches before writing in rapid bursts must now slowly retype their own work into the surveilled environment, generating a keystroke log that looks sufficiently human. As I explored in “An Experiment in Language Laundering,” AI detectors already push students toward worse writing through misaligned incentives. Process trackers extend that same logic to the writing process itself, rewarding legible compliance over genuine thinking.
Anxiety displaces learning. Psychological research on social facilitation shows that being observed degrades performance on complex cognitive tasks. Under process tracking, students know that an erratic typing speed, an extended pause, or the pasting of a paragraph from a personal notebook might trigger an integrity investigation. That awareness introduces a secondary cognitive load that competes with the primary task of intellectual synthesis.
Students have reported deliberately introducing typographical errors and syntactic awkwardness to ensure their work appears adequately “human” to the algorithm. This is not integrity. It is theater. And it produces writing that is objectively worse than what students would generate without the surveillance.
The tools discriminate by design. Every process tracker operates on an implicit model of what “normal” human writing looks like: a steady accumulation of typed characters with naturalistic pauses and moderate revisions. That model encodes a neurotypical, native-English-speaking baseline and penalizes everyone who deviates from it.
I have repeatedly made the point that non-native English speakers already face disproportionate false-positive rates from static AI detectors. Process trackers deepen that penalty. The more a student’s English proficiency improves and their grammatical structure tightens, the more likely they are to be flagged. Neurodivergent students face a parallel problem. Those with ADHD may exhibit long pauses followed by rapid bursts of composition, and those with autism may display an unusually consistent typing rhythm. Both patterns trigger temporal anomalies in tracking systems.
The problem deepens for students who rely on assistive technology. Voice-to-text tools like Dragon, for instance, produce text that appears to process trackers as instantaneous large-block paste events with zero construction time. The system flags these events exactly as it would flag AI-generated content. Yet these are students whose right to use assistive technology is guaranteed by federal law. Forcing them to continuously disclose their disability and defend their reliance on legally mandated accommodations against an algorithm’s suspicion is humiliating and potentially unlawful.
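The collision is easy to see in a minimal sketch. Everything in it is hypothetical: the `InsertEvent` record, the `flag_bulk_insertions` function, and the 200-character threshold are assumptions made for illustration, not the logic of any named product. The point is only that a rule keyed to how much text arrives at once cannot see where that text came from.

```python
from dataclasses import dataclass


@dataclass
class InsertEvent:
    seconds_since_start: float
    chars_added: int
    # Ground truth about the text's origin. A real tracker cannot observe
    # this; it is included only so the example can show what gets flagged.
    source: str


# Hypothetical cutoff: any single insertion this large is treated as a paste.
PASTE_THRESHOLD_CHARS = 200


def flag_bulk_insertions(events, threshold=PASTE_THRESHOLD_CHARS):
    """Return every event that adds at least `threshold` characters at once."""
    return [e for e in events if e.chars_added >= threshold]


session = [
    InsertEvent(1.0, 1, "typed"),
    InsertEvent(1.2, 1, "typed"),
    InsertEvent(40.0, 450, "dictated with voice-to-text"),
    InsertEvent(95.0, 450, "pasted from a chatbot"),
]

for event in flag_bulk_insertions(session):
    print(f"flagged at {event.seconds_since_start}s: {event.source}")
```

Both 450-character insertions are flagged, and nothing in the recorded data separates the dictated paragraph from the pasted one. Telling them apart requires the student to explain, which means disclosing the accommodation.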
As I argued in “The Surveillance Impasse,” the detection model systematically discriminates against the populations it should protect most. Process tracking deepens that discrimination. When a system treats assistive technology use or neurodivergent composing rhythms as suspicious, it has ceased to function as an integrity tool.
The tools themselves, however, are not the problem. The coercive context is.
When authorship verification makes sense
Outside the classroom, in the professional writing market, the logic changes entirely. The Authors Guild launched its “Human Authored” certification program in beta in January 2025 and expanded it to all U.S.-published authors in March 2026. Over 3,000 authors have certified over 5,000 titles, affixing a trademarked seal to their book covers.
The reason is economic. Up to 95% of clients now ask freelance writers for proof that their work is human-generated. Writers potentially face withheld payments or lost contracts when third-party AI detectors falsely flag their work. For a freelance copywriter whose income depends on satisfying search engine algorithms that increasingly penalize synthetic content, a verifiable log of the writing process is a competitive advantage.
This is where tools like OKhuman and Chain of Creation find their purpose. Rather than embedding themselves in managed institutional platforms, they let writers generate cryptographic proof of human authorship while keeping their text entirely off company servers. OKhuman verifies the physical act of typing; Chain of Creation certifies human origin through provenance metadata, analyzing the “plume” of creative activity without ever ingesting the manuscript. The architecture of these tools is designed around the writer’s autonomy: the author controls the data, owns the proof, and can walk away at any time.
In each case, the writer opts in. Similarly, an independent author using Grammarly Authorship to generate a revision history report for a client is making a voluntary economic choice. If the tool disrupts their workflow, they can uninstall it or find alternatives.
A university student mandated to compose exclusively within Turnitin Clarity’s tracked environment, on pain of a failing grade, has no such agency. The same technology becomes profoundly coercive when applied across that power asymmetry.
What educators should do instead
The pattern here will be familiar to regular readers. In “A History of Academic Dishonesty,” I traced how every previous disruption to academic integrity—from essay mills to internet plagiarism—prompted the same institutional reflex: build higher walls rather than stronger foundations. The pivot from static AI detection to keystroke surveillance is the latest iteration of that reflex. It will fail for the same reasons its predecessors did.
The alternative is pedagogical redesign grounded in dialogic education, an approach I explored in The Detection Deception. Oral defenses and metacognitive reflection make AI substitution difficult because they require the student to be genuinely present in the exchange. In my essay “Proof of Work,” I extended this logic further, proposing that students borrow from digital artists and become voluntary documentarians of their own thinking. The principle is the same: authentic evidence of learning starts with the student.
Some of these tools, in the right context, serve legitimate purposes. Professional writers navigating an economy saturated with synthetic content deserve mechanisms to certify their authorship. But when educators adopt these same tools as shortcuts to academic integrity, they trade trust for monitoring and learning for compliance.
If education responds to generative AI by watching students more closely instead of teaching them more thoughtfully, it will have protected neither integrity nor learning. The performance will be flawless. The theater will be empty.
Appendix: Authorship verification tools
Turnitin Clarity is a paid add-on to Turnitin Feedback Studio, targeted at K-12 and higher education institutions. It provides a dedicated writing environment within the school’s learning management system, recording pasted text volume, total writing time, construction time, AI chat interactions, and a full video playback of the drafting process. Instructors can set assignment-specific AI usage policies and review integrity insights alongside similarity and AI writing reports. The platform is cloud-based, with data visible to administrators, and it requires students to compose within its tracked environment, making it the most comprehensive—and most invasive—institutional surveillance tool currently available.
GPTZero Writing Report (Origin) is a Chrome extension that overlays Google Docs with process tracking capabilities, aimed at both the education and publishing markets. It generates a writing replay video that reconstructs the drafting process chronologically, logging the writing activity timeline, largest copy-paste events, average revision duration, and collaborative edits. Reports can be exported as shareable PDFs or web links, and GPTZero claims SOC2 and FERPA compliance. The replay function visualizes how a document was assembled, but the tool’s orientation remains primarily evidentiary, aimed at establishing whether a human produced the work.
Grammarly Authorship serves a broader user base across both academic and professional contexts, positioning itself more as a provenance tool than a disciplinary one. It automatically labels text by source, distinguishing content that was typed from content that was pasted, and tracks active writing sessions, time spent, and specific external sources used. Data is encrypted on-device using AES-256 GCM, with server retention limited to twenty-four hours for processing unless the user generates a shareable link, which extends retention to twelve months. Its tone is less punitive than Turnitin Clarity, but in institutional settings it still normalizes continuous process tracking as a condition of academic trust.
Copyleaks maintains a strong foothold in static AI detection while integrating real-time monitoring through browser extensions and LMS plugins. Its detection engine analyzes frequency ratios, syllable dispersion, and part-of-speech syntax, overlaying results as a heat map of phrases it considers characteristically AI-generated. The system continuously updates its machine learning models against a training corpus of billions of documents. Copyleaks is effective at identifying specific AI-like phrasing patterns, but its blended approach—combining static detection with behavioral monitoring—inherits the false-positive vulnerabilities of both methodologies.
Draftback is a Chrome extension that reconstructs the edit history of Google Docs, allowing instructors to play back the chronological sequence of additions, deletions, and revisions. It is the simplest tool in this category, offering document-level archaeology without the biometric or temporal analytics of its competitors. Draftback operates entirely within the Google ecosystem and has shifted toward a paid subscription model. Its limitations are significant: it captures only what happens inside Google Docs, missing any offline drafting, and its playback provides raw edit data without interpretive analysis.
OKhuman is designed for newsrooms, journalists, and freelance writers who need to certify human authorship for professional purposes. It uses microphone inputs to capture the acoustic signature of physical keystrokes, cross-referencing that audio with operating system data, time spent, and active application monitoring to generate a cryptographic “Made by Human” verification stamp. Audio is processed locally on the writer’s device to filter out background voices, and only metadata and the verification stamp are transmitted to OKhuman’s servers; the actual text is never recorded or stored. The tool is privacy-preserving by design, but its reliance on acoustic monitoring makes it unsuitable for writers who use assistive technologies, silent keyboards, or shared workspaces.
Chain of Creation (Human Intelligence) offers a provenance-based “Proof of Human” certification for professional authors and commercial creators. It analyzes the metadata “plume” of creative activity—biometric identity verification and non-textual markers of human effort—without ever ingesting the actual manuscript into a central database. This non-ingestion architecture is its defining feature: by separating verification telemetry from intellectual property, it ensures that unpublished work cannot be scraped to train competing AI models. The system is aimed at authors who need to certify their work’s human origin while retaining full control over their proprietary content, but its biometric identity requirements raise privacy questions of their own.
The images in this article were generated with Nano Banana 2.
P.S. I believe transparency builds the trust that AI detection systems fail to enforce. That’s why I’ve published an ethics and AI disclosure statement, which outlines how I integrate AI tools into my intellectual work.







