The Surveillance Impasse
The Detection Deception, Chapter 3
Fellow Augmented Educators,
Welcome to week three of ‘The Detection Deception’ book serialization. This week’s installment, ‘The Surveillance Impasse,’ documents the disastrous institutional response to the generative AI revolution, from the panic-driven adoption of flawed detection tools to the unwinnable technological arms race that followed.
Last week’s Chapter 2 covered the long history of academic dishonesty. This week explores what happened when those historical vulnerabilities met the exponential force of AI, creating the pedagogical and technological stalemate we face today.
Thank you for reading along! See you in the comments.
Michael G Wagner (The Augmented Educator)
Contents
Chapter 1: The Castle Built on Sand
Chapter 2: A History of Academic Dishonesty
Chapter 3: The Surveillance Impasse
Chapter 4: Making Thinking Visible
Chapter 5: The Banking Model and Its Automated End
Chapter 6: Knowledge as a Social Symphony
Chapter 7: A Unified Dialogic Pedagogy
Chapter 8: Asynchronous and Embodied Models
Chapter 9: Dialogue Across the Disciplines
Chapter 10: The AI as a Sparring Partner
Chapter 11: Algorithmic Literacy
Chapter 12: From the Classroom to the Institution
Chapter 3: The Surveillance Impasse
In the spring of 2023, educational institutions worldwide found themselves caught in a peculiar form of technological warfare. Companies specializing in AI detection promised algorithmic solutions, whereas evasion tools openly boasted about their ability to bypass these systems. Teachers discovered that the same students who struggled with basic paragraph construction could suddenly produce sophisticated essays that mysteriously passed all detection systems. Meanwhile, honor students watched their carefully crafted work get flagged as artificial while obvious ChatGPT outputs sailed through undetected.
This was not a battle between good and evil, or even between teachers and cheaters. It was something more absurd and more tragic: an entire educational system pouring resources into technologies that canceled each other out, leaving everyone exhausted and nothing resolved. The surveillance infrastructure that was supposed to preserve academic integrity had instead spawned its own opposition, creating a perpetual conflict where the only certainty was that nobody could be certain of anything. What follows is an examination of this technological stalemate, its participants, and the fundamental impossibility of winning a war where every weapon strengthens its counter-weapon, where victory conditions cannot be defined, and where the battlefield itself—human learning and assessment—suffers the real casualties.
The Great Democratization: AI as the Ultimate Accelerant
In November 2022, OpenAI released ChatGPT to the public. Within five days, it had acquired one million users. Within two months, it had reached one hundred million. No consumer technology in history had achieved such rapid adoption. For education, this moment represented not evolution but revolution—a fundamental discontinuity in the long history of academic assessment. The change was so swift, so complete, that many educators reported feeling as though they had gone to sleep in one world and awakened in another. The careful equilibrium between trust and verification, between authentic work and substituted work, between learning and credentialing, shattered overnight.
To understand the magnitude of this disruption, we must first grasp what makes generative AI qualitatively different from all previous forms of academic dishonesty. The cheating economy we have traced, from nineteenth-century essay mills to internet plagiarism, operated within certain fundamental constraints. Someone, somewhere, had to do the actual intellectual work. Whether it was a paid writer in Kenya crafting custom essays or a student copying paragraphs from websites, human intelligence remained essential to the process. Even plagiarism, at its core, involved human judgment about what to copy, how to arrange it, and how to modify it to fit the assignment. Generative AI obliterates this constraint. For the first time in history, machines can produce original, coherent, academically formatted text on any subject without any human intelligence being directly involved in its creation.
The term “democratization” captures something essential about this transformation, though it carries uncomfortable implications when applied to academic dishonesty. What once required significant resources—money for essay mills, time for research, skill for effective plagiarism—now requires almost nothing. A student with a smartphone and a free ChatGPT account can generate a competent essay on virtually any topic in under a minute. The financial barrier that had limited contract cheating to affluent students vanishes. The time investment that made plagiarism laborious disappears. And the skill requirement that at least ensured some engagement with the material evaporates. Every student, regardless of economic background, language proficiency, or academic preparation, now has access to an infinitely patient, remarkably capable writing assistant that can produce passable academic work on demand.
Consider the concrete reality of what this means for a typical undergraduate assignment. A student asked to write a five-page paper analyzing the causes of the French Revolution no longer faces the traditional challenges that such an assignment was designed to address. They don’t need to locate sources—the AI has been trained on thousands of texts about the French Revolution. They don’t need to organize their thoughts—the AI produces perfectly structured essays with clear introductions, body paragraphs, and conclusions. They don’t need to struggle with transitions or topic sentences—the AI handles these mechanical aspects of writing flawlessly. And they don’t even need to understand the topic—they can simply paste the assignment prompt into ChatGPT and receive a complete essay that addresses all the required elements.
The speed of this process defies comprehension for those accustomed to traditional academic work. What once took days or weeks now takes seconds. A student can generate multiple versions of an essay, each with different arguments and evidence, in less time than it would traditionally take to write an outline. They can request revisions, ask for additional paragraphs, demand different stylistic approaches. The AI never tires, never complains, never judges. It’s available at three in the morning the night before the deadline, ready to produce polished prose on demand. The psychological barriers that once prevented many students from cheating—shame, fear, complexity—dissolve in the face of such frictionless capability.
But to focus only on the speed and ease of AI-generated text is to miss the more profound transformation. The shift from a service-based to a self-service model of academic dishonesty fundamentally changes the student’s relationship to their own education. In the old model of contract cheating, the student was essentially a passive consumer. They paid someone else to do their work and received a product in return. The transaction was clear, the ethical violation obvious. The student knew they were cheating because they were explicitly outsourcing their intellectual labor to another human being.
With generative AI, the relationship becomes far more ambiguous and psychologically complex. The student is not passive but active, not consuming but creating—or at least, appearing to create. They craft the prompts, guide the AI’s output, select among variations, perhaps edit and refine the final product. They might spend hours working with the AI, feeling as though they are engaged in genuine intellectual labor. The line between tool use and substitution blurs beyond recognition. Many students report feeling that they are “collaborating” with the AI rather than cheating, that they are using a sophisticated tool rather than avoiding work. This psychological ambiguity makes AI use far more appealing and defensible to students who would never have considered traditional forms of cheating.
The sophistication of AI-generated text presents challenges that previous forms of academic dishonesty never posed. Unlike plagiarized text or purchased essays that often exhibited telltale signs, AI-generated work is designed to be original and stylistically consistent, making definitive proof of authorship nearly impossible without reliable technical tools. We will examine the technical and ethical failures of these detection systems in detail later in this chapter.
The educational response to this disruption has been characterized by what can only be described as institutional panic. The surveys and reports from late 2022 and early 2023 paint a picture of a system in crisis. Teachers reported that cheating had become “off the charts,” describing it as the “worst they’ve seen” in careers spanning decades. Some estimated that a majority of submitted work showed signs of AI assistance. Others threw up their hands entirely, declaring take-home assignments dead and reverting to in-class, handwritten assessments. The carefully constructed edifice of modern education, built on the assumption that students would do their own work, crumbled in real time.
What makes this panic particularly acute is the comprehensive nature of AI’s capabilities. Previous disruptions affected certain types of assignments while leaving others untouched. Plagiarism was mainly a problem for research papers. Contract cheating worked poorly for creative or personal writing. But generative AI excels across virtually every genre of academic writing. It can produce research papers, literary analyses, personal narratives, creative fiction, technical reports, even poetry and dialogue. No form of written assessment remains immune. The strategies that educators had relied upon to escape previous forms of cheating—unusual prompts, personal reflection, creative assignments—all fall equally before AI’s capabilities.
The quantitative leap in scale deserves emphasis. If we accept conservative estimates that traditional contract cheating affected perhaps 3-5% of submitted assignments, the potential scope of AI use represents an increase of an order of magnitude or more. Surveys conducted in 2023 and 2024 found that 30-50% of students admitted to using AI for academic work, with many more likely hiding their use. But even these shocking numbers may underestimate the transformation. Contract cheating was binary; either you purchased a paper or you didn’t. AI use exists on a spectrum from minor assistance to complete substitution. A student might use AI to overcome writer’s block, to polish sentences, to generate ideas, to write entire sections, or to produce complete assignments. Each represents a different degree of substitution, but all undermine the traditional model of individual, unassisted academic work.
The business model transformation from service-based to self-service deserves particular attention for what it reveals about the nature of this disruption. The traditional cheating economy operated on a scarcity model. There were a limited number of capable writers, each could only produce so much work, and their time was valuable. This scarcity created natural limits on the system’s capacity and kept prices high enough to exclude many students. AI operates on an abundance model. The marginal cost of generating additional text is essentially zero. One AI system can simultaneously serve millions of students, producing unlimited variations on any topic, available instantly at minimal or no cost.
This abundance doesn’t just change the economics of cheating; it transforms its sociology. When academic dishonesty required significant money, it reinforced existing inequalities, allowing wealthy students to buy advantages their peers couldn’t afford. AI’s low cost theoretically democratizes these advantages, making them available to all. But this democratization is itself deeply problematic. If everyone can generate competent academic writing with equal ease, then traditional assessment becomes meaningless as a measure of individual capability. The very foundations of meritocratic evaluation, already shaky, collapse entirely.
The comparison with previous technological disruptions in other industries illuminates what’s happening in education. When digital music sharing destroyed the traditional recording industry’s business model, when streaming services eliminated video rental stores, when ride-sharing apps disrupted traditional taxi services, the pattern was similar: new technology eliminated friction, democratized access, and rendered existing business models obsolete virtually overnight. Education is experiencing its Napster moment, its Netflix disruption, its Uber transformation. The difference is that education’s “product”—learning, knowledge, intellectual development—cannot be as easily digitized and distributed as music or movies.
The psychological impact on students deserves careful consideration. Many report a kind of learned helplessness in the face of AI’s capabilities. Why struggle to write when a machine can do it better? Why develop skills that seem obsolete? Why invest effort in work that others are completing with a few keystrokes? This demoralization extends beyond individual assignments to fundamental questions about the value and purpose of education itself. If machines can write, analyze, and argue as well as or better than humans, what is the point of learning to do these things?
For educators, the psychological toll is equally severe. Many report feeling that their entire professional practice has been invalidated overnight. The assignments they’ve refined over years of teaching, the skills they’ve dedicated their careers to developing in students, the standards they’ve used to evaluate work, all seem suddenly obsolete. The social contract between teacher and student, already strained by previous disruptions, feels completely broken. How can one teach writing when students can generate perfect prose without learning to write? How can one assign essays when there’s no way to verify their authorship? How can one maintain academic standards when the very concept of individual academic work has ceased to be meaningful?
The international dimension adds another layer of complexity. While ChatGPT launched globally, different educational systems have responded in radically different ways. Some countries immediately banned AI tools in educational settings. Others embraced them as learning aids. This creates a situation where students in different parts of the world, sometimes taking the same online courses or competing for the same opportunities, operate under completely different rules and expectations. A student in Singapore might be prohibited from using AI while their counterpart in Sweden is encouraged to do so. The globalization of education collides with the localization of AI policy, creating confusion and inequity.
The proliferation of AI models beyond ChatGPT compounds the challenge. By 2024, students could choose among dozens of sophisticated language models—Claude, Gemini, Llama, and countless others—each with different capabilities and characteristics. This diversity makes detection even more difficult, as tools trained to identify text from one model may fail to recognize output from another. It also creates a kind of optimization problem for sophisticated cheaters, who can select the model least likely to be detected for any given assignment. The AI landscape evolves so rapidly that any institutional response is obsolete before it can be fully implemented.
As we confront the collapse of the traditional assessment system, the metaphor of the castle built on sand takes on new meaning. It wasn’t that the castle was poorly constructed or that its builders were naive. Rather, the very ground on which it stood, the assumption that producing academic text required human intelligence and effort, has liquefied. The tide hasn’t just come in; the sea level has risen catastrophically, submerging landmarks we used to navigate by. The question facing education is not how to rebuild the same castle on the same sand, but whether castles and sand are even the right metaphors anymore. The democratization of text generation through AI hasn’t just accelerated academic dishonesty; it has revealed that our entire model of education was built on assumptions that no longer hold. The reckoning that the pre-digital cheating economy foreshadowed and the internet plagiarism panic postponed has finally arrived, and it cannot be addressed through detection software or honor codes or any amount of institutional hand-wringing. It requires nothing less than a fundamental reimagining of what education is for and how we might achieve it in a world where machines can think, or at least appear to think, as well as humans.
The Illusion of Control: Selling Certainty in an Uncertain World
The email arrived at 11:47 PM on a Tuesday. Daphney, a sophomore majoring in public health, was reviewing notes for an upcoming exam when the notification appeared: “Academic Integrity Concern - Immediate Response Required.” Her stomach dropped before she even opened it. The message was clinical, bureaucratic in its efficiency: Turnitin’s AI detection system had flagged her recent essay on epidemiological methods as “87% likely AI-generated.” She was to report to the Academic Standards Office within 48 hours. Failure to appear would be considered an admission of guilt.
Daphney’s case is fictional, but it is hardly exceptional. Across the country, in dorm rooms and library study spaces, similar scenes are playing out with increasing frequency. A high school student might discover that the historical analysis she had labored over for weeks, complete with handwritten outlines and multiple drafts carefully preserved in Google Docs, had been condemned by an algorithm as artificial. Or a graduate student in engineering could find his theoretical framework chapter, the product of months of reading and synthesis, dismissed as machine-generated. The pattern is always the same: accusation by algorithm, presumption of guilt, and a desperate scramble to prove human authorship of one’s own thoughts.
These are symptoms of a larger institutional response to the ChatGPT revolution that can best be understood as a form of organizational panic. When OpenAI released its large language model to the public in November 2022, educational institutions worldwide experienced what psychological researchers would recognize as an acute stress response. The familiar became foreign overnight. Assessment practices refined over decades suddenly seemed as obsolete as teaching students to use a slide rule. University administrators, college presidents, and school principals found themselves in emergency meetings, trying to formulate responses to a technology that many of them barely understood.
The rapidity of ChatGPT’s adoption intensified this institutional anxiety. By January 2023, barely two months after its release, surveys indicated that significant percentages of students had already used the tool for academic work. There was no gradual adoption curve, no time for careful policy development or measured response. The technology achieved ubiquity faster than institutions could schedule committee meetings to discuss it. This speed created a temporal mismatch between the pace of technological change and the deliberate, often glacial pace of academic governance. Universities that prided themselves on careful deliberation and shared governance found themselves needing to make immediate decisions with profound implications for teaching and learning.
Into this chaos stepped the AI detection companies, bearing promises of salvation. Their marketing materials from this period read like emergency medical advertisements: “Restore Academic Integrity,” “Protect Your Institution,” “Ensure Authentic Student Work.” They positioned themselves not as software vendors but as guardians of educational values, defenders of the academic faith. The product they were selling transcended mere technology; they offered to restore the fundamental assumption upon which educational assessment rested—that student work was actually created by students.
The psychological appeal of these solutions becomes clearer when we understand them through the lens of control theory. Institutions experiencing the acute stress of disruption desperately seek to reestablish a sense of agency and predictability. The detection software promised to transform an ambiguous, anxiety-inducing situation—not knowing whether student work was authentic—into a manageable technical problem with a technical solution. The software would provide certainty: a percentage, a score, a seemingly objective measure that could support administrative decisions. In a world suddenly full of unknowns, the detectors offered the comfort of quantification.
Consider the position of a university administrator in early 2023. Parents were calling, demanding to know what the institution was doing about AI cheating. Faculty were reporting their suspicions about student work but lacking any way to verify them. The media was full of dire predictions about the end of education as we knew it. Board members and trustees were asking pointed questions about the value of degrees that might be earned through artificial means. In this context, the ability to announce that the institution had partnered with a leading AI detection company provided immediate relief. It was action, visible and decisive. It demonstrated that the institution was not passive in the face of disruption but was actively defending academic integrity.
The detection companies understood this psychological dynamic perfectly and designed their sales strategies accordingly. They didn’t sell to individual faculty members, who might ask difficult questions about accuracy and false positives. Instead, they targeted senior administrators—provosts, vice presidents, deans—who were feeling the institutional pressure most acutely. The sales pitch emphasized not technical specifications but emotional reassurance. The technology would “protect your institution’s reputation,” “maintain the value of your degrees,” and “support your faculty.” These were not statements about what the software could do but about what it would mean.
The pricing structures of these services revealed their true nature as institutional security blankets rather than functional solutions. Turnitin, for instance, maintained its practice of opaque, customized pricing that required lengthy negotiations with sales teams. The costs could run into hundreds of thousands of dollars annually for large institutions. Yet the specifics of what was being purchased often remained unclear. The contracts were typically multi-year commitments, locking institutions into relationships that would be difficult and expensive to terminate. This wasn’t the pricing model of a confident company selling a proven solution; it was the approach of an industry that understood its true product was not detection but the appearance of detection.
The actual deployment of these systems created new bureaucratic structures within educational institutions. “Academic Integrity Offices” expanded their staff and scope. New administrative positions were created: AI Compliance Officers, Detection System Administrators, Academic Authenticity Coordinators. Training sessions were organized to teach faculty how to interpret detection reports. Appeals processes were established for students who wished to challenge algorithmic accusations. Each of these developments represented a further investment in the surveillance model, making it increasingly difficult for institutions to acknowledge that the underlying technology might be fundamentally flawed.
At the same time, the software companies carefully crafted their communications to maintain plausible deniability about any failures. Their terms of service and public statements emphasized that the detection scores were “indicators” rather than definitive proof, that human judgment should always be involved in any academic integrity decision. Yet their marketing materials told a different story, featuring bold claims about accuracy rates and depicting the software as a reliable guardian against AI-assisted cheating. This doublespeak allowed them to sell certainty while legally disclaiming responsibility for that certainty’s failures.
Faculty members found themselves in impossible positions. Many were skeptical of the detection software but felt pressure—sometimes explicit, sometimes implicit—to use it. Department chairs would note that other sections of the same course were using detection, creating an implied expectation. Students began to expect that their work would be scanned, sometimes specifically requesting it to prove their innocence. The technology that was supposed to support educators instead conscripted them into a system they neither trusted nor wanted.
The psychological impact extended beyond individual cases to poison the entire educational atmosphere. Students began writing defensively, second-guessing every sentence for fear it might trigger an algorithmic false positive. They over-cited sources, avoided common phrases, and developed elaborate documentation practices to prove their authenticity. The surveillance system didn’t just detect cheating; it transformed all students into suspects who needed to constantly prove their innocence.
This transformation represented a fundamental shift in the pedagogical relationship. The traditional model of education, whatever its flaws, was built on a foundation of trust between teacher and student. The teacher trusted that student work represented genuine effort and thought; the student trusted that their work would be evaluated fairly based on its merits. The introduction of algorithmic surveillance shattered this reciprocal trust. Teachers became investigators, students became suspects, and education became a process of verification rather than discovery.
The institutional investment in detection software also created a powerful form of path dependency. Once a university had paid for a site license, trained its faculty, modified its academic integrity policies, and built its bureaucratic structures around detection, admitting that the technology didn’t work became almost impossible. Too much political capital had been expended, too many resources invested, too many decisions justified based on detection scores. The illusion of control became self-reinforcing, with institutions continuing to double down on a failed approach rather than acknowledge error.
The tragedy of this development is not just the individual injustices or the wasted resources, but the missed opportunity for genuine educational innovation. The energy and funds devoted to detection could have been invested in developing new forms of assessment, training faculty in innovative pedagogies, or creating support systems for authentic student learning. Instead, institutions chose to purchase an illusion, to invest in a technological security theater that made everyone feel like something was being done while the actual problem, the obsolescence of traditional assessment in the age of AI, went unaddressed.
As Daphney sat in the Academic Standards Office, presenting her case to a panel of administrators she had never met, showing them drafts and notes and library records, she realized something fundamental had broken in her educational experience. She was no longer a student engaged in learning but a defendant proving her humanity to a machine. The detection software had promised to preserve academic integrity, but in practice, it had transformed education into a process where the default assumption was guilt, where human judgment deferred to algorithmic verdict, where the appearance of control mattered more than the reality of learning. The castle hadn’t been saved; it had been transformed into a prison, with watchtowers that saw threats everywhere but recognized nothing of value.
Inside the Black Box: An Engine for Inequity
To understand the failure of AI detection tools, we must first understand what they claim to do and how they claim to do it. The technology rests on a seemingly logical premise: if artificial intelligence generates text in predictable patterns, then those patterns should be detectable. The companies selling these tools present their systems as sophisticated analytical engines, capable of discerning the subtle fingerprints that distinguish human writing from machine generation. The reality is far more troubling. These systems are not reading for meaning or understanding; they are performing crude statistical analyses that systematically discriminate against vulnerable populations while failing at their basic task of accurate detection.
At the heart of most detection systems are two primary metrics: perplexity and burstiness. Perplexity measures how surprising or unpredictable a piece of text is according to a language model. Writing that uses common words in expected sequences receives a low perplexity score. The sentence “The cat sat on the mat” has very low perplexity; each word follows predictably from the last. In contrast, “The cat contemplated existential dread while perched on the mat” has higher perplexity because the word choices are less statistically predictable. Detection systems flag low-perplexity text as potentially AI-generated, operating on the assumption that machines default to the most probable word sequences.
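To make the metric concrete, here is a minimal sketch of how perplexity is computed: the exponential of the average negative log-probability that a language model assigns to each token. The per-token probabilities below are invented purely for illustration; a real detector would take them from its own underlying model rather than from a hand-written list.

```python
import math

def perplexity(token_probs):
    """Perplexity of a token sequence, given the probability a language
    model assigned to each token: exp of the average negative
    log-probability. Lower values mean more predictable text."""
    avg_neg_logprob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_logprob)

# Hypothetical per-token probabilities, invented for illustration only.
predictable = [0.60, 0.55, 0.70, 0.65, 0.50]   # "The cat sat on the mat"-style prose
surprising  = [0.20, 0.05, 0.10, 0.15, 0.08]   # less expected word choices

print(round(perplexity(predictable), 1))  # ~1.7 -- low perplexity, reads as "AI-like" to a detector
print(round(perplexity(surprising), 1))   # ~9.6 -- higher perplexity, reads as "human"
```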
Burstiness refers to the variation in sentence length and structure within a piece of writing. Human writers naturally vary their expression, mixing short, punchy sentences with longer, more complex constructions. We might write: “The experiment failed. Despite months of careful preparation and repeated trials using different parameters, the hypothesized reaction simply refused to occur under any conditions we could create in our laboratory.” This variation creates high burstiness. AI-generated text, the theory goes, tends toward more uniform sentence structures, producing a steady, consistent rhythm that lacks this natural variation.
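Burstiness has no single standard formula, but a common proxy is the spread of sentence lengths within a text. The sketch below applies the coefficient of variation to made-up sentence lengths; the actual features and thresholds commercial detectors use are proprietary and are only approximated here.

```python
import statistics

def burstiness(sentence_lengths):
    """A simple burstiness proxy: the coefficient of variation of
    sentence lengths in words (standard deviation divided by the mean).
    Higher values mean more variation from sentence to sentence."""
    mean = statistics.mean(sentence_lengths)
    return statistics.stdev(sentence_lengths) / mean

human_like = [3, 24, 9, 31, 6]     # short, punchy sentences mixed with long ones
uniform    = [17, 18, 16, 17, 19]  # a steady, consistent rhythm

print(round(burstiness(human_like), 2))  # ~0.84 -- high variation
print(round(burstiness(uniform), 2))     # ~0.07 -- low variation, flagged as "AI-like"
```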
These metrics might seem reasonable in theory, but their application reveals fundamental flaws that render them not just ineffective but actively harmful. The problem begins with the assumption that low perplexity and low burstiness are reliable indicators of artificial origin. In reality, these are characteristics of clear, straightforward writing, exactly what many students are taught to produce. Academic writing guides emphasize clarity, concision, and consistency. Students are instructed to avoid unnecessarily complex vocabulary, to maintain consistent paragraph structures, to write clear topic sentences followed by supporting evidence. In other words, they are explicitly trained to produce the kind of writing that detection algorithms flag as artificial.
The empirical evidence of detection failure is overwhelming and comes from multiple independent sources. When OpenAI, the creator of ChatGPT, released its own detection tool in early 2023, it came with significant fanfare. Here was a detector created by the very company that built the most widely used language model; surely it would be effective. Within six months, OpenAI quietly discontinued the tool, citing its “low accuracy rate.” The company that understood AI text generation better than anyone admitted it couldn’t reliably distinguish its own AI’s output from human writing.
Independent research has confirmed this fundamental unreliability. One comprehensive study that tested different detection tools reached a damning conclusion: the tools were neither accurate nor dependable. Some tools achieved an accuracy rate of approximately 50%, statistically equivalent to flipping a coin. Another study found that detection accuracy plummeted when faced with text that had been even lightly edited or paraphrased. Simply running AI-generated text through a basic paraphrasing tool or translating it to another language and back was sufficient to fool most detectors.
Yet even these dismal accuracy figures don’t capture the full scope of the problem. The companies marketing these tools often tout impressive-sounding statistics, but these claims are deeply misleading. They typically refer to performance under ideal laboratory conditions, testing pure AI output against pure human writing. In the real world, where students might use AI for inspiration, edit AI-generated content, or blend AI assistance with their own writing, these accuracy claims collapse. Moreover, the companies often emphasize their success at detecting true AI content while downplaying or ignoring their false positive rates, the frequency with which they incorrectly flag human writing as artificial.
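Why false positive rates matter so much becomes clear with a back-of-the-envelope calculation. The numbers below are assumptions chosen purely for illustration, not measurements of any particular product: even a detector considerably better than those tested independently would produce a troubling share of wrongful flags once applied to a largely honest population.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Of all submissions a detector flags, what fraction actually used AI?"""
    true_flags = prevalence * sensitivity
    false_flags = (1 - prevalence) * false_positive_rate
    return true_flags / (true_flags + false_flags)

# Assumed numbers for illustration only -- not figures from any vendor or study:
# 10% of submissions involve AI, the detector catches 90% of them,
# and it wrongly flags 5% of genuinely human work.
ppv = positive_predictive_value(prevalence=0.10, sensitivity=0.90, false_positive_rate=0.05)
print(round(ppv, 2))  # ~0.67 -- roughly one flagged student in three would be innocent
```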
The technical limitations of detection become even more apparent when we consider the rapidly evolving landscape of AI models. Most detectors are trained primarily on output from specific versions of specific models, particularly GPT-3.5 and GPT-4. But the AI ecosystem now includes dozens of sophisticated language models—Claude, Gemini, Llama, Mistral, and countless others, each with its own patterns and characteristics. A detector trained to recognize ChatGPT’s output may completely fail when confronted with text from a different model. This creates what might be called the “specificity trap”: the more precisely a detector is tuned to identify one type of AI output, the worse it performs at generalizing to the broader universe of AI-generated text.
But the most damaging aspect of these detection systems is not their technical failure but their systematic bias. The landmark Stanford study that exposed this bias deserves careful examination for what it reveals about the inherent inequity of algorithmic detection. Researchers tested seven popular detection tools on essays written for the Test of English as a Foreign Language (TOEFL) by non-native English speakers. These were definitively human-written essays, created under controlled conditions where cheating was impossible. The results were shocking: on average, the detectors incorrectly flagged 61% of these essays as AI-generated. One widely used detector flagged 98% of them; nearly every single essay written by a non-native speaker was condemned as artificial.
The mechanism of this bias is insidious. Non-native English speakers often write with certain predictable patterns. They tend to use a more limited vocabulary, favoring common words they know to be correct over more varied or sophisticated alternatives. Their sentence structures often follow consistent patterns learned from textbooks and language instruction. And they may avoid idiomatic expressions or complex grammatical constructions where they feel less confident. All of these characteristics, which are natural results of writing in a second language, produce text with lower perplexity and lower burstiness, exactly the features that detection algorithms interpret as evidence of AI generation.
The bias extends beyond language background to affect other vulnerable populations in different but equally damaging ways. Students with autism spectrum disorders often exhibit writing patterns that trigger false positives. They may use repetitive phrases or structures that feel comfortable and predictable to them. Their writing might lack the emotional variation or personal voice that detection algorithms associate with human authorship. A student with autism who writes with precise, consistent logic, one of their potential strengths, finds that very precision used as evidence against them.
Students with ADHD face a different but related challenge. Their writing might be highly variable—bursting with energy in some sections, more methodical in others—in ways that confuse detection algorithms trained on neurotypical patterns. Alternatively, students using medication to manage ADHD might produce unusually consistent, focused writing that lacks the natural digressions and variations the algorithms expect. Either way, their neurodivergence becomes a liability in the face of algorithmic judgment.
The interaction between detection systems and assistive technologies creates another layer of discrimination. Many students with learning disabilities rely on tools like Grammarly, which uses AI to suggest improvements to grammar, clarity, and style. These tools are often not just permitted but actively recommended by university disability services offices. They represent reasonable accommodations that level the playing field for students with dyslexia, dysgraphia, or other learning differences. Yet when students use these legitimate tools, the changes they make to the text, smoothing out grammatical errors, improving sentence flow, clarifying expression, can trigger detection algorithms. The student faces an impossible choice: forgo the assistive technology they need and submit work that doesn’t represent their true capabilities, or use the tools and risk being flagged for academic dishonesty.
The Stanford study revealed an even more perverse outcome. When researchers used ChatGPT itself to “improve” the TOEFL essays—to make them sound more like native English writing—the detection rates reversed. Essays that had been flagged as AI-generated when written by non-native speakers were suddenly classified as human-written once they had been processed by actual AI. This exposes the fundamental incoherence of what these systems are measuring. They are not detecting AI use but enforcing a narrow, culturally specific notion of what “authentic” human writing should look like.
This enforcement of linguistic conformity has implications that extend far beyond individual false accusations. The detectors are effectively defining a standard of “human” writing that excludes vast populations of actual humans. They privilege certain types of expression—complex, varied, idiosyncratic—that are most characteristic of native English speakers with extensive educational privileges. A student who grew up in an English-speaking household, attended well-funded schools, and had access to extensive reading materials will naturally produce writing with high perplexity and burstiness. Their privilege is interpreted as humanity.
Meanwhile, students who are working to overcome linguistic, cultural, or neurological differences find their efforts interpreted as evidence of artificiality. The very characteristics that represent their hard work and growth, clear expression despite language barriers, consistent structure despite processing differences, correct grammar achieved through assistive technology, become markers of suspicion. The detection systems don’t just fail these students; they actively punish them for the ways they differ from an implied norm.
The concept of an “authenticity tax” captures this systematic disadvantage. While privileged students can write naturally and unselfconsciously, marginalized students must constantly think about how their writing will be perceived by an algorithm. They must expend cognitive and emotional energy not on developing their ideas but on performing a version of humanity that a machine will recognize. Some students report deliberately introducing errors into their writing, adding unnecessary complexity, or avoiding helpful tools, all in an attempt to appear more “human” to the detectors. This is educational energy diverted from learning to the performance of algorithmic compliance.
The companies producing these tools are not unaware of these biases. Their technical documentation often includes disclaimers about potential false positives for certain populations. They recommend that detection scores should be just one factor in academic integrity decisions, that human judgment should always be involved. But these careful disclaimers are overwhelmed by marketing materials that present the tools as objective, scientific solutions to the AI cheating crisis. The companies profit from institutional anxiety while disclaiming responsibility for the discrimination their tools enable.
The opacity of these systems compounds their harm. Unlike traditional plagiarism detection, which provides specific sources and highlighted matches that educators can verify, AI detection operates as a black box. The algorithms are proprietary, their training data secret, their decision-making processes inscrutable. When a student is accused based on a detection score, they cannot meaningfully challenge the basis of that accusation. They cannot point to specific errors in the algorithm’s reasoning because that reasoning is hidden. They are asked to defend themselves against a machine whose logic they cannot access or understand.
This algorithmic authority represents a fundamental abdication of educational responsibility. Teachers who might never make accusations based on their own subjective judgment defer to the supposed objectivity of the machine. A professor might read a student’s essay, find it perfectly reasonable, but initiate misconduct proceedings because a detector returned a high score. The algorithm doesn’t support human judgment; it replaces it. The very faculty who entered education to nurture human potential find themselves enforcing the verdicts of systems that cannot recognize that potential when it appears in unexpected forms.
The cumulative effect of these biases is the creation of a two-tier system of academic surveillance. Students who write in ways that align with algorithmic expectations, primarily native English speakers from privileged backgrounds, can use AI tools with relative impunity, confident that their natural writing patterns will provide camouflage. Students whose writing differs from this norm, non-native speakers, neurodivergent students, those using assistive technologies, face constant scrutiny and suspicion, regardless of whether they have used AI at all. The detection systems don’t create a level playing field; they tilt it further in favor of those already advantaged.
The Pedagogy of Distrust
The transformation of education under algorithmic surveillance extends far beyond technical failures and false accusations. It represents a fundamental restructuring of the pedagogical relationship itself, replacing mentorship with monitoring, curiosity with compliance, and trust with systematic suspicion. This new paradigm doesn’t merely detect violations of academic integrity; it actively undermines the conditions necessary for genuine learning to occur. The deployment of AI detection software has created what can only be described as a pedagogy of distrust, where every interaction between teacher and student is mediated by the specter of algorithmic judgment.
Consider the story of a college teacher, let’s call her Dr. Martinez. She has taught nineteenth-century American literature for fifteen years. Before the advent of AI detection, her course was built around deep engagement with complex texts, encouraging students to develop their own interpretive voices through iterative writing and discussion. Students would submit reading responses, develop thesis statements through peer workshopping, and craft essays that grew from genuine intellectual curiosity. Her marginal comments focused on pushing students to think more deeply, to consider alternative readings, to strengthen their arguments. The relationship was fundamentally collaborative, with Dr. Martinez serving as an experienced guide helping students navigate challenging intellectual terrain.
After her institution mandated the use of detection software, everything changed. Now, before she can engage with a student’s ideas, she must first submit their work to the algorithm and interpret its verdict. A score appears—32% probability of AI generation, 67% probability, 91% probability—and this number shapes everything that follows. Even when the score is low, its very existence transforms the encounter. She finds herself reading with a detective’s eye, looking for signs of inauthenticity rather than engaging with arguments. The question is no longer “What is this student trying to say?” but “Did this student actually say it?”
The role transformation forced upon educators by detection technology is profound and largely involuntary. Teachers who entered the profession to inspire young minds, to share their passion for their subjects, to nurture intellectual growth, find themselves conscripted into a policing role they never sought. They become “AI cops,” spending hours not on curriculum development or student mentoring but on interpreting detection reports, documenting suspicious patterns, and conducting investigation procedures. The time that might have been spent providing meaningful feedback on student ideas is now consumed by the bureaucracy of verification.
This conscription carries a psychological toll that institutional administrators rarely acknowledge. Teachers report feeling caught between conflicting responsibilities. Their pedagogical training tells them to trust and support their students. Their institutional mandate requires them to suspect and surveil. When a detection system flags a student’s work, the teacher faces an impossible choice. Believing the student means potentially enabling academic dishonesty and facing administrative censure. Believing the algorithm means potentially destroying a relationship with an innocent student and participating in an unjust system. There is no winning move, only varying degrees of moral compromise.
The surveillance apparatus also fundamentally alters how teachers design and implement their courses. Assignment creation, once driven by pedagogical goals and student learning outcomes, now centers on detection evasion and verification possibilities. Teachers report spending hours crafting prompts they hope will be “AI-proof,” not because these assignments are pedagogically superior but because they might reduce the likelihood of cheating or false accusations. The question “What will best help students learn?” is replaced by “What can I assign that the detection software won’t flag?”
This defensive pedagogy extends to every aspect of course design. Syllabi that once opened with welcoming statements about intellectual exploration now begin with lengthy warnings about AI use and detection policies. First-day discussions that might have focused on course content instead center on academic integrity policies and the consequences of algorithmic accusation. Office hours that could be spent discussing ideas are consumed by students seeking pre-emptive clearance that their writing won’t be flagged. The entire educational enterprise reorganizes itself around the threat of detection rather than the promise of learning.
Students, recognizing the new reality, adapt their behavior in ways that further corrode educational values. The phenomenon of defensive writing deserves particular attention. Knowing that their work will be subjected to algorithmic scrutiny, students begin writing not to express their ideas clearly but to avoid triggering detection. They consciously vary their sentence structures, not because it improves their arguments but because consistent patterns might be flagged as artificial. They introduce unnecessary complexity, use convoluted phrasing, and avoid helpful tools that might smooth their prose. The algorithm intended to preserve authentic writing instead produces a kind of performative authenticity, where students simulate human inconsistency to satisfy a machine’s expectations.
The chilling effect on intellectual risk-taking is perhaps the most damaging consequence of the surveillance regime. Education at its best encourages students to experiment with ideas, to try new forms of expression, to push beyond their comfort zones. This necessarily involves the risk of failure, of producing work that doesn’t quite succeed, of making arguments that don’t fully convince. But in an environment where unusual writing might trigger algorithmic suspicion, students retreat to the safest, most conventional forms of expression. They avoid creative structures that might seem artificially polished. They eschew experimental arguments that might appear too sophisticated. They produce exactly the kind of mediocre, predictable work that education should challenge them to transcend.
The surveillance system also destroys the temporal dimension of educational relationships. Learning is not instantaneous but develops over time through repeated interactions, feedback, and growth. Teachers come to know their students’ voices, their strengths and struggles, their patterns of development. This knowledge traditionally informed pedagogical decisions, allowing teachers to provide appropriate challenges and support. But the algorithm knows nothing of this history. It evaluates each piece of writing in isolation, without context, without understanding of the student’s journey. A breakthrough essay that represents enormous growth for a struggling student might be flagged simply because it seems too good relative to some statistical norm.
The erosion of trust extends beyond individual relationships to poison the entire educational atmosphere. Students begin to view their teachers not as mentors but as potential accusers. They approach educational interactions with wariness and calculation rather than openness and curiosity. Study groups that might have collaborated on understanding difficult texts now operate under suspicion that shared insights might trigger plagiarism detection. Peer review sessions become exercises in mutual surveillance rather than collaborative improvement. The social dimension of learning, where ideas develop through discussion and exchange, withers under the assumption that authentic work must be produced in isolation.
Faculty meetings that once focused on curriculum and pedagogy now center on detection protocols and academic integrity procedures. Departments develop elaborate flowcharts for handling flagged assignments. Committees form to adjudicate contested cases. Training sessions teach faculty how to interpret detection scores and conduct investigations. The institutional energy devoted to surveillance grows ever larger, creating its own bureaucratic momentum. Once these structures exist, they must be justified through use. The detection system creates the very problems it claims to solve, generating suspicion that requires investigation that reinforces the need for detection.
The pedagogy of distrust also manifests in the changing nature of academic feedback. Comments that once engaged with ideas now focus on authenticity markers. Instead of “This argument could be strengthened by considering counterevidence,” teachers write “This paragraph seems stylistically different from your usual work.” Instead of “Your analysis of symbolism is insightful,” they note “This level of sophistication is unexpected.” Every piece of feedback becomes potentially accusatory, creating a subtext of suspicion that undermines its pedagogical value. Students learn to read comments not for intellectual guidance but for signs of doubt about their authorship.
The impact on academic confidence is particularly severe for students who are already marginalized. A first-generation college student, unsure whether they belong in higher education, receives algorithmic suspicion of their work and might internalize it as confirmation of their inadequacy. An international student, already navigating cultural and linguistic challenges, might face constant scrutiny that reinforces their sense of otherness. A student with learning disabilities, proud of work produced with approved accommodations, might discover that their success is viewed with suspicion. The surveillance system doesn’t just fail these students; it actively undermines their academic identity formation.
Graduate education suffers unique damage under the surveillance regime. The relationship between graduate students and their advisors traditionally involves deep intellectual collaboration, with ideas developing through extended dialogue and mutual exploration. But when every piece of writing must be verified as authentic, this collaboration becomes suspect. A graduate student who discusses ideas extensively with their advisor, incorporates feedback, and produces polished work might find that very polish used as evidence against them. The algorithm cannot distinguish between inappropriate assistance and appropriate mentorship, so the safer path becomes intellectual isolation.
The surveillance mindset spreads beyond formal assessment to infect all educational writing. Email communications with professors become sites of potential scrutiny. Discussion board posts are evaluated for authenticity markers. Even informal reflections and journal entries fall under suspicion. Students report anxiety about any written communication, wondering whether their natural voice will seem artificial to an algorithm. The technology ostensibly deployed to preserve academic writing instead makes students fearful of writing at all.
The long-term consequences of this pedagogy of distrust extend far beyond the classroom. Students educated under surveillance learn that authority is algorithmic rather than human, that compliance matters more than creativity, that the appearance of authenticity is more important than authentic engagement. They enter professional environments trained to perform legitimacy rather than pursue excellence. They have been taught to fear judgment more than seek growth. The hidden curriculum of the surveillance classroom—distrust, defensiveness, and performative compliance—shapes citizens and workers in ways that undermine the very purposes of education.
Teachers who resist this transformation find themselves increasingly isolated. Those who refuse to use detection software face pressure from administrators who view non-compliance as enabling cheating. Those who openly criticize the technology risk being labeled as soft on academic integrity. And those who try to maintain trusting relationships with students must constantly defend their professional judgment against algorithmic verdicts. The system creates powerful incentives for conformity, gradually wearing down resistance through exhaustion and isolation.
The tragedy is that everyone involved recognizes the damage being done. Students know they are not trusted. Teachers know they have become enforcers rather than educators. Administrators know the technology is flawed and the atmosphere is poisoned. Yet the system persists through a combination of institutional inertia, sunk costs, and the lack of visible alternatives. The surveillance apparatus, once installed, becomes self-perpetuating. Dismantling it would require admitting failure, accepting risk, and reimagining assessment from the ground up. Instead, institutions continue to invest in a pedagogy that produces neither learning nor integrity but only the hollow performance of both.
The Unwinnable Arms Race
The technological conflict between AI detection and AI evasion resembles nothing so much as a classical arms race, with each advance in offensive capability spurring corresponding defensive innovations in an endless cycle of measure and countermeasure. But unlike military arms races, which at least theoretically aim toward security through superiority, this educational arms race can have no winner. The fundamental asymmetry between generation and detection, the ease of evasion versus the difficulty of verification, and the economic incentives driving both sides ensure that educational institutions will remain perpetually outmatched. The entire detection enterprise is, in the language of Greek mythology, a Sisyphean task—an endless labor that achieves nothing but its own repetition.
The fragility of detection systems becomes apparent the moment we examine how easily they can be defeated. A student with even basic technical knowledge can bypass most detectors through simple manipulations that would seem almost comical if the stakes weren’t so high. The most straightforward method involves paraphrasing. A student generates an essay using ChatGPT, then runs it through a paraphrasing tool—many of which are free and instantly accessible online. The paraphraser replaces words with synonyms, restructures sentences, and alters syntax just enough to confuse detection algorithms while preserving the essential meaning. What emerges is text that remains fundamentally AI-generated in its conception and structure but appears human to the detector.
Translation offers another trivial bypass method. A student can generate text in English, translate it to another language—say, French or Spanish—and then translate it back to English. The round-trip translation introduces enough linguistic variation to fool most detectors while maintaining the core content. Some students have discovered that even translating through multiple languages in sequence—English to Spanish to Japanese to English—produces text that appears increasingly “human” to algorithms, though often at the cost of clarity and coherence.
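To make the mechanics concrete, here is a minimal sketch of the round-trip idea. The `translate` function is a hypothetical placeholder for whatever machine-translation service might be used, not any particular product's API, and the chain of intermediate languages is arbitrary.

```python
from typing import Callable, Sequence

# Illustrative sketch of round-trip translation. `translate` is a hypothetical
# placeholder for any machine-translation call (not a real library's API);
# the point is the shape of the chain, not a working bypass.

def round_trip(text: str,
               translate: Callable[[str, str, str], str],
               chain: Sequence[str] = ("es", "ja")) -> str:
    """Send English text through a chain of languages and back to English.
    Each hop introduces small lexical and syntactic shifts that accumulate."""
    current, source = text, "en"
    for target in chain:
        current = translate(current, source, target)
        source = target
    return translate(current, source, "en")
```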
More sophisticated evasion techniques exploit the specific weaknesses of detection algorithms. Researchers have demonstrated that inserting invisible characters—Unicode symbols that don’t display but are processed by computers—can cause detection rates to plummet. A zero-width space inserted between each word, invisible to any reader, can reduce a detector’s confidence by 30% or more. Similarly, replacing standard Latin characters with visually identical Cyrillic equivalents—using a Cyrillic ‘a’ instead of a Latin ‘a’—completely confounds many detection systems while producing text that looks identical to human readers.
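For readers who want to see how little effort this takes, the following sketch illustrates both tricks in a few lines of Python. The letter mapping and example sentence are purely illustrative; the confidence drops cited above come from the research described, not from this snippet.

```python
# A minimal sketch of the two perturbations described above: zero-width
# characters and Cyrillic homoglyphs. Both leave the text visually unchanged
# for a human reader while changing the code points a detector processes.

ZERO_WIDTH_SPACE = "\u200b"  # renders as nothing in most fonts

# Latin letters mapped to visually near-identical Cyrillic code points
HOMOGLYPHS = {
    "a": "\u0430", "e": "\u0435", "o": "\u043e",
    "p": "\u0440", "c": "\u0441", "x": "\u0445", "y": "\u0443",
}

def insert_zero_width(text: str) -> str:
    """Add a zero-width space after every ordinary space, invisible to readers."""
    return text.replace(" ", " " + ZERO_WIDTH_SPACE)

def swap_homoglyphs(text: str) -> str:
    """Replace selected lowercase Latin letters with Cyrillic lookalikes."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)

sample = "The surveillance apparatus sees only surface features."
perturbed = swap_homoglyphs(insert_zero_width(sample))

print(sample == perturbed)          # False: the underlying code points differ
print(len(sample), len(perturbed))  # lengths differ even though the text looks identical
```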
The very existence of these simple bypasses reveals a fundamental truth about detection technology: it operates at the surface level of textual features rather than engaging with meaning or understanding. The detectors are looking for statistical patterns, not comprehending content. This superficiality makes them inherently vulnerable to even minor perturbations that wouldn’t fool a human reader for a moment. A professor might immediately recognize that an essay’s ideas, structure, and argumentation seem artificially generated, but if the surface features have been sufficiently altered, the detection algorithm sees only authentic human writing.
The emergence of AI “humanizer” tools represents the industrialization of this evasion process. What began as individual students experimenting with workarounds has evolved into a sophisticated commercial ecosystem. Companies like Undetectable AI, Quillbot, and Humanize AI have built entire business models around defeating detection systems. Their marketing is remarkably brazen. Undetectable AI’s website explicitly names the detection systems it can defeat: “Bypass Turnitin, GPTZero, Copyleaks, and more!” The company offers tiered subscription plans based on word count, with prices low enough to be accessible to any student with a credit card or PayPal account.
These humanizer tools don’t simply paraphrase or translate. They employ sophisticated techniques specifically designed to increase perplexity and burstiness, the very metrics that detectors use to identify AI text. They insert intentional variations in sentence structure, add idiosyncratic word choices, introduce minor grammatical imperfections that suggest human fallibility. Some even analyze the specific detection algorithms used by major companies and optimize their humanization process to defeat those particular systems. The result is an evolutionary arms race conducted at machine speed, with each iteration of detection spurring corresponding innovations in evasion.
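To give a rough sense of what those metrics actually measure, the sketch below computes a simple burstiness proxy, the variation in sentence length, and perplexity from a list of per-token log probabilities that a language model would supply. Commercial detectors use proprietary variants of these ideas; this is a minimal illustration of the underlying arithmetic, not any vendor's scoring method.

```python
import math
import statistics

def burstiness(text: str) -> float:
    """A crude burstiness proxy: the standard deviation of sentence lengths.
    Human prose tends to mix short and long sentences; very uniform lengths
    are one of the weak signals detectors associate with machine text."""
    for mark in "!?":
        text = text.replace(mark, ".")
    lengths = [len(s.split()) for s in text.split(".") if s.strip()]
    return statistics.stdev(lengths) if len(lengths) > 1 else 0.0

def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity from per-token natural-log probabilities assigned by a
    language model: exp(-mean(log p)). Lower values mean the text is
    unsurprising to the model, which detectors read as AI-like."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Example: tokens the model finds highly predictable vs. less predictable
print(perplexity([-0.1, -0.2, -0.1, -0.3, -0.2]))  # low perplexity, "AI-like"
print(perplexity([-2.5, -1.8, -3.1, -0.9, -2.2]))  # higher perplexity, "human-like"
```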
The business model of companies like Quillbot reveals the cynical economics underlying this conflict. Quillbot offers a comprehensive suite of AI-powered writing tools: a paraphraser, a grammar checker, a summarizer, a citation generator, and—crucially—both an AI detector and an AI humanizer. This positioning is brilliantly cynical. The same company sells both the sword and the shield, profiting from every transaction in the conflict. A student can use Quillbot’s detector to check if their AI-generated text would be flagged, then use Quillbot’s humanizer to alter it until it passes, all within the same platform, often within the same subscription.
This “arms dealer” model extends throughout the industry. Companies that began as paraphrasing tools for legitimate purposes, such as helping non-native speakers improve their writing or assisting students with learning disabilities, have pivoted to explicitly marketing their ability to defeat AI detection. They maintain plausible deniability by including terms of service that prohibit academic dishonesty, while their marketing materials make their true purpose crystal clear. The hypocrisy is transparent but legally protected, allowing these companies to profit from the very crisis that detection companies claim to be solving.
The economic dynamics of this arms race decisively favor the evaders. Detection companies must invest heavily in research and development, constantly updating their algorithms to recognize new patterns of AI generation and new methods of evasion. They need substantial computational resources to process millions of documents, sophisticated machine learning infrastructure, and teams of engineers and data scientists. Their costs are high and ongoing. Evasion companies, by contrast, can often achieve their goals through relatively simple transformations. The marginal cost of humanizing a piece of text is negligible—a few seconds of computation that costs fractions of a penny. This economic asymmetry means that evasion will always be cheaper than detection, making it accessible to more users and more profitable for providers.
The proliferation of AI models compounds the detection challenge exponentially. When there was essentially only ChatGPT to worry about, detection companies could at least focus their efforts on a single target. Now the landscape includes Claude, Gemini, Llama, Mistral, Cohere, and dozens of other sophisticated language models, each with different training data, different architectures, and different output characteristics. Open-source models can be fine-tuned by anyone with modest technical skills, creating infinite variations that have never been seen by detection systems. A student can download Llama, fine-tune it on their own previous writing to match their style, and produce text that no detector has been trained to recognize.
The speed of model development outpaces detection capabilities by orders of magnitude. By the time a detection company has gathered training data from a new model, analyzed its patterns, and updated their algorithms, newer models have already been released. The detection systems are always fighting the last war, optimizing for yesterday’s threats while remaining blind to today’s. This temporal lag is not a temporary problem that will be solved with better technology; it’s a fundamental constraint arising from the difference between creating and detecting. Creation can be instantaneous and novel; detection requires pattern recognition based on prior examples.
Some students have discovered that the most effective evasion strategy requires no technology at all—just patience and iteration. They generate multiple versions of an essay using different AI models, then manually combine elements from each, selecting a paragraph from ChatGPT, a transition from Claude, a conclusion from Gemini. The resulting patchwork maintains enough stylistic variation to confuse detectors while benefiting from the sophisticated capabilities of multiple AI systems. This method, sometimes called “AI collaging,” produces text that appears human not because it has been artificially humanized but because it reflects the kind of inconsistency that comes from actual human editing and revision.
The detection companies’ response to these evasion techniques reveals their fundamental powerlessness. They issue updates claiming to detect humanized text, but these updates are quickly defeated by the next iteration of humanizer tools. They add new metrics beyond perplexity and burstiness, but these metrics prove equally vulnerable to manipulation. They claim to identify patterns that are “definitionally” AI-generated, but these definitive patterns turn out to be statistical tendencies that can be easily avoided. Each announced improvement in detection is followed, usually within weeks, by corresponding improvements in evasion.
The academic consequences of this arms race extend far beyond simple cat-and-mouse games. Every iteration makes the technology less reliable and more biased. As detection algorithms are tuned to be more sensitive in order to catch sophisticated evasion, they also become more likely to flag legitimate human writing that happens to share some statistical features with AI text. As humanizer tools become better at mimicking human inconsistency, they teach detection systems to be suspicious of the very features that indicate authentic human expression. The arms race doesn’t lead to better detection; it leads to more false positives and an ever-expanding definition of what counts as “suspicious” writing.
Educational institutions find themselves trapped as collateral damage in this technological conflict. They invest millions in detection software that becomes obsolete almost immediately. They build policies and procedures around technological capabilities that evaporate with each update. They train faculty on systems that change faster than training materials can be updated. Meanwhile, students with technical knowledge or financial resources to access humanizer tools operate with impunity, while those without such advantages, often the most vulnerable students, face the full force of unreliable detection systems.
The psychological impact of this arms race on students and educators cannot be overstated. Students experience the detection system as an arbitrary and capricious force, seemingly random in its verdicts. They see peers using AI successfully while others are falsely accused. This arbitrariness breeds not respect for academic integrity but cynicism about the entire enterprise. If the system can’t reliably distinguish between authentic and artificial work, why should students take it seriously? The arms race doesn’t preserve academic values; it teaches students that education is a game where winning depends on technical sophistication rather than intellectual engagement.
Educators, meanwhile, experience a different form of demoralization. They know the detection tools don’t work reliably. They see obvious AI-generated text that passes detection while authentic student writing gets flagged. They’re forced to make high-stakes decisions based on technology they don’t trust, knowing that whatever they decide will be wrong in some cases. The arms race positions them as unwilling participants in a conflict they didn’t choose and can’t win, transforming their professional practice into an exercise in technological futility.
The truly insidious aspect of this arms race is its self-perpetuating nature. The existence of detection creates demand for evasion, which creates demand for better detection, which creates demand for better evasion, ad infinitum. Both industries—detection and evasion—have a vested interest in the continuation of the conflict. Peace would destroy both business models. So the arms race continues, consuming ever more resources, generating ever more sophisticated tools, while the actual problem, how to maintain academic integrity in an age of artificial intelligence, remains not just unsolved but unaddressed.
The companies involved understand this dynamic perfectly. Their business plans don’t envision victory but perpetual conflict. Venture capital flows to both sides, betting not on resolution but on escalation. Academic conferences feature panels on “next-generation detection” and “AI-resistant assessment,” but these discussions operate within the assumption that the arms race is permanent. The very framing of the problem as a technical challenge requiring technical solutions ensures that it can never be solved, only managed through ever-increasing investments in surveillance and counter-surveillance.
As we survey this technological battlefield, littered with broken detection systems and triumphant evasion tools, the Sisyphean nature of the enterprise becomes undeniable. Educational institutions are pushing the boulder of detection up the mountain, only to watch it roll back down each time a new evasion technique emerges. The effort is not just futile but actively harmful, diverting resources from education to surveillance, replacing trust with suspicion, and teaching students that gaming the system matters more than genuine learning. The arms race has no winner except the companies profiting from its continuation. For education itself, there is only exhaustion, cynicism, and the gradual erosion of everything that makes learning meaningful.
Thank you for following Part 1 of this journey to its conclusion. If this chapter resonated with you, I hope you’ll join me as we pivot from problem to solution.
Next Saturday marks the beginning of Part 2 with Chapter 4, ‘Making Thinking Visible.’ Having established that currently trusted assessment methods have become unreliable, we will start to address a more fundamental question: How do we make the process of student thinking visible, and in doing so, learn to assess what AI can never replicate?
P.S. I believe transparency builds the trust that AI detection systems fail to enforce. That’s why I’ve published an ethics and AI disclosure statement, which outlines how I integrate AI tools into my intellectual work.


