
False Positives and Academic Integrity: The Risk of Current AI Detection Tools — What Happens When AI Detection Fails?

When detectors get it wrong, trust breaks. We explain misfires, the human cost, and a path forward using transparent process data and essay playback.

The Checkmark Plagiarism Team

“Where will things be in two and a half years? And how do you prepare students for that world that's rapidly evolving?” — Eddie Watson (Teaching in Higher Ed, Ep. 517)

AI is quickly becoming an unwelcome guest for teachers in educational institutions across the country. As students increasingly use AI to complete assignments and essays, teachers have been saddled with the time-consuming task of weeding out generated text from hundreds of submissions. As this new chore landed on teachers' shoulders, AI detectors emerged to lessen the load. But what happens when AI detectors fail? Are current AI detectors reliable?

Misfires and Why They Occur

In a 2024 study, Steven R. Cooperman and Roberto A. Brandão tested the accuracy and false-positive rates of several AI detectors, including GPTZero and Copyleaks. After six rounds of testing on scientific texts, with a retest three months later, the detectors averaged 64% accuracy. In other words, 36% of the time the detectors evaluated a piece of text incorrectly. That is astonishingly high considering the life-altering effects a false detection can have on a professional or academic career.

Why is this? Tools such as Turnitin and GPTZero rely on statistical signals like pattern recognition, predictability (perplexity), and burstiness for their detection methods. These signals, however, have proven easy for students to spoof and bypass, and increasingly so as AI improves and students grow more familiar with how to use it. Though it is now widely acknowledged that AI detectors have a margin of error, companies are still racing to fix the problem, because the lasting effects of a false detection can be catastrophic.

Even OpenAI, when asked whether current AI detectors work, said:

“In short, not in our experience. Our research into detectors didn't show them to be reliable enough given that educators could be making judgments about students with potentially lasting consequences.”

So, where does that leave everyone?

Consequences and Broken Trust: The Real-World Impact of False Positives

Being falsely accused of using AI to complete an assignment can be an incredibly demoralizing experience. With their integrity in question, students who have poured hours into writing, researching, and editing their own work can feel that the work no longer matters. A false positive breeds mistrust and anxiety, leaving students to wonder whether their future efforts will be flagged again, or worse, whether they will be punished unfairly. Unfortunately, this is not unheard of: many students have shared experiences of being unjustly accused of using AI and sent down the lengthy, exhausting path of pleading their case in an integrity investigation.

Teachers, on the other hand, are often placed in an uncomfortable and difficult position when a detector flags a student's work as AI-written. They may feel pressure to act even if they personally believe the student is honest, and that pressure can lead to wrongful accusations that could have been avoided. This tension strains the very teacher-student relationships that form the foundation of a classroom. Then there is the guilt that can follow the realization that a student didn't use AI after all. These situations sow paranoia and suspicion, discouraging open discussion in a place where transparency and communication should be encouraged.

For the administrators and educators who oversee academic integrity policies, this is also a serious issue. They want to uphold academic standards and help students build their critical thinking skills, but they also must ensure that innocent students aren’t being penalized by faulty systems. Ultimately, if we rely too heavily on the tools we currently have, we risk replacing thoughtful human judgment with flawed automation—serving no one.

[Figure: graph showing the difference AI use makes in classroom schoolwork]

Solutions and a New Path in Detection

All of this raises the question: what can we do, moving forward, to safeguard the trust between students and teachers without normalizing the use of AI in the classroom?

The simplest way to improve the situation is to expand the methods used to detect AI in written assignments. Checkmark Plagiarism introduces a new approach with its patent-pending keystroke analysis software. Keystroke analysis collects data on human writing patterns and uses it to identify inconsistencies in the stream of a student's writing. Instead of relying only on text patterns and burstiness, Checkmark Plagiarism breaks down the writing process itself for teachers to reference while grading. Through a playback system, teachers and administrators can observe for themselves how the student interacted with the keyboard while composing the assignment. This creates an environment of transparency for both the student and the teacher.
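Checkmark Plagiarism's keystroke analysis is proprietary, but the general idea can be illustrated with a short sketch. The function below is purely hypothetical (invented names, simplified thresholds, not the actual product algorithm): it scans a list of timestamped keystroke events and flags segments inconsistent with continuous human typing, such as large one-shot insertions that suggest a paste, long pauses, or an unnaturally uniform rhythm.

```python
from statistics import mean, stdev

def flag_anomalies(events, paste_threshold=20, pause_threshold=5.0):
    """Illustrative sketch only — not Checkmark Plagiarism's algorithm.

    `events` is a list of (timestamp_seconds, chars_inserted) tuples
    recorded as a student types. Returns a list of (event_index, reason)
    flags for a human reviewer to inspect alongside playback.
    """
    flags = []
    intervals = []
    for i, (t, chars) in enumerate(events):
        # A single event inserting many characters suggests a paste.
        if chars >= paste_threshold:
            flags.append((i, f"bulk insert of {chars} chars"))
        if i > 0:
            gap = t - events[i - 1][0]
            intervals.append(gap)
            # Long gaps may mark a switch to another window or tool.
            if gap >= pause_threshold:
                flags.append((i, f"pause of {gap:.1f}s"))
    # Human typing rhythm varies; near-constant intervals are suspicious.
    if len(intervals) >= 3 and mean(intervals) > 0:
        cv = stdev(intervals) / mean(intervals)  # coefficient of variation
        if cv < 0.2:
            flags.append((None, "suspiciously uniform typing rhythm"))
    return flags
```

A real system would weigh many more signals (revisions, deletions, drafting pauses) and, as described above, present the evidence to the teacher through playback rather than issuing a verdict on its own.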

This new method encourages process writing: the practice of outlining, drafting, and revising. Process writing is closely linked with critical thinking skills, which are often what suffer most when AI is involved in the writing process. According to the Purdue Global Writing Center, the writing process is:

“not just a mirror image of the thinking process: it is the thinking process. Confronted with a topic, an effective critical thinker/writer: asks questions, seeks answers, evaluates evidence, questions assumptions, tests hypotheses, makes inferences, employs logic, draws conclusions, predicts readers’ responses, creates order, drafts content, seeks others’ responses, weighs feedback, criticizes their own work, revises content and structure, seeks clarity and coherence”

To help students go into the world with a toolbox of their own skills at their disposal, teachers, parents, and educators should encourage students to explore process writing and, in turn, show their educators what they're truly made of through Checkmark Plagiarism.

Written by The Checkmark Plagiarism Team.
