A false positive is when a detector flags content as “AI” when it was actually written by a human. This occurs when human-written content reflects the predictable word choices LLMs use when generating text.
False Positives Are Damaging to Academics
As of 2026, there is no universal method for detecting AI-written content in students’ work. Most universities rely on a widely used tool known as Turnitin to check coursework for AI use and plagiarism.
Turnitin is not available to the general public; it is offered only as licensed software to universities, at roughly $2-$6 per student.[2] UC Berkeley, for example, signed a 10-year contract with Turnitin valued at $1.2 million.[3]
The main issue is that professors rely too heavily on Turnitin, which is far from 100% accurate at identifying AI writing.
Why Is Human Writing Flagged as AI?
- Using a Grammar Checker. Tools like Grammarly and QuillBot often do much more than check spelling: they frequently suggest rephrasing sentences to sound clearer, but also more like AI-generated content.
- Non-Native Speakers. Simple word choices used throughout an essay without much variation will usually be categorized as “AI.” Non-native speakers are common victims of this because they do not know the language as extensively as native speakers. A French-born Yale student[4] was accused of improper AI use in 2025 under similar circumstances.
- Low Burstiness. Humans tend to write with varying sentence lengths, otherwise known as “bursty” writing, whereas LLMs write sentences of similar length and rhythm. When a human happens to write with that uniform structure, the work is flagged as “AI.”
- Lots of Transitional Language. Connecting sentences with words like “Nevertheless,” “Furthermore,” and “In addition.” Content with a high density of connector words tends to be flagged by detectors.
- Structured Formats. Writing formatted as a how-to guide, listicle, executive summary, or comparison article is often flagged, mainly because it consists of many short facts or statements that lack a natural tone.
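The burstiness signal above can be made concrete with a toy metric: the standard deviation of sentence lengths, measured in words. This is an illustrative sketch, not any real detector’s algorithm; the sentence splitter and the sample texts are assumptions for demonstration only.

```python
import re
import statistics

def burstiness(text: str) -> float:
    """Toy burstiness score: std-dev of sentence lengths (in words).

    Low values mean uniform sentence lengths, which some detectors
    associate with AI-generated text. Illustrative only.
    """
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)

uniform = "The sky is blue. The sun is hot. The day is long. The air is cool."
varied = ("It rained. Later, after the clouds finally broke apart over the "
          "hills, we walked for hours. Quiet.")

print(burstiness(uniform) < burstiness(varied))  # uniform text scores lower
```

A human essay with short and long sentences mixed together scores high; four sentences of exactly four words each scores zero, which is the pattern detectors tend to flag.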
Students “Questionably” Punished for AI
“Madeleine” (nursing student). Madeleine was in her final year of nursing school when Turnitin flagged 84% of her essay as “AI-written.” She then had to wait six months for the investigation to conclude before the charges against her were eventually dropped.[5]
Student at Nanyang Technological University (NTU). Despite offering proof via Google Docs revisions (a time-lapse of the writing process), the student was accused of AI use. She shared that an alphabetical citation sorter had triggered an AI detector, even though the 20 citations provided were accurate.[6]
High School Student Accused of AI Writing. At Eleanor Roosevelt High School in the District of Columbia, Ailsa Ostovitz had her music assignment flagged as AI and her grade docked. The teacher claimed it had been flagged by an AI detection tool (which they did not disclose) and refused to retract the punishment.[7]
How to Reduce False Positives
1. Google Docs Revision History
For academic writing, reviewing a document’s “version history” is becoming the most common way to detect AI writing in student work (see also the Revision History Google extension).[8]
Teachers will request that students do ALL of their writing (drafts, revisions, and final edits) in the same Google Doc so they can review the timestamps and verify the work that went into the assignment.
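The timestamp review described above can be sketched in code. The revision-log format and the words-per-minute threshold below are hypothetical, purely to illustrate how a large sudden paste stands out from steady typing in a version history.

```python
from datetime import datetime, timedelta

def looks_like_paste(revisions: list[tuple[datetime, int]],
                     max_wpm: int = 40) -> bool:
    """Return True if any revision added words faster than a human types.

    `revisions` is a hypothetical (timestamp, cumulative word count) log,
    like one reconstructed from Google Docs version history. A jump far
    above a plausible words-per-minute rate suggests a paste.
    """
    for (t0, w0), (t1, w1) in zip(revisions, revisions[1:]):
        minutes = (t1 - t0).total_seconds() / 60
        if minutes > 0 and (w1 - w0) / minutes > max_wpm:
            return True
    return False

start = datetime(2026, 1, 10, 9, 0)
# Steady writer: 300 words every 10 minutes (30 wpm).
steady = [(start + timedelta(minutes=10 * i), 300 * i) for i in range(5)]
# Suspicious log: 1,500 words appear within 2 minutes.
pasted = [(start, 0), (start + timedelta(minutes=2), 1500)]

print(looks_like_paste(steady), looks_like_paste(pasted))  # False True
```

A real review is more nuanced (students draft offline, paste quotes, and so on), which is why teachers treat a suspicious log as a prompt for a conversation rather than as proof.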
Human Writing (sample)
AI-Generated Writing (sample)
2. Using Multiple AI Detectors
A study by the American Physiological Society found that using multiple AI detectors to review students’ work reduced false positives to nearly 0%. The detectors used in the study were Copyleaks, DetectGPT, GPTZero, and Originality.ai.
Their testing covered 348 essays in total (174 AI-generated and 174 human-written), with the goal of determining the accuracy of using multiple detectors.
Most Accurate Detector
- Originality.ai was the most accurate (97.98%)
- DetectGPT was the least accurate (69.70%)
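One simple way to combine multiple detectors, in the spirit of the study above, is majority voting: flag an essay only when most detectors agree. The detector names below echo those in the study, but the verdicts and the vote threshold are made up for illustration; they are not the study’s actual method.

```python
def majority_flag(detector_verdicts: dict[str, bool], min_votes: int = 3) -> bool:
    """Flag an essay as AI only when at least `min_votes` detectors agree.

    Requiring agreement across detectors trades some recall for far
    fewer false positives. Hypothetical sketch, not the study's method.
    """
    votes = sum(detector_verdicts.values())
    return votes >= min_votes

# Hypothetical verdicts for one human-written essay: only one detector flags it.
verdicts = {
    "Copyleaks": False,
    "DetectGPT": True,   # the least accurate detector in the study
    "GPTZero": False,
    "Originality.ai": False,
}

print(majority_flag(verdicts))  # False: one dissenting detector is not enough
```

With a single detector, this essay would have been a false positive; requiring three of four votes filters out the lone spurious flag.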
3. Ask for a Verbal Conversation (about the subject)
If a teacher suspects a student of submitting AI-generated content, another approach is to ask the student, on the spot, to discuss their writing in detail.
Often, all it takes is asking “What were your sources for these facts?” or “What was the basis for your thinking in this paragraph?”; the student’s response will show whether they truly understand the material.
4. Requiring Citations / Sources
Even though LLMs like ChatGPT and Claude have improved immensely in recent years, they still cite outdated material or link to 404 pages.
Requiring students to cite the source of every fact and idea can significantly reduce the chance of AI being used in the writing process.
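Spot-checking cited URLs can be partially automated. The sketch below uses only the Python standard library; a dead link or a 404 is a reason to look closer, not proof of AI use, since legitimate sources also go offline.

```python
import urllib.request
import urllib.error

def check_citation_url(url: str, timeout: float = 10.0) -> bool:
    """Return True if the cited URL resolves (HTTP status < 400).

    Sketch for spot-checking student- or LLM-provided citations.
    Malformed URLs and network/HTTP errors all count as failures.
    """
    try:
        req = urllib.request.Request(
            url, method="HEAD", headers={"User-Agent": "citation-check"}
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except (urllib.error.URLError, ValueError):
        return False

print(check_citation_url("not-a-valid-url"))  # False: malformed citation
```

In practice a reviewer would run this over the bibliography and manually inspect only the citations that fail, keeping the human in the loop.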
5. Redesigning Assignments (based on experiences)
AI is great at writing about people, places, and things, but it is not as good at writing about a student’s personal experiences, such as a guest speaker or a specific lesson taught in the classroom.
Redesigning courses and assignments for AI has been stressed in school curricula for 2026.[9] Nearly all students have now had some experience using generative AI tools.
Sources
- CDT.org: Hand in Hand, Schools’ Embrace of AI Connected to Increased Risks to Students (PDF)
- Cal Matters (CA colleges pay millions every year for faulty plagiarism detection)
- TheMarkup.org: California colleges spend millions to catch plagiarism and AI. Is the faulty tech worth it?
- GovTech.com: Yale Student Suing Over Accusation of Improper AI Use
- Futurism.com: University Using AI to Falsely Accuse Students of Cheating With AI
- Reddit.com: How an AI accusation by NTU ruined my degree – and how no one in the school helped
- NPR.org: Teachers are using software to see if students used AI. What happens when it’s wrong?
- Reddit.com: Student sent me “real-time” video of word doc in response to fail for AI use – is this a thing?
- AACU.org: 2025–26 Institute on AI, Pedagogy, and the Curriculum