Academic dishonesty and cheating have always been a concern in the educational sector; it’s one reason that Turnitin has been a leader in the industry since 1998. However, the recent rise in artificial intelligence (AI) has drastically changed the playing field.
Now, rather than simple copy-and-paste acts of plagiarism, students looking for an easy way to cheat can just have ChatGPT write an entire essay for them in seconds.
While an eagle-eyed teacher can often tell when ChatGPT—or any other chatbot, for that matter—has produced a piece of writing, the sheer prevalence of ChatGPT, Claude, and other chatbots like Gemini and Copilot has created a need for reliable ways to detect when a piece of writing is AI-generated, especially since savvy cheaters will use clever prompt strategies or simple rewrites and rephrases to disguise their use of AI.
That’s why, in the last few years, tools like GPTZero, AI Detector, OriginalityAI, Turnitin, and others have risen up to try to provide a means of scanning text and telling the user when it’s likely to have been AI-generated. These tools are used not only in education but also in writing at large.
While Turnitin launched in 1998 as a means of detecting plagiarism, in April 2023 the company rolled out a tool tailored for AI content detection. With today's focus not only on AI but also on reliable ways to detect it, the natural question is just how accurate Turnitin is.
The short answer is that while Turnitin’s AI detector shows some high levels of accuracy, there are a few caveats that we’ll explore in a bit more detail later. But first, let’s explain Turnitin in a bit more detail.
What Is Turnitin?
For over twenty-five years, Turnitin has been one of the most popular plagiarism detection methods available for educators and students in schools and universities, and for good reason. The internet has made finding and sharing information easier, making it easy for students to find information for their assignments—and then copy and paste those words as their own. This rise in plagiarism only increased the need for reliable detection tools like Turnitin.
Originally, back in 1998, Turnitin was a simple method for someone to submit an essay and have Turnitin scan it against its massive database of previously submitted papers, journal articles, and online sources.
From there, Turnitin would indicate how much of the text matched existing content and display a corresponding percentage. The higher the number, the higher the likelihood of plagiarism. Compared to the AI-powered tools of today, Turnitin wasn't perfect, and in many ways it was quite simple. But it was good enough to catch blatant cases of plagiarism and, over time, it built a reputation as a solid, reliable deterrent against cheating.
Image credit: Turnitin
As time went on, students grew better at circumventing Turnitin's detection methods. To combat this, the company kept refining its algorithms and expanding its database to do more than just flag copied text. The platform evolved to better spot paraphrased content and subtle rewording tricks. This kept Turnitin relevant as the internet became a virtually infinite source of information for students to pull from.
Then came the era of AI-generated content, which turned everything on its head. Tools like ChatGPT, Claude, and other generative AI models made it possible to create an essay or assignment almost instantly from a simple prompt. This new landscape rendered Turnitin's traditional methods—built around identifying copied or closely paraphrased text—largely ineffective.
That’s because AI doesn’t "copy" text; it generates it from scratch. Even though a model like ChatGPT draws on patterns learned from existing sources, its output is technically original, which forced Turnitin to develop new detection strategies. So, the company expanded its capabilities to include AI detection, launching tools that aim to spot the telltale signs of machine-generated content.
Thanks to this constant advancement and the ability to pivot when AI came, Turnitin remains highly used in academic institutions worldwide. A wide range of universities, colleges, and even many high schools use it as part of their grading and integrity processes.
While professors find Turnitin valuable and students can use it preventatively, its success hasn’t been without criticism. There are always debates about whether it’s too harsh or if it invades student privacy by storing their submissions—but most agree that it’s been a net positive in discouraging plagiarism.
As of 2024, Turnitin’s expansion into AI detection reflects the company’s commitment to staying ahead of the curve. Its reputation is built on its ability to adapt, and with AI writing becoming commonplace in education, Turnitin knows it has to keep innovating. The challenge now is proving that its AI detection tools are just as reliable as its original plagiarism checks—a tall order in a world where AI keeps getting better at mimicking human writing.
How Does Turnitin’s AI Detector Work?
The job of an AI detector like Turnitin’s is to determine how much of a given piece of content was created by ChatGPT or another chatbot, whether for academic purposes or general writing.
So, what are the mechanisms at play in AI detectors generally? As you might imagine, these detectors, including Turnitin’s, are built with much of the same technology as ChatGPT. Under the hood, Turnitin’s detector relies on large language models (LLMs), which are themselves built on neural networks.
Image credit: AIMultiple Research
We could (and did) write an entire article about how AI detectors function in general, but the important point is that the neural networks underlying these LLMs are trained on various kinds of content.
While AI chatbots are trained on vast amounts of human writing to produce writing of their own, detectors are trained on AI writing as well as human writing to be able to recognize when machine-generated text is present.
Perplexity and burstiness
A key method Turnitin uses to detect AI writing is a concept called perplexity. Essentially, perplexity measures how predictable a piece of text is. Human writing, especially from talented writers, is more varied in its word and phrase choices and doesn’t follow patterns the way AI text does.
In contrast, AI-written content is often far more predictable because language models generate text by repeatedly choosing statistically likely next words. High perplexity means the text feels more unpredictable and thus more likely human; low perplexity suggests AI involvement.
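To make the idea concrete, here is a minimal sketch of a perplexity calculation in Python. It uses a toy unigram word-frequency model rather than the large neural language models real detectors rely on; the function and corpus here are purely illustrative, not part of Turnitin's actual system:

```python
import math
from collections import Counter

def perplexity(text: str, reference_corpus: str) -> float:
    """Score how 'surprising' the words in `text` are under a simple
    unigram model built from `reference_corpus`. Higher = less predictable."""
    counts = Counter(reference_corpus.lower().split())
    total = sum(counts.values())
    vocab_size = len(counts)
    words = text.lower().split()
    log_prob_sum = 0.0
    for word in words:
        # Add-one (Laplace) smoothing gives unseen words a small probability.
        prob = (counts[word] + 1) / (total + vocab_size)
        log_prob_sum += math.log(prob)
    # Perplexity is the exponential of the average negative log-probability.
    return math.exp(-log_prob_sum / len(words))

corpus = "the cat sat on the mat the dog sat on the rug"
print(perplexity("the cat sat on the mat", corpus))            # low: predictable
print(perplexity("quantum marmalade defies gravity", corpus))  # high: surprising
```

In a real detector, the probabilities would come from a large language model scoring each token in context, but the principle is the same: text the model finds highly predictable earns a low perplexity score.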
Turnitin’s AI detector also examines a piece of writing’s “burstiness,” which refers to the variance in sentence length and construction. Humans typically write with a mix of long, sometimes meandering, sentences and short ones. Too many long sentences, and the writing will be exhausting to read. Too many short ones, and it’ll be monotonous. Unsurprisingly, AI sentences end up being a lot more uniform, with a lot of sentences the same length and structure. Understandably, this will trigger any AI detector, Turnitin or otherwise.
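Here is a minimal sketch of a burstiness measurement, taken as the standard deviation of sentence lengths. The sentence splitter and example texts are simplistic illustrations, not Turnitin's actual method:

```python
import re
import statistics

def burstiness(text: str) -> float:
    """Standard deviation of sentence lengths in words.
    Varied (human-like) writing scores high; uniform writing scores near zero."""
    # Naive split on terminal punctuation; real systems use proper tokenizers.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)

human_like = ("It rained. The storm, relentless and cold, hammered the "
              "coast for three days straight. We waited.")
uniform = ("The rain fell on the coast. The storm hit the coast hard. "
           "The town braced for the wind.")
print(burstiness(human_like))  # higher: sentence lengths vary widely
print(burstiness(uniform))     # near zero: every sentence is the same length
```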
So, Turnitin compares any scanned document against this training on perplexity and burstiness. The lower the document's perplexity and burstiness, the higher the likelihood that it was generated, in part or entirely, by an AI chatbot.
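Putting the two signals together, a detector's final call can be sketched as a simple threshold rule. The cutoffs below are arbitrary placeholders; a real system like Turnitin's learns its decision boundary from large labeled training sets of human- and AI-written text rather than hand-picked thresholds:

```python
def looks_ai_generated(perplexity_score: float, burstiness_score: float,
                       perplexity_cutoff: float = 50.0,
                       burstiness_cutoff: float = 3.0) -> bool:
    """Toy decision rule: flag text that is BOTH highly predictable
    (low perplexity) and highly uniform (low burstiness)."""
    return (perplexity_score < perplexity_cutoff
            and burstiness_score < burstiness_cutoff)

print(looks_ai_generated(12.0, 1.5))   # True: predictable and uniform
print(looks_ai_generated(120.0, 6.0))  # False: surprising and varied
```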
How Accurate Is Turnitin’s AI Detector?
A study by Temple University analyzed a selection of 120 writing samples, with 30 samples each of text written by people, text written by AI, “disguised” text (in which a human tried to lightly edit AI-written text), and hybrid text, with content from both humans and AI.
The researchers learned that Turnitin’s AI detector excelled most at identifying when a person wrote a piece of text. While it did pretty well at detecting AI text when present, it wasn’t perfect: “The summative score is also somewhat effective in identifying if AI has been used to produce a text, in whole or in part. Of the 90 samples in which AI was used, it correctly identified 77 of them as having >1% AI generated text, an 86% success rate.”
So, if you’re a student who isn’t using AI at all in any of your assignments, you most likely have nothing to worry about with Turnitin’s AI detector; however, we’ll explore false positives a bit more shortly.
Likewise, if you’re trying to submit assignments that are 100% written by AI, or if you’re doing some minor editing to attempt to hide that you used AI, then Turnitin will most likely catch you, although the study shows that maybe a few instances of AI text might slip through the cracks—but it’s probably not worth the risk.
Where hybrid text is concerned, with both human-written and AI-written sections, the situation gets a little murky. The study established that for these texts, a detection score anywhere between 1% and 99% would be the correct result: Turnitin shouldn't conclude that hybrid content is either entirely AI-written or entirely human.
Of the 30 texts, Turnitin correctly identified 13 “as being neither fully human-written nor fully-AI-generated, or 43%. Of the remaining texts, 6 were identified as 100% AI and 7 were identified as 100% human-written.”
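As a quick sanity check, the study's reported percentages follow directly from its raw counts:

```python
# Counts reported in the Temple University study discussed above.
ai_samples_flagged = 77  # of 90 samples that used AI in some form
hybrid_correct = 13      # of 30 hybrid texts identified as mixed

print(round(ai_samples_flagged / 90 * 100))  # 86 (% of AI-use samples caught)
print(round(hybrid_correct / 30 * 100))      # 43 (% of hybrid texts caught)
```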
The study concluded that while Turnitin did a great job of detecting when AI wrote an entire text, any attempt at trying to mask AI writing with editing or creating a hybrid text would be more likely to bypass the detector.
However, some of this is by design. As a tool intended for educators and students, Turnitin prefers to give students the benefit of the doubt. That’s why it’s so important for Turnitin, or any other AI detector for that matter, not to be the sole deciding factor in a student’s guilt. It’s important for teachers and professors to provide the necessary checks and balances.
After all, it might be better to let the occasional cheater slip through than to hit honest students with false positives, which is the big concern with education-focused AI detectors. Let's explore false positives—and negatives—in a bit more detail.
False positives
As for false positives—one of the biggest concerns with AI detection—Turnitin has tried its best to minimize them, but no system is 100% foolproof. In particular, non-native English speakers are more likely to experience false positives with any AI detector, because writers still learning the craft tend toward simpler sentence structure and word choice, which lowers both perplexity and burstiness.
Image Credit: Cell
While non-native speakers are more likely to fall victim to false positives, such instances are still relatively rare, and it's an issue Turnitin is likely well aware of and working to avoid, especially since less experienced writers often share the same telltale features as both AI output and non-native writing: simple word choice and a lack of randomness and variance.
Outside of that, false positives should be exceedingly rare and not something that most people have to worry about, especially if professors are keeping an eye on it. So, it’s important to establish that false positives could happen, but they aren’t a huge concern.
False negatives
The counterpart to false positives, false negatives occur when a detector fails to identify AI-generated text that is present. In Turnitin's case, this is more of a feature than a bug: its AI detector seems geared toward catching blatant attempts at cheating rather than pursuing more nuanced cases, where innocent writers could easily end up in its crosshairs.
Still, though, with a bit of light editing, Turnitin’s AI detector can most likely be bypassed—maybe not completely, but enough to cast some doubt as to whether a piece of writing has been written by AI.
The Ethical Considerations of AI Detection in the Education Space
Academic dishonesty and cheating have always been a concern in schools and universities. It’s one reason Turnitin was founded over 25 years ago. No matter what precautions you take, cheaters will always try to cheat.
However, where education is concerned, the drawback is when honest students get flagged with false positives. Turnitin is designed to minimize these kinds of false positives, which is ultimately beneficial.
After all, it’s better to allow a few cheaters to slip through the cracks—since they’re only really hurting themselves—than to allow false positives to befall trustworthy students.
That's why it's also so important for an impartial educator to have the final word in any AI detection. Turnitin can point them in the right direction, but a human must make the final call on any assignment.
Final Words
There’s a good reason that Turnitin is the go-to detector for so many educational institutions. It will easily catch the most blatant attempts at using AI for cheating, plus the company’s plagiarism checker has only grown faster and more efficient since 1998.
It's also tuned so that students who show academic integrity rarely fall victim to false positives, even if that means an occasional cheater slips through the cracks. For that purpose, Turnitin has impressive levels of accuracy.
So, for students and educators, Turnitin is one of the best options for AI detection out there, if not the best. For more general-purpose needs, however, such as publishers or writers of online content, tools like AI Detector might be a better fit. These detectors still offer impressive levels of accuracy without the specific focus on education, although they work well for that purpose, too.
Either way, AI detectors are an excellent way to determine the level of AI influence in any given piece of text.