At the end of 2022, OpenAI’s release of ChatGPT became a topic of conversation among academic, marketing, advertising, and copywriting professionals worldwide. AI writing software like ChatGPT can write almost anything a user requests, such as blogs, articles, children’s books, and academic papers. While people continue to debate the ethical use of this relatively new technology, others are working hard to ensure the public can tell whether a text was written by a machine or a human.
The collective effort to create AI-based software to detect AI-written text comprises students, researchers, and even the creators of the controversial AI-writing software. Many apps, programs, and paid services are currently available, promising to identify AI-generated text to a certain degree. But how dependable and accurate are they?
This article discusses how AI-detection software works, how you can detect AI-written text yourself, and how accurate and reliable these methods are.
How Does AI-Detection Software Work?
AI-detection software works much like AI-writing software in that it learns from large amounts of example text, except it learns the differences between human-written and AI-written text. Most companies develop their software by training it on AI-generated text from existing AI-writing software, such as Jasper, ChatGPT, GPT-2, and more. Ideally, the AI-detection software will be able to accurately and consistently determine whether a person or a machine wrote a text.
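To make the idea of learning the difference between two kinds of text concrete, here is a minimal sketch in Python. It is not how any commercial detector is built: real products rely on much larger models and datasets. The sketch swaps in a tiny naive Bayes word-count classifier, and the class name, the labels, and the two training sentences are all invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


class ToyDetector:
    """Tiny naive Bayes classifier over word counts (illustrative only)."""

    def __init__(self):
        self.counts = {"human": Counter(), "ai": Counter()}
        self.docs = {"human": 0, "ai": 0}

    def train(self, text, label):
        """Add one labeled example; label must be 'human' or 'ai'."""
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def prob_ai(self, text):
        """Return the estimated probability that the text is AI-written."""
        vocab = set(self.counts["human"]) | set(self.counts["ai"])
        total_docs = sum(self.docs.values())
        log_scores = {}
        for label in ("human", "ai"):
            total_words = sum(self.counts[label].values())
            # Prior: how common this label was in the training data.
            score = math.log(self.docs[label] / total_docs)
            for word in tokenize(text):
                # Add-one smoothing so unseen words do not zero the score.
                score += math.log(
                    (self.counts[label][word] + 1) / (total_words + len(vocab))
                )
            log_scores[label] = score
        # Turn the two log scores into a probability between 0 and 1.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["ai"] / (weights["human"] + weights["ai"])


# Invented training examples; a real detector would need many thousands.
detector = ToyDetector()
detector.train("honestly i kinda loved this weird little cafe", "human")
detector.train("in conclusion it is important to note that blogs are important", "ai")

print(detector.prob_ai("it is important to note"))
```

With only two training sentences, the score reflects nothing more than word overlap, which hints at why detectors trained on limited or outdated samples can be inconsistent.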
Why Is It Necessary to Detect AI-Written Text?
The main reason to detect AI-written text is to prevent plagiarism. That is particularly important in an academic context because one of the objectives of educational institutions is to teach students how to write effectively. AI writing software poses a considerable challenge to educators in enforcing academic integrity in their classrooms because these types of software can produce undergraduate-level text.
The ethical line is clear in this context because many academic institutions’ plagiarism policies state that any writing a student did not produce themselves or did not provide relevant references for is plagiarism. Therefore, students who use AI technology to write an assignment or paper for them violate plagiarism policies because they did not write it themselves. However, proving that a student has used AI to complete an assignment is still challenging unless they admit their wrongdoing.
The line becomes a bit blurred in the professional world because of search engines’ policies. For example, Google has stated that AI-generated content or any content written solely to manipulate search engine results page (SERP) rankings violates their spam policies and will be penalized. However, in the same statement, they say they are not focused on how content is produced (AI or otherwise). Their focus is to “reward original, high-quality content that demonstrates qualities of what we call E-E-A-T: expertise, experience, authoritativeness, and trustworthiness.” This means that content written by a person or AI can perform and rank well if it meets their E-E-A-T criteria.
However, people online are already reporting that AI-written text is either plagiarizing their work or creating issues for their publications. For example, writer Alex Kantrowitz from Big Technology went public after one of his articles was plagiarized by The Revolutionist, a recent business endeavor built on the back of AI writing software. In another instance, Neil Clarke, creator of Clarkesworld, determined that 500 of 1,200 submissions for his magazine in February 2023 were AI-generated. This could be attributed to the rise of ChatGPT, which was released just a few months before these cases.
3 Examples of AI-Detection Software
1. OpenAI’s Classifier
OpenAI, the creators of ChatGPT, also created a classifier to help people detect whether a person or a machine wrote a text. They state that they made it “available as a free tool to spark discussions on AI literacy.” It provides a probability score for how likely it is that a text was written by AI, ranking the text as very unlikely, unlikely, unclear if it is, possibly, or likely AI-generated. They do not claim this product is foolproof; they present it as a way for people to potentially identify AI-written text.
To use the classifier, the text must be at least 1,000 characters. Just paste the text in the box, click “submit,” and it will rank the text accordingly.
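The step from a probability score to one of the five labels can be sketched as a simple lookup. The five labels and the 1,000-character minimum come from the description above; the cutoff values, the function name, and the constant names are assumptions made for illustration and are not taken from this article.

```python
MIN_CHARS = 1000  # minimum text length mentioned above

# (upper bound, label) pairs. The cutoffs are illustrative assumptions.
BANDS = [
    (0.10, "very unlikely AI-generated"),
    (0.45, "unlikely AI-generated"),
    (0.90, "unclear if it is AI-generated"),
    (0.98, "possibly AI-generated"),
    (1.00, "likely AI-generated"),
]


def label_for(text, prob_ai):
    """Map a probability that the text is AI-written to one of five labels."""
    if len(text) < MIN_CHARS:
        raise ValueError(f"text must be at least {MIN_CHARS} characters")
    if not 0.0 <= prob_ai <= 1.0:
        raise ValueError("prob_ai must be between 0 and 1")
    for upper, label in BANDS:
        if prob_ai <= upper:
            return label
    return BANDS[-1][1]  # unreachable for valid input; kept as a safeguard


sample = "x" * 1000  # placeholder text that meets the length requirement
print(label_for(sample, 0.99))
```

Because the label depends entirely on where the cutoffs sit, two tools can assign the same text different labels even when their underlying scores are similar.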
In the above image, text created by ChatGPT on how to write blogs for a business was put into the classifier, which ranked it as “likely AI-generated.” Correct!
In this image, you can see a text copied from Proofed’s blog on how to write blogs for a business in OpenAI’s classifier. It was rated “very unlikely AI-generated.” Also correct!
Although the classifier was correct for both samples, reports from other online users highlight its inconsistency and unreliability, especially when dealing with text that is written partly by a person and partly by AI.
2. CopyLeaks’ AI Content Detector
CopyLeaks’ AI Content Detector boasts about its ability to use AI to detect AI, which the company describes as “fighting fire with fire.” It develops its models using AI-generated content from ChatGPT, GPT-3, GPT-2, Jasper, and others. It aims to continually update and enhance its models as newer AI-writing software emerges. The company also claims that its product is more than 99% accurate in determining whether a text is or isn’t AI-generated. But let’s see for ourselves how it performs.
The same samples used to test OpenAI’s classifier were used to test CopyLeaks’ AI Content Detector. In the above image, its AI Content Detector accurately identified the text as AI-generated, with more than 99% probability.
Here, its AI Content Detector accurately identified the human-written sample, but only with a 73% probability. Other reviewers online have reported that CopyLeaks’ AI detector proves less accurate with more testing and more varied text.
3. Content at Scale’s AI Detector
Content at Scale is probably the most problematic AI writing service on the market. Like OpenAI, it offers AI writing services (Content at Scale starts at $500 a month for 20 articles) and free AI detection. The main reason it is troublesome is that if you use its AI-writing services, it promises that the produced text can pass AI-detection software (like that used by Google), so your content will still rank high on SERPs. And according to online reviews, it’s keeping its promise.
Content at Scale claims that its AI-detection software is the best available option and offers plagiarism detection. However, our results suggest otherwise.
The above image shows the same AI-generated text used to test the other two AI detectors. Content at Scale performed the worst, rating the sample as “likely both AI and human” when it was entirely AI-written. On the right side, you can see how it highlights each sentence (red, orange, yellow, or no highlight) to show where it thinks the text is AI-generated.
When given the human-written sample, Content at Scale did a better job determining that it was written by a human, giving it an 86% probability. Other online reviews of Content at Scale further highlight how it is not the best on the market and that its product is inconsistent in recognizing whether AI or a human wrote a text.
How to Manually Evaluate AI-Generated Text
You can look for several things if you want to rely on your expertise to determine the legitimacy of a piece of writing. However, most of these features are also features of poor writing in general, so use your judgment.
Things to look for:
- Repetition in word usage, phrasing, or ideas.
- Awkward or unnatural sentences or wording.
- Language lacking complexity or emotion.
- Conflicting information and discrepancies in the text.
- Illogical paragraphing or paragraphs lacking unique transitions.
- Overall structure may be unnatural or lack style.
- Grammar errors, missing words, poor punctuation, and random capitalization.
- Out-of-date information or lack of current information.
- Predictable writing, lacking originality—nothing interesting or surprising about the text, content, structure, or style.
- Lack of sourcing or hyperlinks, or hyperlinks and sources that are not logical or credible.
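A couple of the signals in this list, such as repeated phrasing and limited vocabulary, can be roughly quantified. The Python sketch below is only an aid to human judgment: the function names are invented, no thresholds are implied, and a high repetition score flags poor human writing just as readily as AI-generated text.

```python
import re
from collections import Counter


def words(text):
    """Return the lowercase words in the text."""
    return re.findall(r"[a-z']+", text.lower())


def repeated_phrase_share(text, n=3):
    """Share of n-word phrases that occur more than once in the text."""
    toks = words(text)
    phrases = [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    if not phrases:
        return 0.0
    counts = Counter(phrases)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(phrases)


def vocabulary_variety(text):
    """Distinct words divided by total words (lower means more repetition)."""
    toks = words(text)
    return len(set(toks)) / len(toks) if toks else 0.0


sample = "the cat sat the cat sat"
print(repeated_phrase_share(sample), vocabulary_variety(sample))
```

Scores like these are only meaningful when compared across texts of similar length and topic, so treat them as a prompt for a closer read and not as a verdict.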
How Reliable and Accurate Are These Methods?
Overall, the methods for detecting AI writing discussed in this article (and others) are not consistently accurate or reliable. While they can hit the mark with one text, they can miss it on the next. Identifying a text on your own is similarly hit-or-miss and will largely depend on the person in question. For example, a professor who suddenly receives a concise, articulate, and seemingly well-written paper from a student who has turned in subpar essays all semester could interpret that as a red flag and possible use of AI writing software.
Check out this article to learn more about proofreading AI-generated content.
At the end of the day, AI detectors are trying to achieve human intuition by recognizing patterns. Is the text exciting, suggesting an original idea, or presented uniquely? If yes, you can feel confident it’s not AI-generated. Conversely, if you read something predictable, unimaginative, and, well, boring, it could be written by AI, or it could be poor writing. What’s the difference? We’ll let you decide.
Interested in discovering more about how AI technologies are changing the way we write and produce content? Read our article “A Writers Guide to Using AI: All You Need to Know” to learn more.