What Are the Best AI Detectors?
6 Popular AI Detectors Are Put to the Test!
Detecting content written by artificial intelligence (AI) is a hot topic, and if you’re anything like us, you might be keen to know which are the best AI detectors out there. After all, they could be handy for answering questions like…
- Is your natural writing style likely to result in false accusations of using AI to write on your behalf?
- Are your students using AI to cheat?
- If you use AI to write for you, will people know that you did?
We reviewed six leading AI detectors to see how they perform. We put them to the test against a dozen articles (eight human-generated articles and four AI-generated articles). After organising the results, clear patterns emerged—and you can use them to help you determine whether or not the content you’re reading was probably written by a human.
This knowledge is not just academic. It’s also crucial in practical scenarios, to ensure you don’t end up like this university professor who went viral for all the wrong reasons. (He incorrectly accused half his class of using AI, which put them at risk of failing his course.)
Safe use of AI in medicine
If you’d like to be in the know about how to use AI safely in medicine (in a way that won’t make you go viral for all the wrong reasons!), Medmastery has a free course for you: ChatGPT Essentials! Sign up for a trial account to get access to the entire ChatGPT Essentials course! You’ll also get access to selected webinars, plus, the first chapters of over 120 additional accredited courses and workshops!
Key things you need to know when using tools for detecting AI content
- Never put 100% of your trust in these tools. They aren’t perfect.
- As the large language models we use for writing get smarter, they may also get better at avoiding detection.
- Some people have a writing style that’s more likely to result in incorrect accusations that they used AI! So, use caution when interpreting results from these tools.
Last but not least, read the tool’s documentation to verify that the results actually mean what you think they mean. Often it’s intuitive, but not always.
Method
AI detectors reviewed
Here are the six popular AI content detectors that we tested:
- Sapling
- GPTZero
- Content at Scale
- Copyleaks
- Originality.ai
- Undetectable AI
Articles reviewed
Human-generated articles: We tested 8 pieces of content written by 7 human authors on a variety of topics. Six articles were written for healthcare professionals, one was a medical article for lay people, and one had nothing to do with medicine at all.
We were curious if the choice of topic or degree of technicality would make a difference to whether the detectors could correctly identify authorship. All articles were written before ChatGPT was released to the public in November 2022.
8 Human-generated articles tested
Article Number | Topic | Intended audience | Year of Publication |
---|---|---|---|
1 | Shoulder dislocations | Healthcare professionals | 2020 |
2 | Spinal infections | Healthcare professionals | 2021 |
3 | Swine flu pandemic | Healthcare professionals | 2019 |
4 | Diuretics | Healthcare professionals | 2015 |
5 | Choosing medicine as a career | Healthcare professionals | 2014 |
6 | The common cold | Lay people | 2016 |
7 | ECG handout | Healthcare professionals | 2017 |
8 | Travel planning | Lay people | 2012 |
4 AI-generated articles tested
Article Number | AI author | Topic | Comments |
---|---|---|---|
9 | ChatGPT | Common cold | AI rewrite of article #6 above. |
10 | ChatGPT | Spinal infections | An original ChatGPT creation. |
11 | Gemini | Spinal infections | An original Gemini creation. |
12 | Gemini | Spinal infections | We asked Gemini to rewrite its original spinal infection article (#11) in a way that would evade AI detectors. |
Results
Interpreting results from AI-content detection tools
Generally speaking, AI detectors analyse the text you provide and then indicate the probability that a human or an AI generated the text.
For example, if the detector says “50% AI”, that doesn’t mean an AI wrote half the text. What it actually means is that the tool thinks there’s a 50% chance an AI wrote the text and a 50% chance a human wrote the text. In other words, the tool isn’t very sure about who (or what) wrote it.
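To make that interpretation concrete, here’s a minimal Python sketch that turns a probability score into a cautious label. The function name `interpret_score` and the 20% and 80% cutoffs are our own illustrative assumptions, not values published by any detector.

```python
def interpret_score(ai_probability: float, low: float = 0.2, high: float = 0.8) -> str:
    """Turn a detector's AI probability (0.0 to 1.0) into a cautious label.

    The score is the tool's confidence about the whole text, not the share
    of the text that an AI wrote. The low/high cutoffs are assumptions.
    """
    if not 0.0 <= ai_probability <= 1.0:
        raise ValueError("ai_probability must be between 0.0 and 1.0")
    if ai_probability >= high:
        return "likely AI"
    if ai_probability <= low:
        return "likely human"
    return "uncertain"  # the tool isn't sure who (or what) wrote it


for score in (0.02, 0.50, 0.96):
    print(f"AI: {score:.0%} -> {interpret_score(score)}")
```

A score of 50% lands in the “uncertain” band, which is the point of the example above: the tool is effectively shrugging.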
Below are the results for the ‘percentage probability’ of each article being AI-generated, ranging from human (AI: 0%) to artificial intelligence (AI: 100%). Some tools report only a verdict, such as “human” or “AI”, rather than a percentage.
Human articles
Human Article | Sapling | GPTZero | Content at Scale | Copyleaks | Originality.ai | Undetectable AI |
---|---|---|---|---|---|---|
1 | AI: 57.1% | AI: 0% | human | human | AI: 0% | human |
2 | AI: 50.6% | AI: 2% | human | human | AI: 96% | human |
3 | AI: 3.6% | AI: 3% | human | human | AI: 6% | human |
4 | AI: 2.1% | AI: 1% | human | human | AI: 3% | AI |
5 | AI: 0% | AI: 2% | human | human | AI: 0% | AI |
6 | AI: 28.1% | AI: 1% | human | human | AI: 29% | AI |
7 | AI: 0% | AI: 1% | human | human | AI: 0% | human |
8 | AI: 3.9% | AI: 1% | human | human | AI: 0% | human |
Artificial Intelligence-generated articles
AI Article | Sapling | GPTZero | Content at Scale | Copyleaks | Originality.ai | Undetectable AI |
---|---|---|---|---|---|---|
9 | AI: 100% | AI: 89% | human | human | AI: 100% | AI |
10 | AI: 99.7% | AI: 83% | “hard to tell” | “AI content detected” | AI: 96% | AI |
11 | AI: 100% | AI: 100% | human | “AI content detected” | AI: 100% | AI |
12 | AI: 99.7% | AI: 81% | human | “AI content detected” | AI: 100% | human |
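If you’d like to tally the two tables yourself, here’s a rough Python sketch. The scores are transcribed from the tables above; the 50% cutoff for percentage scores, and the choice to count “hard to tell” as a miss, are our assumptions for illustration.

```python
DETECTORS = ["Sapling", "GPTZero", "Content at Scale", "Copyleaks",
             "Originality.ai", "Undetectable AI"]

# (true author, one cell per detector). Numbers are "AI: x%" scores;
# strings are the verdicts shown by tools that don't report a percentage.
ROWS = [
    ("human", [57.1, 0, "human", "human", 0, "human"]),                     # 1
    ("human", [50.6, 2, "human", "human", 96, "human"]),                    # 2
    ("human", [3.6, 3, "human", "human", 6, "human"]),                      # 3
    ("human", [2.1, 1, "human", "human", 3, "AI"]),                         # 4
    ("human", [0, 2, "human", "human", 0, "AI"]),                           # 5
    ("human", [28.1, 1, "human", "human", 29, "AI"]),                       # 6
    ("human", [0, 1, "human", "human", 0, "human"]),                        # 7
    ("human", [3.9, 1, "human", "human", 0, "human"]),                      # 8
    ("AI", [100, 89, "human", "human", 100, "AI"]),                         # 9
    ("AI", [99.7, 83, "hard to tell", "AI content detected", 96, "AI"]),    # 10
    ("AI", [100, 100, "human", "AI content detected", 100, "AI"]),          # 11
    ("AI", [99.7, 81, "human", "AI content detected", 100, "human"]),       # 12
]


def verdict(cell, cutoff=50.0):
    """Map one table cell to 'AI', 'human', or 'unsure'."""
    if isinstance(cell, (int, float)):
        return "AI" if cell >= cutoff else "human"  # assumed cutoff
    text = cell.lower()
    if text.startswith("ai"):
        return "AI"
    if text == "human":
        return "human"
    return "unsure"  # e.g. "hard to tell"; never matches the true author


def tally(rows=ROWS, cutoff=50.0):
    """Count how many articles each detector classified correctly."""
    correct = {name: 0 for name in DETECTORS}
    for truth, cells in rows:
        for name, cell in zip(DETECTORS, cells):
            if verdict(cell, cutoff) == truth:
                correct[name] += 1
    return correct


for name, hits in tally().items():
    print(f"{name}: {hits}/{len(ROWS)} correct")
```

Under this assumed cutoff, GPTZero gets 12 of 12 and Copyleaks 11 of 12. Tools that report percentages are sensitive to the cutoff you pick: Originality.ai also reaches 11 of 12 at 50%, but a stricter 25% cutoff would count its 29% score on article 6 as a miss.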
Conclusion
The best AI detectors
The most accurate AI content detector
Based on our testing, GPTZero was the most accurate for detecting AI content as it correctly identified the origin of all eight human-generated articles and all four AI-generated articles.
The runner up
Copyleaks was almost flawless. It correctly classified all eight pieces of human-generated content, and only one of the AI-generated articles fooled it: article 9, ChatGPT’s rewrite of a human-written article.
IMPORTANT: Our test was relatively small, so please don’t assume you can completely trust the results from this, or any, AI detector. It’s quite possible that even GPTZero would have made mistakes if we had fed it enough articles.
You can increase the likelihood of coming to an accurate conclusion about the origin of an article if you run it through multiple AI detectors… but even if multiple detectors predict it’s likely AI-generated, we’d merely classify that content as “highly suspicious” until we could get more evidence.
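Here’s one hedged way to sketch that multi-detector approach in Python. The function name `combined_assessment`, the simple majority rule, and the labels are our own assumptions for illustration, not a validated method.

```python
def combined_assessment(verdicts: dict[str, str]) -> str:
    """Summarise several detectors' verdicts ('AI' or 'human') for one text.

    A majority of 'AI' verdicts only raises suspicion; it is not proof.
    """
    if not verdicts:
        raise ValueError("need at least one detector verdict")
    ai_votes = sum(1 for v in verdicts.values() if v == "AI")
    if ai_votes > len(verdicts) / 2:
        return "highly suspicious"  # look for more evidence before concluding
    if ai_votes > 0:
        return "mixed signals"
    return "no detector flagged it"


# Article 12 from the results table, using an assumed 50% cutoff
# for the tools that report percentages.
article_12 = {
    "Sapling": "AI",
    "GPTZero": "AI",
    "Content at Scale": "human",
    "Copyleaks": "AI",
    "Originality.ai": "AI",
    "Undetectable AI": "human",
}
print(combined_assessment(article_12))
```

Four of the six detectors flag article 12, so the sketch labels it “highly suspicious” rather than declaring it AI-written, in keeping with the caution above.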
3 recommended uses of AI detectors
1. When reading content on an unfamiliar website, you can use tools for detecting AI content to help determine whether a human (preferably with experience in the subject matter!) likely wrote the content.
Our intention isn’t to say that AI-generated content is inherently bad. However, without human oversight it may contain errors, and you need to know whether you should be on “red alert” for them. For example, here’s a viral case where AI-generated tutorials contained instructions about software features that don’t even exist. That would be a mere annoyance for software users. But if we were to use a similar approach to generating medical content, results could obviously be disastrous and even life-threatening.
2. When evaluating someone else’s writing, you may find AI detectors useful in helping you figure out whether or not they had an AI do the writing for them. However, remember not to put your complete trust in any AI detector, because they sometimes make mistakes.
3. Finally, you may find it useful to put your own writing through an AI detector just to see how these tools classify it. After all, if other people might look, you might as well know what they’re going to find!
AI in HEALTHCARE
Want to become a pro at prompting, and consistently get usable results? Be sure to check out Medmastery’s AI prompting course. Learn techniques to apply to the plethora of AI resources in constant development.
BSc.Pharm (University of Manitoba), Pharmacist and Medical Writer