Large Language Models

Authors Guild test shows AI detectors vary wildly: some perfectly identify human writing, others flag all texts as AI-generated.

THE DECODER · 16h ago · 5 min read

Key takeaway

The Authors Guild tested several AI detectors on ten published articles and found stark differences in reliability. Pangram and Grammarly correctly identified all texts as human-written, while Sidekicker flagged every article as mostly AI-generated. The Authors Guild warns that these tools should never be the sole basis for deciding whether a text is human or AI: professionally written language and AI output share similar statistical patterns, which makes it hard to distinguish skilled human writers from machines trained on their work.

3 Key Points

What happened
In an Authors Guild test using ten articles published between 2020 and 2022, Pangram and Grammarly correctly identified every human-written text as human. Originality.ai also performed well. By contrast, Sidekicker flagged every single article as mostly AI-generated, with two scoring 100 percent, and ZeroGPT reported AI percentages, some of them high, for all the human-written texts.
Why it matters
False positives from unreliable detectors can cost authors their contracts and reputations. The Authors Guild warns that even the best-performing tools should never be the sole basis for any decision, and that publishers should disclose their methods and give authors a chance to defend themselves. The core problem: professional writers and language models produce statistically similar text because models were trained on human-written material, which makes detection nearly impossible even for the best-performing tools.
What to watch
The test results speak mainly to how well the tools recognize human writing (Pangram and Originality.ai are tuned to minimize false positives) and do not necessarily show how well they catch AI-generated texts. The Authors Guild notes that detection tools change constantly and their accuracy cannot be taken for granted.

FAQ

Which AI detectors performed best in the Authors Guild test?
Pangram and Grammarly correctly identified every human-written text as human. Originality.ai also performed well.
Which detector performed worst?
Sidekicker delivered the worst results, flagging every single article as mostly AI-generated, with two scoring 100 percent. ZeroGPT was also unreliable, reporting AI percentages, some of them high, for all the human-written texts.
Why do AI detectors struggle to identify human writing?
Language models were trained on professional human writing, so professionally written texts share many of the same statistical patterns as AI output. Detection tools cannot distinguish between a human writer who has mastered the craft and a machine that has learned to imitate it, because at the level these tools operate, there may be little difference to find.
