AIToday

Meta contractors posed as minors to test rival AI chatbots on harmful topics

WIRED AI · 6h ago · 5 min read

Key takeaway

Meta contractors posed as minors online and tested how ChatGPT, Gemini, and Character.AI responded to harmful prompts about suicide, self-harm, eating disorders, and other sensitive subjects—sending over 45,000 prompts without the companies' knowledge. While Meta defended the work as routine safety testing, experts and former contractors questioned whether the large-scale, secretive effort with fake child accounts blurred the line between legitimate safety evaluation and competitive intelligence gathering, and whether it violated the competitors' terms of service.

3 Key Points

  • What happened

    Hundreds of contractors working for Meta, managed by contractor Covalen, created fake under-18 accounts and sent over 45,000 prompts to OpenAI's ChatGPT, Google's Gemini, and Character.AI between August 2025 and April 2025. The prompts, designed to test how the chatbots handled requests about suicide, self-harm, eating disorders, sex, drugs, and other high-risk subjects, included images of pills, knives, and nooses. The companies being tested were not aware of the effort.

  • Why it matters

    The testing appears to violate the terms of service of all three competitors—OpenAI bars unsolicited safety testing and efforts to bypass safeguards, Google prohibits attempts to bypass safety filters outside its authorized programs, and Character.AI prohibits harmful and exploitative content. Meta framed the work as routine "comprehensive AI safety benchmarking," but safety experts and former contractors noted that the scale, secrecy, use of dummy child accounts, and blending of safety evaluation with competitor benchmarking raised concerns about whether it amounted to gathering competitive intelligence under a safety guise.

  • What to watch

    OpenAI said it is "looking into the issue," Character.AI confirmed the testing violated its terms and policies, and Google said it did not authorize the testing and lacks sufficient information to determine whether it violated its terms. The incident highlights ambiguity around what constitutes acceptable AI safety evaluation versus covert competitor testing.

FAQ

Which AI chatbots were targeted in this testing?
OpenAI's ChatGPT, Google's Gemini, and Character.AI were targeted. The effort was known internally as Cannes and was managed by Meta contractor Covalen.

What kinds of prompts were contractors asked to send?
A spreadsheet of 3,748 prompts reviewed by WIRED included hundreds focused on suicide and self-harm, hundreds on eating disorders, at least 239 involving sex or romance, and others on drugs, profanity, and racial slurs. Many prompts were written from the perspective of children or teenagers in crisis.

How did the companies respond?
Character.AI confirmed the testing violated its terms of service and policies. OpenAI said it is "looking into the issue." Google said it had not authorized the testing and did not know its purpose, and that it lacked sufficient information to determine whether the effort violated its terms of service.
