The Atlantic launches AI Watchdog, an investigation into how tech companies use copyrighted content to train generative AI systems without permission.

Hacker News · 6h ago · 3 min read

3 Key Points

1. What happened: The Atlantic announced AI Watchdog on September 10, 2025, described as an ongoing investigation to reveal the inner workings of generative AI. The project has already published findings on multiple fronts: at least 15 million YouTube videos have been used by tech companies to train AI, Common Crawl has been funneling paywalled articles to AI developers, and large language models appear to memorize and reproduce training data rather than truly learn from it.
2. Why it matters: Tech companies building AI systems often rely on copyrighted material (music, articles, videos, and books) without compensating creators or obtaining permission. The investigation surfaces a core tension in the AI industry: companies aggressively protect their own intellectual property while freely using others' content to train their models. This raises legal and ethical questions about whether current copyright law can address AI-driven data use at scale.
3. What to watch: The investigation is ongoing, with articles examining YouTube's quiet use of AI to alter uploaded content, judges' uncertainty about AI book piracy, and the gap between what tech companies claim about their training practices and what actually happens behind the scenes. The Atlantic is systematically documenting the datasets and methods used to build major AI tools.
