AIToday

AI models require vastly more data than humans to learn the same tasks, suggesting current training approaches have hit a fundamental efficiency barrier.

Hacker News · 3h ago · 3 min read

3 Key Points

  1.

    What happened: The article argues that frontier AI models are trained on tens to hundreds of trillions of tokens, close to a million-fold more than the ~200 million tokens a human sees from birth to adulthood. Even after pretraining, AI models require hundreds of human expert examples per task (generated by specialists in fields such as law, management consulting, and document conversion), whereas humans, once educated, learn new skills from far fewer examples.

  2.

    Why it matters: Data, rather than hardware, model size, or algorithmic tricks, appears to be the primary driver of AI progress. The article notes that the data industry producing expert labels and reinforcement learning (RL) environments is earning billions annually and soon "deca-billions." This suggests that as AI labs pursue white-collar automation and the automation of AI research, they remain heavily dependent on scaling data collection rather than on solving the sample-efficiency problem that humans have already solved.

  3.

    What to watch: According to the Chinchilla scaling laws the article cites, scaling current model parameters by an order of magnitude would reduce data requirements by only a factor of ~10. This implies that humans are thousands to millions of times more sample-efficient than current AI models and may operate on a fundamentally different learning curve, a gap that raw parameter growth alone cannot close.
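
The arithmetic behind these key points can be checked with a short sketch. It uses only the figures quoted in the summary. The specific endpoints (10 trillion and 100 trillion tokens) are assumptions chosen as the low ends of "tens" and "hundreds" of trillions, and the function and variable names are illustrative rather than taken from the article.

```python
# Rough check of the data-efficiency gap, using figures quoted in the summary.
# ASSUMPTION: 10 trillion and 100 trillion are taken as the low ends of
# "tens of trillions" and "hundreds of trillions" of training tokens.
HUMAN_TOKENS = 200e6         # ~200 million tokens from birth to adulthood
MODEL_TOKENS_LOW = 10e12     # assumed: 10 trillion tokens
MODEL_TOKENS_HIGH = 100e12   # assumed: 100 trillion tokens
PARAM_SCALING_GAIN = 10      # ~10x less data from 10x more parameters (as cited)


def data_ratio(model_tokens: float, human_tokens: float = HUMAN_TOKENS) -> float:
    """Return how many times more tokens a model sees than a human."""
    return model_tokens / human_tokens


ratio_low = data_ratio(MODEL_TOKENS_LOW)
ratio_high = data_ratio(MODEL_TOKENS_HIGH)

# Gap that would remain after one order-of-magnitude increase in parameters.
remaining_low = ratio_low / PARAM_SCALING_GAIN
remaining_high = ratio_high / PARAM_SCALING_GAIN

print(f"Model/human token ratio: {ratio_low:,.0f}x to {ratio_high:,.0f}x")
print(f"After 10x parameters:    {remaining_low:,.0f}x to {remaining_high:,.0f}x")
```

Under these assumed endpoints the ratio runs from 50,000x to 500,000x, and a model trained on 200 trillion tokens would reach the full million-fold figure. Even after the ~10x reduction attributed to parameter scaling, a gap of thousands to tens of thousands of times remains, consistent with the summary's claim that parameter growth alone does not close it.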
