Research paper argues that self-training in large language models leads to model collapse without external human-generated data
Hacker News · April 30, 2026
AI Summary
• A paper by Hector Zenil demonstrates mathematically that LLMs (large language models, AI systems that generate human-like text) and diffusion models undergo degenerative dynamics when external input is reduced, showing that these statistical models cannot improve themselves without continuous external anchoring.
• The article explains that training a model on its own outputs causes it to converge on a statistical singularity rather than improve, which is why these models must be continually retrained on external, human-generated data to avoid collapse; a toy simulation of this feedback loop follows the list below.
• The piece challenges the premise that LLMs possess genuine learning or intelligence, arguing instead that humans project intelligent behavior onto statistical models and that LLMs function as 'counterfeit humans' rather than truly intelligent systems.
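The collapse mechanism described in the second bullet can be made concrete with a toy simulation. The sketch below is an illustration under simplifying assumptions, not the paper's formal argument: each "generation" fits a Gaussian to samples drawn from the previous generation's fit, standing in for a model trained purely on its own outputs. The small sample size N is an assumption chosen only to make the effect visible quickly.

```python
import random
import statistics

# Toy model-collapse loop (an illustrative assumption, not Zenil's proof):
# fit a Gaussian to data, then refit each generation to samples drawn from
# the previous fit. With no fresh external data, finite-sample estimation
# error compounds: the fitted variance decays toward zero and the
# distribution degenerates into a near-point mass at an arbitrary
# location, i.e. a statistical singularity.

random.seed(42)
N = 10  # small per-generation sample size makes the drift visible fast

# Generation 0 trains on external ("human") data: a standard normal.
data = [random.gauss(0.0, 1.0) for _ in range(N)]

for gen in range(301):
    mu = statistics.fmean(data)     # refit the mean
    sigma = statistics.stdev(data)  # refit the standard deviation
    if gen % 50 == 0:
        print(f"gen {gen:3d}: mu={mu:+.4f}  sigma={sigma:.2e}")
    # The next generation sees only the current model's own outputs.
    data = [random.gauss(mu, sigma) for _ in range(N)]
```

In this toy setting, sigma shrinks by a roughly constant factor per generation while mu freezes at a random offset from the true mean; mixing even a modest fraction of fresh external samples into each generation's data keeps the variance bounded away from zero, which mirrors the summary's point about continuous external anchoring.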