AI content floods the internet, making human-generated text a precious resource

The internet is filling up with AI-generated content as millions of people use programs like ChatGPT to create marketing campaigns, blog posts, and other material.

A team of researchers from the United Kingdom and Canada set out to determine how all that synthetic text might affect the performance of large language models trained in the future. They found that models trained on AI-generated content quickly lost their capabilities and became vulnerable to a phenomenon the researchers dubbed “model collapse.”

“Just as we’ve strewn the oceans with plastic trash and filled the atmosphere with carbon dioxide, so we’re about to fill the Internet with blah,” wrote Cambridge University professor Ross Anderson, a co-author of the study. “This will make it harder to train newer models by scraping the web, giving an advantage to firms which already did that, or which control access to human interfaces at scale.”