The News
Comedian Sarah Silverman is among a trio of writers who filed a lawsuit against OpenAI and Meta on Friday, claiming the companies’ artificial intelligence models were illegally trained on copyrighted material.
Silverman, alongside writers Chris Golden and Richard Kadrey, accuses the tech firms in joint lawsuits of using illegal online “shadow libraries” that torrent hundreds of thousands of books to teach their AI systems.
We’ve curated insightful analysis on the lawsuit and the larger legal questions surrounding AI and copyright.
Insights
- The lawyers handling the suit also filed a similar claim against OpenAI on behalf of two other authors last month, and have previously accused AI firms of illegally ripping off the work of programmers and artists in their training data. These suits, which present a headache for companies like OpenAI and Meta, are “challenging the very limits of copyright” and could take years to resolve in the courts. — The Verge
- In the recent cases, the authors take issue with the third-party training data that the models use. But these lawsuits provide no direct evidence that OpenAI and Meta are relying on the illegal web databases of the copyrighted books, the University of Sussex’s Andrés Guadamuz writes on his blog. The suit offers lengthy ChatGPT-generated summaries of the authors’ books as proof that the AI models were trained on the works. But Guadamuz, who also edits the Journal of World Intellectual Property, points out that it’s possible that information was found elsewhere online.
- AI copyright claims take us into “uncharted legal territory” that experts say was “more or less inevitable,” Rolling Stone reported. Another recent suit alleges that OpenAI trained its models on private and personal information from internet users without their knowledge, taking aim at the larger, “existential” threat it claims AI poses.
- Experts say OpenAI and Meta are likely to offer a defense based on fair use, which allows for the limited use of copyrighted material. The companies could argue they’re using the books for a different purpose than the authors are, by producing summaries rather than providing the original text, entertainment and copyright lawyer Marc Ostrow explained. “That is a very tricky question because with fair use, there are no hard-and-fast rules. Everything has to be done on a case-by-case basis,” he told Semafor.
- Is it possible the courts could block entire generative AI systems if the copyright claims are successful? Brian Frye, a law professor at the University of Kentucky, is skeptical that could happen, especially given that AI chatbots have many uses that clearly don’t violate copyright law.