AI and the Ethics of Learning: Meta, Books, and the Future of Publishing

Earlier this year, news broke that Meta had used tens of thousands of books to train its AI models without permission from the authors or publishers. This has reignited major debates about consent, intellectual property, and what it means to "learn" in the age of machine intelligence.

At the heart of the controversy is a dataset known as "Books3" — a massive repository of texts scraped from the internet, many of them copyrighted. Meta admitted that this and other similar datasets were used to help train its large language models, including LLaMA. The authors whose work was included had no knowledge of its use, let alone any say in the matter.

High-profile lawsuits have now been filed in both the US and UK, with writers arguing that their work has been exploited to fuel a technology that could eventually undercut their own livelihoods. While tech companies argue that using this material qualifies as "fair use" or "fair dealing," many authors see it as a fundamental violation of their rights.

This story matters because it touches every part of the publishing industry. If AI models are being trained on human creativity without consent, where does that leave the people who make a living from writing? And what role should publishers play in protecting their authors?

Some UK publishers have begun issuing statements and tightening up contracts to explicitly prohibit the use of their content for AI training. But for many, this is too little, too late. The data has already been scraped. The models have already learned.

Whether this becomes the defining copyright battle of the decade remains to be seen. But one thing is clear: the publishing industry can no longer afford to ignore the implications of AI. Writers and publishers alike must work together to ensure the future of storytelling is fair, ethical, and transparent.
