Are AI Companies Stealing Creativity? Meta’s Copyright Battle Heats Up

Photo by Dima Solomin / Unsplash

In a previous blog, I discussed how the UK’s proposed AI copyright changes sparked backlash from artists like Dua Lipa, Elton John, and Paul McCartney. The government’s plan to introduce an opt-out system would allow AI developers to train on copyrighted content unless creators explicitly block it—a move many see as unfair.

Now, a new legal battle is unfolding in the U.S. A judge has ruled that authors Richard Kadrey, Sarah Silverman, and Ta-Nehisi Coates can move forward with their lawsuit against Meta, accusing the company of using their books to train AI models without permission.

While this lawsuit may seem like a step toward holding AI companies accountable, the truth is that it's almost impossible to stop big tech from using copyrighted data for training.


The Harsh Reality: AI Companies Train on Whatever They Want

1️⃣ It’s Too Easy for Companies to Use Copyrighted Data

The biggest problem is that there's no reliable way to detect, let alone stop, AI companies using copyrighted data before it happens. Datasets are scraped from across the internet, and once the data has been absorbed into a model's weights, it's nearly impossible to prove where it came from.

Even if Meta loses this lawsuit, what’s stopping other AI companies from doing the same thing? By the time legal action is taken, the damage is already done.

2️⃣ I Thought of a Solution—But It’s Not Feasible

I once thought about a system in which governments would require AI companies to submit their training data for verification before using it. It would work like this:

🔹 Creators upload their content to a government-run database.
🔹 AI companies must check their training data against this database before training.
🔹 If copyrighted content is found in a dataset, the AI company pays a licensing fee to the creator.

At first, this seemed like a fair way to ensure creators get paid for their work. But after thinking more about it, I realized it’s not practical at all.
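To make the idea concrete, the scheme above could be sketched as an exact-match audit: register a fingerprint (hash) of each work, then check every training document against the registry before use. This is only a minimal sketch under big assumptions — all names, fees, and data here are hypothetical, and real infringement detection would need fuzzy matching of paraphrases and excerpts, not exact hashes.

```python
import hashlib

def fingerprint(text: str) -> str:
    """Normalize whitespace and case, then hash, so identical works map to one ID."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical government registry: fingerprints of registered works,
# mapped to the creator and an illustrative per-use licensing fee.
registry = {
    fingerprint("Excerpt from a registered novel..."): {"creator": "A. Author", "fee": 0.05},
}

def audit_dataset(documents, registry):
    """Return every training document whose fingerprint appears in the registry."""
    matches = []
    for doc in documents:
        fp = fingerprint(doc)
        if fp in registry:
            matches.append((fp, registry[fp]))
    return matches

dataset = [
    "Excerpt from a registered novel...",
    "Some public-domain text.",
]
print(audit_dataset(dataset, registry))
```

Note the obvious weakness: change a single character and the hash no longer matches, so exact fingerprinting catches only verbatim copies — a hint at why the scheme breaks down in practice.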

3️⃣ Why This System Wouldn’t Work

🚫 The sheer amount of data is too massive. There are billions of copyrighted works online, and cross-checking every dataset would be technically impossible at scale.

🔐 Companies could obfuscate their data to evade detection. AI firms could encrypt or otherwise transform their training data so verification systems can't read it. If they don't want to be caught, they'll find a way to hide it.

⚠️ Big tech moves too fast for regulations to keep up. By the time a legal framework is in place, AI companies will have already trained their models on whatever data they wanted.
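The scale objection can be made tangible with a back-of-envelope calculation. Exact-hash lookups are cheap, but catching paraphrases and excerpts requires fuzzy matching, which is roughly pairwise. All the numbers below are illustrative assumptions, not measurements:

```python
# Back-of-envelope estimate of naive fuzzy cross-checking (illustrative numbers only).
registered_works = 10**9    # assume ~a billion registered copyrighted works
training_docs = 10**10      # assume ~ten billion documents in a web-scale corpus
pairwise_checks = registered_works * training_docs  # naive all-pairs fuzzy comparison

checks_per_sec = 10**6      # optimistic single-core fuzzy-match throughput
seconds = pairwise_checks / checks_per_sec
years = seconds / (60 * 60 * 24 * 365)
print(f"{years:,.0f} years of single-core compute")
```

Even granting generous throughput, the naive approach lands in the hundreds of thousands of compute-years — and while sub-quadratic techniques exist, the point stands that verification at this scale is a serious engineering problem, not a form to fill in.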


Photo by Maxim Landolfi / Unsplash

The Unavoidable Truth: AI Will Train on Everything

At this point, I believe it's inevitable that AI companies will continue using whatever data they can get their hands on.

So, what’s the solution? Maybe there isn’t one. Perhaps the future isn’t about preventing AI from using copyrighted data, but about figuring out how creators can still benefit from it—whether that’s through licensing deals, revenue-sharing models, or entirely new business structures.

Leave me a comment on X (formerly Twitter).


Sources

  • Ha, A. (2025, March 8). Judge allows authors' AI copyright lawsuit against Meta to move forward. TechCrunch.