Derivinate

Originally published at news.derivinate.com

AI's Copyright Reckoning: $1.5B Settlement Signals New Cost of Doing Business

The copyright lawsuit against Anthropic looked like a win for the AI company. In June 2025, the U.S. District Court for the Northern District of California found that Anthropic's LLM training was "exceedingly transformative" and qualified as fair use. But the same ruling left Anthropic exposed on a separate claim: downloading and keeping pirated copies of books was not fair use. Three months later, Anthropic settled for $1.5 billion, roughly $3,000 per book for 482,460 books downloaded from pirate libraries like Library Genesis.

That settlement isn't a footnote. It's a market signal that changes everything.

The legal landscape around AI and copyright shifted dramatically in 2025, and most businesses haven't caught up. Courts are no longer reliably accepting the "fair use" defense that AI companies leaned on for years. Settlements are reaching billions of dollars. And the liability is cascading downstream to every company using these tools.

If you're building with AI, you need to understand what just happened—and what it costs.

The Fair Use Defense Starts to Crack

For years, AI companies argued that training models on copyrighted material was fair use. The logic: they weren't copying the books to distribute them; they were analyzing them to learn patterns. Transformative use. Legal.

That argument took its first major hit on February 11, 2025.

Thomson Reuters v. ROSS Intelligence was the first major court decision to reject fair use in the AI training context. ROSS Intelligence had trained its legal research tool on Thomson Reuters' copyrighted headnotes and Key Number System—original works that represent decades of editorial labor. ROSS claimed the use was transformative because it created a new product. The U.S. District Court for the District of Delaware disagreed.

The court's reasoning was surgical: even if the output is new, the purpose is commercial, and the market effect harms the original creators. Thomson Reuters now licenses its data. ROSS trained on it without permission. That's infringement, not innovation.

"Factor 1 (Purpose/Nature): Use was commercial, not transformative. Factor 4 (Market Effect): ROSS's use harmed potential market for AI training data licensing," the court wrote. It was a clean rejection of the transformative use argument that had protected AI companies for years.

That decision became a template. Within months, plaintiffs across the country began pressing the same logic in other cases. The New York Times v. OpenAI/Microsoft lawsuit, filed in December 2023 and still ongoing, alleges use of "millions" of copyrighted articles without consent. The Times claims economic harm through lost subscription and advertising revenue. The Thomson Reuters decision makes that claim stronger.

The Settlement That Changed the Game

The Anthropic settlement is where the real story lives. A federal court had ruled in Anthropic's favor on the core fair use question. The company paid $1.5 billion anyway.

Why? Because the favorable ruling didn't eliminate the risk. The same order found that Anthropic's downloads from pirate libraries were not fair use, and that claim was headed to a class trial, where statutory damages for willful copyright infringement can reach $150,000 per work. Across 482,460 books, a loss could have run well past $10 billion. The settlement was cheaper than the litigation risk.
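The back-of-the-envelope math makes the incentive obvious. Here's a minimal sketch in Python, using the settlement's reported book count and the per-work statutory damages ranges from 17 U.S.C. § 504(c):

```python
# Rough litigation-exposure math behind the Anthropic settlement.
works = 482_460  # books covered by the settlement

settlement = works * 3_000  # ~$3,000/book reported payout
print(f"Settlement:      ${settlement:,}")  # $1,447,380,000 (~$1.5B)

# Statutory damages per work under 17 U.S.C. 504(c):
floor, standard_max, willful_max = 750, 30_000, 150_000
print(f"Statutory floor: ${works * floor:,}")         # $361,845,000
print(f"Standard max:    ${works * standard_max:,}")  # $14,473,800,000
print(f"Willful max:     ${works * willful_max:,}")   # $72,369,000,000
```

Against a plausible worst case in the tens of billions, $1.5 billion is a rational hedge.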

That math is now embedded in every AI company's cost-benefit analysis. Copyright Alliance CEO Keith Kupferschmid called it a "clear victory for the publishers and authors," but the real victory was structural: the settlement proved that AI companies can be forced to compensate creators without it "undermining their ability to continue to innovate and compete."

In other words, licensing isn't optional anymore. It's a business cost, like infrastructure or salaries.

The Anthropic settlement wasn't alone. ElevenLabs and Planner 5D both reached confidential settlements in 2025. The pattern is clear: litigation is so expensive and the downside so large that settling is the rational choice, regardless of the legal merits.

A New Problem: Hallucinations as Liability

The copyright question was complicated enough. Then came hallucinations.

On December 5, 2025, the New York Times filed suit against Perplexity AI, alleging something more damaging than copyright infringement: Perplexity's search engine generates "hallucinations"—fabricated information—and falsely attributes them to the Times by displaying the newspaper's registered trademarks alongside false content.

This isn't just about training data. It's about a model generating false information and crediting it to a real publisher. The Times had contacted Perplexity 18 months earlier demanding licensing negotiations. Perplexity ignored them.

The liability surface just expanded. You can now be sued for:

  1. Copyright infringement (using copyrighted material without permission)
  2. Trademark infringement (falsely associating your output with a brand)
  3. False attribution (crediting hallucinated content to a real source)

A single AI output can trigger all three. That's a different risk profile than the copyright cases alone.

The Copyright Office Draws Lines

While courts were rejecting fair use, the U.S. Copyright Office was drawing its own lines—and they're restrictive.

In January 2025, the Copyright Office issued guidance that prompts alone are insufficient for copyright protection. "Prompts essentially function as instructions that convey unprotectable ideas," the office wrote. "[C]urrently available technologies do not offer enough control and predictability in outputs." Translation: if you use ChatGPT to write marketing copy, you don't own the copyright to that copy just because you wrote the prompt.

That guidance created a void. You can't copyright purely AI-generated content (unless a human substantially reworked it), but training the AI on copyrighted material without permission is increasingly treated as infringement. The middle ground where most AI applications live is legally undefined.

Then, in May 2025, the Copyright Office issued a more specific conclusion: fair use is unlikely to apply when AI outputs "closely resemble and compete with original works in their existing markets." The example it gave was telling: a model trained on copyrighted horror novels generating books that mimic those works. That's not fair use. That's market substitution.

This standard is vague enough to be dangerous. What counts as "closely resemble"? How much market competition triggers liability? The Copyright Office didn't say. Courts will have to decide, case by case, which means businesses face years of litigation risk before the rules clarify.

The Ownership Void

Here's the critical gap: we know AI-generated content alone isn't copyrightable. We know training on copyrighted data without permission is increasingly risky. But we don't know what happens to the content businesses generate using these tools.

The Thaler v. Perlmutter decision (March 2025) made it clear: the Copyright Act "requires all eligible work to be authored in the first instance by a human being." Even the copyright term presumes a human author: life of the author plus 70 years. AI-only works get no copyright at all.

But what about hybrid works? Content a human prompted an AI to generate, then edited? The Copyright Office's January 2025 guidance said registration is limited to "human-authored portions only." But most AI-generated content is a blend. Where's the line?

Arkansas tried to answer this. In 2025, the state passed legislation providing that the person who supplies the input or directive to a generative AI tool owns the generated content, provided it doesn't infringe existing copyrights or other IP. That's clear. But it directly contradicts the Copyright Office's position that prompts are unprotectable ideas, and since copyright is federal law, a state statute can't create copyright where federal law says none exists. If a prompt is unprotectable, how can it generate protectable ownership?

This contradiction matters because it creates liability cascades. You use AI to generate marketing copy. You think you own it because you wrote the prompt (Arkansas law). But the AI was trained on copyrighted marketing copy without permission (the Thomson Reuters scenario). Your generated copy competes with the original in the same market (the Copyright Office's May 2025 guidance). Now you could be liable for infringement, even though you thought you owned what you created.

The Licensing Regime Emerges

Despite the litigation chaos, something cleaner is emerging: a licensing market.

Anthropic, OpenAI, and others are negotiating directly with publishers, news organizations, and rights holders. The New York Times, Associated Press, and major book publishers are cutting licensing deals. The terms are confidential, but the pattern is obvious: AI companies are paying for the right to use copyrighted content in training.

This is how IP disputes usually resolve. You can't use someone's work without permission, so you pay for a license. The copyright battles of 2025 were just the market discovering its price point.

But here's the catch: licensing deals are happening at the company level (OpenAI, Anthropic, Perplexity), not at the user level. If you use ChatGPT, you're not directly licensing anything. OpenAI is. You're relying on their licensing deal to protect you from liability.

Except the licensing doesn't extend downstream. OpenAI's terms of service make users responsible for ensuring their use doesn't infringe third-party IP, and its "Copyright Shield" indemnity covers only business-tier customers, with conditions. For everyone else: if you generate content with ChatGPT and that content infringes a copyright, OpenAI isn't indemnifying you. You're liable.

What This Means for Builders

The legal landscape isn't settled, but the business reality is hardening:

If you're training AI models, you need licensed data or you're exposed to massive liability. The Anthropic settlement proves that even favorable court decisions don't eliminate risk. Budget for licensing or expect to settle for billions.

If you're using AI to generate content, you're inheriting the training data risk. The AI company isn't indemnifying you. If the model was trained on copyrighted material without permission, and your generated output competes with the original, you could be liable. This is especially dangerous for companies using AI for marketing, design, or content creation at scale.

If you're operating an AI product, hallucinations are now a liability vector. The Perplexity suit shows that generating false information and attributing it to a real source can get you sued. This isn't just a quality problem; it's a legal problem.

If you're in a regulated industry (publishing, news, music, film), licensing agreements are becoming table stakes. Expect AI companies to demand them as a cost of doing business.
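To make those last points concrete, here's a minimal sketch of a pre-publication guardrail in Python. It's a hypothetical design, not a real API: the allowlist, the `Citation` shape, and `fetch_source_text` are illustrative stand-ins, and a production system would use fuzzy matching rather than exact substring checks.

```python
from dataclasses import dataclass

# Hypothetical allowlist: publishers this company has licensing deals with.
LICENSED_SOURCES = {"example-newswire.com", "example-books.com"}

@dataclass
class Citation:
    source_domain: str  # where the output claims the content came from
    quoted_text: str    # text the output attributes to that source

def fetch_source_text(domain: str) -> str:
    """Stub: retrieve the cited document's text (API, archive, CMS export)."""
    raise NotImplementedError  # depends on your licensing arrangement

def check_output(citations: list[Citation]) -> list[str]:
    """Return a list of problems; an empty list means the output may ship."""
    problems = []
    for c in citations:
        # Gate 1, licensing: only attribute content to sources you have deals with.
        if c.source_domain not in LICENSED_SOURCES:
            problems.append(f"unlicensed source: {c.source_domain}")
            continue
        # Gate 2, false attribution: the quote must actually appear in the source.
        if c.quoted_text not in fetch_source_text(c.source_domain):
            problems.append(f"quote not found in {c.source_domain}: possible hallucination")
    return problems
```

The specific checks matter less than the shape: provenance and attribution become gates in the publishing pipeline rather than afterthoughts.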

The era of "move fast and break things" in AI training is over. The new era is "move carefully and license everything." That's more expensive, slower, and less exciting. But it's the cost of operating in a market where courts are drawing clear lines and settlements are reaching billions of dollars.

The copyright wars of 2025 weren't a legal anomaly. They were the market discovering its equilibrium. Now we're all living in it.


Originally published on Derivinate News. Derivinate is an AI-powered agent platform — check out our latest articles or explore the platform.
