Twin California Rulings Mark a Turning Point for AI‐Copyright Fair Use
By: Moish E. Peltz, Esq. and Steven T. Cooper, Esq.
In the space of forty-eight hours, two judges of the Northern District of California issued detailed, partially contrasting opinions on two questions: whether training a large language model (“LLM”) on entire books copied without permission is protected by fair use, and whether digitizing and retaining a library of such books is protected as well.
- Bartz v. Anthropic (Alsup, J., 23 June 2025) found the ingestion of seven million books “spectacularly transformative” and therefore fair use, while leaving Anthropic to face a separate damages trial over its wholesale downloads from pirate sites. Full decision: here.
- Kadrey v. Meta (Chhabria, J., 25 June 2025) likewise granted Meta summary judgment on fair use for training Llama with thirteen authors’ works, but only after castigating the plaintiffs for failing to produce evidence of market harm. The court explicitly rejected a “transformative use shortcut” and declared the fourth fair use factor—the effect on markets—“the single most important element.” Full decision: here.
Taken together, the opinions supply the first judicial roadmap for evaluating AI training under 17 U.S.C. § 107, confirm that fair use analysis survives intact in the generative AI era, and signal that future plaintiffs must marshal credible, data-driven evidence of market substitution in order to survive summary judgment.
What Anthropic Won—and Still Risks
Judge Alsup’s opinion is unabashedly enthusiastic about machine learning: internalizing expressive works to extract statistical patterns is no different, he writes, from the way “schoolchildren learn to write well,” and therefore “does not implicate the competitive or creative displacement that concerns the Copyright Act.” On that basis, fair use factors one and three (the purpose and character of the use, and the amount and substantiality of the portion used in relation to the copyrighted work) weighed heavily for the defendant. Factor four (the effect of the use on the potential market for or value of the copyrighted work) also favored Anthropic, because no infringing outputs were alleged and the court concluded that the Copyright Act does not protect authors against potential competition from new AI-generated works. Crucially, however, Judge Alsup severed the question of how Anthropic obtained its corpus. Downloading from shadow libraries may yet expose the company to statutory damages, and the opinion invites a jury to decide whether the piracy was willful. The clear message: lawful acquisition is not strictly required for fair use, but unlawful acquisition remains a costly litigation vulnerability.
How Meta Prevailed—and Narrowed the Precedent
Judge Chhabria agreed that LLM training is “highly transformative,” but refused to let that finding “blow off” the economic analysis. He held that the authors’ two theories of market harm—verbatim regurgitation and lost licensing fees—were “clear losers” because the record showed Llama could not reproduce more than fifty consecutive words and because the Copyright Act does not guarantee a market to license works for the very purpose of transformation. On that deficient record, factor four tipped to Meta, and the court granted summary judgment.
The opinion, however, is narrower than Anthropic’s. It applies only to the thirteen named authors; it preserves a live claim that Meta illegally distributed works while torrenting its corpus; and, expressly countering the view of Judge Alsup, it stresses that better evidence of indirect market dilution—e.g., AI-generated romance novels crowding out human ones—could swing the analysis the other way.
Convergences and Fault Lines
- Transformative purpose is necessary but not sufficient. Both judges embrace the human learning analogy, yet Judge Chhabria warns that transformativeness alone cannot “inoculate” AI developers. Expect future defendants to supplement the transformative use argument with empirical studies of regurgitation rates and market impact.
- Pirate sourcing is a reputational anchor, not an automatic loss. Judge Alsup views mass piracy as a potential bad faith factor; Judge Chhabria treats it as relevant but ultimately immaterial without proof of market harm. Either way, robust provenance records remain the cheapest insurance policy.
- Factor four will decide the next wave. Kadrey demands economic modelling of substitution effects. Plaintiffs will therefore likely pivot to showing how models duplicate materials verbatim and could diminish the market (for example, the market for news, as the New York Times case is now arguing), and to commissioning expert surveys on genre displacement. Defendants, for their part, will highlight the public’s continued appetite for marquee authors and the availability of public domain training data.
- Focus on model inputs, not model outputs. Neither set of plaintiffs alleged that the LLMs generated infringing works and output them to the public. The extent to which AI-generated works may infringe copyrighted works, and the liability attaching to the LLM in such cases, remains to be refined.
Action Points for AI Businesses and Enterprises Using Them
- Document the data chain—down to the URL. Maintain click‑through or purchase records for every ingest source and track subsequent filtration steps. A clean corpus will not cure infringement claims automatically, but it sharply reduces litigation leverage.
- Measure and throttle memorization. Deploy adversarial prompt audits and n-gram filtering to demonstrate that your model cannot emit protected passages longer than de minimis snippets.
- Model market impact before plaintiffs do. Commission independent economists to quantify how your product might cannibalize (or expand) sales of comparable human works—a proactive defense against factor four challenges.
- Separate reproduction from distribution. If torrenting, peer-to-peer retrieval, or third-party hosting is unavoidable, sandbox those workflows and monitor them. Both Anthropic and Meta must still defend claims, built on allegedly improper downloading or seeding of training data, that have now survived summary judgment.
- Anticipate a patchwork until the Ninth Circuit or Supreme Court weighs in. These opinions are only overtly controlling in the Northern District of California. Conflicting signals from the SDNY (New York Times), the MDL before Judge Stein, and pending visual-arts cases could generate a circuit split within the year.
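As an illustration of the n-gram filtering mentioned in the memorization action point above, the following is a minimal Python sketch that measures the longest run of consecutive words a model output shares with a protected text. The function names and the `max_run_words` default are hypothetical choices made for this example. The default of fifty words merely echoes the figure discussed in the Kadrey record; it is an assumption for illustration, not a legal safe harbor, and a production audit would need to scan many outputs against a full corpus.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def longest_shared_run(model_output, protected_text):
    """Return the length, in words, of the longest consecutive word
    sequence that appears in both texts (longest common substring
    computed over word tokens with dynamic programming)."""
    a = tokenize(model_output)
    b = tokenize(protected_text)
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous word pair.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def flag_output(model_output, protected_text, max_run_words=50):
    """Flag an output whose shared run exceeds max_run_words.
    The threshold is an illustrative assumption, not a legal standard."""
    return longest_shared_run(model_output, protected_text) > max_run_words
```

Flagged outputs could then be logged, blocked, or routed to review, and the audit logs retained as part of the evidentiary record. The quadratic comparison shown here is suitable only for spot checks; an audit at corpus scale would more plausibly rely on an index of hashed n-grams.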
Bottom Line
Both courts agree that teaching machines to write is, at its core, the type of knowledge‐building progress copyright law is meant to foster. They diverge on how much economic evidence is needed to ensure that progress does not undermine human creativity. For companies that can prove lawful acquisition, minimize memorization, and substantiate real-world benefits, these “twin pillars” of California fair use jurisprudence offer a solid (but not unshakeable) foundation on which to keep training towards superintelligence.
Our attorneys are dedicated to helping individuals stay informed about their intellectual property rights and navigate the complexities of an era increasingly influenced by artificial intelligence. If you have any questions related to intellectual property, please contact our team at (516) 599-0888.
DISCLAIMER: This summary is not legal advice and does not create any attorney-client relationship. This summary does not provide a definitive legal opinion for any factual situation. Before the firm can provide legal advice or opinion to any person or entity, the specific facts at issue must be reviewed by the firm. Before an attorney-client relationship is formed, the firm must have a signed engagement letter with a client setting forth the Firm’s scope and terms of representation. The information contained herein is based upon the law at the time of publication.