eazyware
Opinion·April 10, 2023·12 min read

AI copyright questions in 2026

Training data, output ownership, DMCA, fair use. The copyright questions affecting AI companies and how they're being resolved.

Kushal R.
Engineering lead

AI copyright questions remain actively contested in 2026. Training data legality, output copyrightability, and infringing outputs are all areas where law is being made case by case. This post covers where things stand: the major cases, the emerging doctrines, and the practical implications for businesses deploying AI.

Three axes
Figure: AI copyright, active questions. Three panels (training data, output copyrightability, infringing output) above a summary of where things stand in 2026.
Training data: fair use, NYT v. OpenAI, authors v. Meta/Anthropic. Output copyrightability: human authorship required. Infringing output: memorization, style, liability.

Training data questions

Fair use arguments. AI labs argue that training is a transformative use and therefore fair use under US law, analogous to search engines indexing the web.

Plaintiff arguments. Training on copyrighted material without a license is infringement. Plaintiffs point to market substitution and the right to control derivative works.

Major cases. NYT v. OpenAI/Microsoft (ongoing); Authors Guild v. OpenAI; Getty v. Stability; Concord Music et al. v. Anthropic; UMG et al. v. Suno/Udio.

Status. Mixed outcomes in preliminary rulings. Some favorable to AI companies; some favorable to plaintiffs. No controlling doctrine yet.

Licensing deals emerging. OpenAI has signed deals with major publishers (AP, Financial Times, News Corp), and Google has done similar. This suggests a licensing market is emerging regardless of how the legal questions resolve.

Output copyrightability

US Copyright Office guidance (2023, refined 2024). Human authorship required for copyright. AI-generated content alone not copyrightable.

AI-assisted works. The picture is more nuanced. Protection requires substantial human creative contribution, such as significant selection, arrangement, or direct authorship.

Registration practice. Applicants must disclose AI involvement; subject to review.

Practical effect. Purely AI-generated images, text, and music can be used in commerce, but they cannot be registered for copyright.

International variation. Japan, EU have varying approaches. No uniform global standard.

Infringing output

Memorization. Models can reproduce training data verbatim. NYT case includes examples of GPT-4 producing near-verbatim articles.
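
As a rough illustration of how near-verbatim reproduction might be flagged, the sketch below measures what share of an output's word n-grams also appear in a reference text. The function names, the n-gram length, and the sample strings are all assumptions made for illustration. It is a screening heuristic under those assumptions, not how any court or provider measures memorization.

```python
def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Return the set of lowercase word n-grams in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(output: str, reference: str, n: int = 8) -> float:
    """Share of the output's n-grams that also occur in the reference."""
    out_grams = word_ngrams(output, n)
    if not out_grams:
        return 0.0  # output shorter than n words: nothing to compare
    return len(out_grams & word_ngrams(reference, n)) / len(out_grams)


# Hypothetical sample texts, not taken from any real case.
reference = ("the committee voted late on tuesday to approve the revised "
             "budget after hours of debate")
copied = "The committee voted late on Tuesday to approve the revised budget"
fresh = "lawmakers signed off on new spending plans following a long session"

print(overlap_ratio(copied, reference, n=5))  # every 5-gram is shared
print(overlap_ratio(fresh, reference, n=5))   # no 5-gram is shared
```

A high ratio on long n-grams suggests verbatim carry-over worth sending to human review. Where to set the threshold is a policy choice rather than something this sketch can decide.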

Style replication. Models produce content 'in the style of' named artists. This is more contested, since style alone is generally not copyrightable.

User vs model liability. Who's liable when AI produces infringing output? User who prompts? Company that built model?

Emerging doctrine. Safe harbors for AI providers, conditional on reasonable safeguards, by analogy to the DMCA safe harbors for hosting providers.

Where things stand in 2026

Training cases slowly producing rulings. No clear doctrine yet; a circuit split is possible.

Licensing deals accelerating. Many AI companies are paying for training data, effectively creating a licensing market.

Settlements shaping practice. Settlements are doing more than court rulings to set norms: opt-out mechanisms, data provenance, and licensing fees are becoming standard.

Opt-outs common. robots.txt extensions (ai.txt); HTTP headers; content creator opt-out mechanisms.
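
To make the robots.txt opt-out concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the example URL, and the helper name `crawler_allowed` are assumptions for illustration. `GPTBot` is used as an example crawler token; check each vendor's documentation for the user agents it actually honors. The sketch covers only robots.txt, not ai.txt or HTTP headers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network call
    return parser.can_fetch(user_agent, url)


url = "https://example.com/articles/1"
print(crawler_allowed(ROBOTS_TXT, "GPTBot", url))        # opted-out crawler
print(crawler_allowed(ROBOTS_TXT, "SomeOtherBot", url))  # falls under "*"
```

Note that robots.txt is a request, not an enforcement mechanism: it only has effect when the crawler chooses to honor it.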

Watermarking emerging. C2PA standard for content provenance; mandated in some jurisdictions.

Business implications

AI providers. Indemnification clauses increasingly offered for enterprise customers. Reduces user risk.

Enterprise customers. Due diligence on AI providers' training data sources. Some sectors (legal, medical) require clean-data models.

Content creators. Opt-out mechanisms becoming norm. Licensing markets emerging.

End users. Using AI output commercially requires awareness of the risks. Risk is low for most use cases and higher for the specific content types covered below.

Specific content types

Code. The GitHub Copilot cases have mostly gone in favor of AI providers. Coding standards and patterns have limited copyright protection.

Images. Litigation involves Stability and Midjourney, including the Getty suit against Stability. Style replication is less protected than literal reproduction.

Music. Suno, Udio cases pending. RIAA active. Particular scrutiny due to industry's strong enforcement history.

Text. NYT, Authors Guild, others. High-stakes due to many individual copyright holders.

Video. Less litigation so far; emerging as video AI matures.

International variations

Japan. Permissive to AI training on copyrighted material (with limits).

EU. AI Act requires copyright compliance; opt-out mechanism required for training.

UK. Consultation ongoing; currently no explicit AI training exception.

China. Unique regulatory framework; state-favored approach.

Practical guidance

For developers deploying AI. Use providers with robust legal position and indemnification. Document use; monitor for infringing output.

For content creators. Register copyright on human-created works promptly. Use opt-out mechanisms where available. Watch for licensing opportunities.

For enterprises. Adopt a policy on AI use, set disclosure requirements, and require human review of AI output for commercial uses.

For individuals. Generally low-risk for personal use; higher for commercial. Disclosure becoming expected norm.
