TLDR: The U.S. report on copyright and generative AI
The most recent report from the U.S. Copyright Office, titled “Copyright and Artificial Intelligence, Part 3: Generative AI Training”, was released in a pre-publication version on May 9, 2025. It analyzes the use of copyrighted works in the training of generative artificial intelligence models.
The report highlights that the large-scale use of protected content to train AI models for commercial purposes may exceed the limits of “fair use”, especially if the content is accessed without a license. It also suggests that generating content that directly competes with the original works may not be covered by this doctrine. The report recommends developing licensing mechanisms and increasing transparency about which protected works are used to train AI models.
This report is the third part of a series on copyright and generative AI: Part 1 (July 2024) deals with digital replicas, and Part 2 (January 2025) deals with copyright protection for works generated by AI.
Generative AI Copyright: Before vs. Now
| Aspect | Before | Now |
| --- | --- | --- |
| Use of protected works | It was assumed that the use of copyrighted material was protected by fair use. | The report questions this idea, especially for commercial purposes. |
| Transformativeness | Training AI was assumed to always be transformative. | It depends on the purpose and whether the use is truly different from the original. |
| Market impact | Little focus on the effect on the original works. | The report evaluates whether the AI replaces or negatively affects the original works. |
| Licenses | Unlicensed practices were common. | License agreements are encouraged. |
| Transparency | There were no clear requirements. | Disclosure of the data sources used is encouraged. |
Key takeaways from the report
- Review of “fair use”: The report questions whether using protected content to train models for commercial purposes can qualify as fair use.
- Transformativeness not guaranteed: The use must be clearly different from the original and serve a different purpose.
- Evaluation of market harm: If the AI competes with or replaces the original content, the use may not qualify as fair use.
- Promotion of licenses: Companies should consider formal agreements with rights holders.
- Data transparency: The report encourages the publication of the data sources used for training.
For developers and tech teams
- Review your datasets: Check that the works they contain are licensed or otherwise cleared for training use.
- Evaluate legal risks: Analyze whether your model generates outputs that compete with original works.
- Promote traceability: Document the training sources.
- Explore licenses: Consider agreements with content owners.
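The traceability and dataset-review points above can be sketched as a small provenance manifest. This is only an illustrative sketch: the `SourceRecord` fields, the `needs_review` rule, and the example sources are all assumptions made here, not something the report prescribes, and a flagged source is a prompt for human or legal review, not a legal conclusion.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class SourceRecord:
    """One training data source (hypothetical schema, for illustration only)."""
    name: str
    origin: str                # URL or internal path the data was obtained from
    license: Optional[str]     # None means no license is on record
    commercial_use_ok: bool    # whether the recorded terms allow commercial use


def needs_review(record: SourceRecord) -> bool:
    """Flag sources with no recorded license or no commercial-use permission."""
    return record.license is None or not record.commercial_use_ok


def build_manifest(records: List[SourceRecord]) -> dict:
    """Collect all sources plus the names of those that need a closer look."""
    return {
        "sources": [asdict(r) for r in records],
        "needs_review": [r.name for r in records if needs_review(r)],
    }


# Example data (made up): one unlicensed source, one cleared source.
records = [
    SourceRecord("news-archive", "https://example.com/news", None, False),
    SourceRecord("public-domain-books", "https://example.org/books", "Public Domain", True),
]

manifest = build_manifest(records)
print(json.dumps(manifest, indent=2))
```

Keeping a manifest like this next to each training run gives a team a single place to answer which sources were used and under what terms, which is the kind of documentation the recommendations above point toward.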
How to act on these copyright and generative AI developments
This report marks a turning point for the development of generative AI models. If you are part of a technical team or lead strategic decisions about data and training, it is key to start reviewing internal processes and policies. And if you need specialized guidance, at My Tech Plan we can connect you with an expert who can help your team find the safest and most legally sound way forward.