EU AI Act Guidelines Draft Hones Copyright Specifications
The EU AI Office has released a new draft of the Code of Practice (CoP) on the obligations of general-purpose AI (GPAI) model providers under the EU AI Act, dated March 11. This latest iteration has been deemed more workable than its predecessors and must be finalized by May 2. That timeline leaves GPAI model providers only three months to comply before their obligations take effect on August 2.
The stakes for non-compliance are significant, with potential fines reaching up to 3% of the provider’s annual global turnover or 15 million euros (approximately $16.2 million), whichever is higher. In severe cases, non-compliance could even result in an EU ban on the model.
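As a rough illustration of the "whichever is higher" rule, the fine ceiling can be computed as below. This is a minimal sketch that only reproduces the arithmetic stated above; the function name and constants are hypothetical labels, and it says nothing about how a fine would actually be set in a given case.

```python
# Illustrative only: ceiling = max(3% of annual global turnover, EUR 15 million).
FIXED_CAP_EUR = 15_000_000   # fixed amount named in the text
TURNOVER_RATE_PERCENT = 3    # percentage of annual global turnover


def max_fine_eur(annual_global_turnover_eur: float) -> float:
    """Return the upper bound of the fine for a given turnover, in euros."""
    turnover_based = annual_global_turnover_eur * TURNOVER_RATE_PERCENT / 100
    return max(turnover_based, FIXED_CAP_EUR)


if __name__ == "__main__":
    for turnover in (100_000_000, 500_000_000, 1_000_000_000):
        print(f"turnover EUR {turnover:,}: ceiling EUR {max_fine_eur(turnover):,.0f}")
```

Under these figures the two limbs meet at a turnover of 500 million euros: below that the fixed 15 million euro amount is the higher ceiling, above it the 3% limb is.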
Over 1,000 stakeholders have collaborated with the EU AI Office on the CoP, which serves as a detailed guide for GPAI model providers to meet their obligations under the act. Although adherence to the CoP is voluntary, it allows providers to demonstrate compliance until harmonized standards are established. Providers may alternatively pursue other means of compliance, subject to assessment by the European Commission.
Practice and Compliance
The CoP introduces copyright-related obligations that are designed to be proportionate to the size and capacity of individual providers. Key specifications include:
- Providers must establish internal compliance responsibilities and document all copyright-relevant commitments in a single summary.
- Efforts should be made to mitigate the risk of downstream AI systems generating copyright-infringing outputs.
- Models should not memorize training content in ways that could lead to copyright infringement, and the model’s terms and conditions should prohibit infringing uses.
Notably, the earlier CoP draft stated that the provider’s obligations were limited to modification and fine-tuning actions, a point that has since been removed from the text but remains referenced in the AI Office’s FAQs.
Opt-Outs and Training
Providers are required to identify and comply with machine-readable rights reservations or opt-outs from rightsholders regarding content used for text and data mining (TDM). This obligation extends to AI training conducted both inside and outside the EU. Specific requirements include:
- Providers must not circumvent effective technological measures, such as paywalls, when crawling the web.
- Efforts must be made to exclude “piracy domains” from web crawling activities.
- For third-party datasets, providers must ascertain compliance with rights reservations, particularly those expressed through the Robots Exclusion Protocol (robots.txt).
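One way a crawler might exclude listed domains before fetching is a simple hostname filter over its URL queue. The sketch below is an assumption-laden illustration: the blocklist entries and function names are hypothetical, and the CoP text summarized here does not prescribe any particular list or matching rule.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; a real deployment would load a maintained list.
BLOCKED_DOMAINS = {"piracy.example", "warez.example"}


def is_blocked(url: str, blocked=BLOCKED_DOMAINS) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in blocked)


def filter_crawl_queue(urls, blocked=BLOCKED_DOMAINS):
    """Drop URLs on blocked domains, preserving the order of the rest."""
    return [u for u in urls if not is_blocked(u, blocked)]


if __name__ == "__main__":
    queue = [
        "https://news.example/article",
        "https://cdn.piracy.example/file.pdf",
        "https://notpiracy.example/",
    ]
    print(filter_crawl_queue(queue))
```

Matching on the full host or a dot-prefixed suffix avoids accidentally blocking unrelated domains whose names merely end with the same characters.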
Regarding the recognition of machine-readable opt-outs, the CoP emphasizes using web crawlers that adhere to robots.txt instructions, aligning with the consensus of legal scholars and technical experts. Additionally, providers should make “best efforts” to comply with other machine-readable protocols expressing opt-outs.
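To make the robots.txt requirement concrete, the sketch below checks whether a crawler may fetch a URL using Python's standard `urllib.robotparser` module. The robots.txt content, the user-agent name `ExampleTrainingBot`, and the helper `may_crawl` are all hypothetical; a real crawler would fetch each site's live robots.txt and would also need to handle the other machine-readable opt-out protocols mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one AI-training crawler is opted out entirely,
# while all other crawlers are only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    print(may_crawl(ROBOTS_TXT, "ExampleTrainingBot", url))  # opted-out crawler
    print(may_crawl(ROBOTS_TXT, "OtherBot", url))            # falls under "*"
```

Parsing the file from a string keeps the example free of network access; in practice `RobotFileParser.set_url` and `read` can load a site's robots.txt directly.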
Takeaways
While the CoP does not eliminate all legal and technical uncertainties for GPAI model compliance, it establishes a structured compliance pathway. The clarification regarding robots.txt as the standard for TDM opt-outs significantly reduces ambiguity. However, a key challenge remains in defining what model providers must do to prevent copyright-infringing outputs from downstream systems.
The revisions made during the drafting process reflect stakeholder input, indicating that practical considerations are being taken into account. Stakeholders are invited to submit final feedback by March 30, a crucial opportunity for input as the final discussions approach.