The European Union Is Still Caught in an AI Copyright Bind
The European Union (EU) is navigating a complex landscape of copyright law and artificial intelligence (AI) regulation as it implements the Artificial Intelligence Act (Regulation (EU) 2024/1689). The recently published Code of Practice (CoP) aims to address these challenges while maintaining growth momentum in the AI sector.
Implementation of the AI Act
In July 2025, the European Commission issued guidelines, including the CoP, that cover essential areas such as transparency in AI model construction, safety and security measures, and compliance with copyright regulations concerning AI training data. While the first two aspects are relatively straightforward, the copyright component poses significant challenges.
The Challenges of Copyright Compliance
For frontier AI models, more and better training data generally means better performance. However, existing copyright obligations limit the data available and raise costs through licensing requirements. This not only weakens the EU’s competitiveness in the global AI market but also complicates compliance with the CoP’s copyright provisions.
Some provisions are reasonable, such as prohibiting the reproduction of copyright-protected content in model outputs and restricting training to lawfully accessible data. However, other requirements, such as transparency about dataset origins, create additional hurdles, especially as more copyright holders opt out of having their content used for training.
The Illusion of Full Transparency
The CoP assumes that improved transparency will facilitate licensing negotiations and give copyright holders a fair share of AI revenues. However, this assumption overlooks the high transaction costs of negotiating with numerous rightsholders, especially smaller web publishers. The model used for online advertising is not directly applicable to AI licensing, and collective licensing does not solve the problem of tracking which content is actually used.
Furthermore, the CoP specifies that developers need only provide a summary list of the relevant online domains from which training data was sourced. Such a list leaves out small publishers, and their exclusion may lead to biased training datasets that underrepresent smaller language and cultural communities.
Stretching the CoP Beyond Copyright Law
The CoP casts a wide net in defining AI training data, encompassing all data used for pre-training, fine-tuning, and reinforcement learning, regardless of intellectual property rights. This includes personal data, synthetic data, and data extracted from other AI models. As a result, AI developers may choose to keep their synthetic training data secret, since it has become a source of competitive advantage.
Moreover, the CoP requires AI developers to comply with data mining opt-outs without reducing the discoverability of opted-out content in search engines, which introduces further complications. The resulting split between machine learning and human learning diminishes the quality of AI-generated responses and increases the cognitive burden on human users.
The Global AI Landscape
The EU’s AI copyright regime faces pressure from countries with less restrictive copyright laws, such as Japan and the United States, which are fostering more robust AI innovation. The US AI Action Plan emphasizes removing regulatory barriers to achieve global dominance in the AI field. Should US courts grant exceptions for AI training data based on transformative use, the EU may need to realign its copyright laws or risk losing its foothold in AI development.
Policy Conclusions
The EU’s subtle weakening of copyright enforcement through the CoP has enabled most major AI developers to participate, although this approach may only be a temporary solution. A more effective policy would require a comprehensive debate on the role of AI in fostering learning, research, and innovation, and the conditions under which AI can access data.
In the current landscape, copyright law appears to hinder rather than promote innovation. A shift towards a more flexible copyright framework, inspired by certain elements of patent law, could facilitate better access to data for AI model training and post-training data retrieval. This would allow AI models to learn from all legally accessible content while still protecting the rights of original creators.