Revised Guidelines for Copyright Compliance in AI Models

GPAI Code of Practice – Third Draft: Changes to the Requirements for Copyright Compliance

Created Date: March 17, 2025 1:47 PM

Background

Under the European AI Act (Regulation (EU) 2024/1689, “AI Act”), providers of General-Purpose AI (“GPAI”) models, such as models of the GPT family, Llama, or Gemini, must comply with certain requirements, including documentation and policy implementation to adhere to EU copyright law.

To facilitate compliance with these requirements, the AI Act provides for the creation of Codes of Practice for GPAI models. In response to an invitation from the AI Office, various experts and stakeholders established four working groups to draft the initial Code of Practice. Should the EU Commission approve this Code of Practice, it will acquire “general validity” within the EU. By adopting the approved GPAI Code of Practice, companies can demonstrate proactive compliance, potentially avoiding regulatory scrutiny and penalties.

The AI Office has published the working groups’ third draft of the Code of Practice (“3rd Draft”), covering the following topics:

  • Commitments
  • Transparency
  • Copyright
  • Safety and Security

The final version of the Code of Practice is scheduled for May 2, 2025.

Who is this Relevant For?

The Code of Practice is primarily relevant for providers of GPAI models. GPAI models exhibit significant generality and can competently perform a wide range of distinct tasks. This includes providers of well-known large language models such as GPT (OpenAI), Llama (Meta), Gemini (Google), or Mistral (Mistral AI). Smaller model providers may also be affected if their models can be used for a sufficiently broad range of tasks. Additionally, businesses that fine-tune models for their own purposes may themselves become GPAI model providers.

Furthermore, “downstream providers,” i.e., businesses that implement GPAI models into their AI systems, should familiarize themselves with the Code of Practice. This Code may evolve into a quasi-standard for GPAI models, influencing what AI system developers can expect from a GPAI model, which is crucial when negotiating contracts with GPAI model providers.

Key Concepts of the Code of Practice on Copyright Law

Providers of GPAI models are mandated to establish a policy to comply with EU copyright law (Art. 53 (1) (c) AI Act). Given the absence of prior similar requirements, practical guidance on what such a policy should entail is currently lacking. The Code of Practice aims to bridge this gap.

The Code of Practice requires providers to implement the following measures:

Copyright Policy

Providers signing the Code of Practice (“Signatories”) must draft, maintain, and enforce a copyright policy that ensures compliance with EU copyright law. This requirement is directly stipulated under the AI Act. Signatories are also responsible for ensuring adherence to the copyright policy within their organization.

An important change under the 3rd Draft, compared to the 2nd Draft, is that Signatories are no longer obligated to publish their copyright policy but are merely encouraged to do so. This adjustment aligns with the AI Act, which does not mandate model providers to publish their copyright policy.

Web Crawling of Copyrighted Content

Signatories are generally permitted to utilize web crawlers for text and data mining (“TDM”) to gather training data for their GPAI models. However, they must ensure that crawlers respect technologies that restrict access to copyrighted materials, such as paywalls.
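For illustration, one simple way a crawler can honor such access restrictions is to refuse to ingest any page whose HTTP response signals restricted access. The following is a minimal sketch, not a statement of what the Code of Practice requires; the function name and the exact status-code policy are illustrative assumptions.

```python
# Sketch: skip responses whose HTTP status code signals restricted access.
# The status-code policy is illustrative, not mandated by the Code of Practice.

RESTRICTED_STATUSES = {
    401,  # Unauthorized: credentials required
    402,  # Payment Required: commonly associated with paywalls
    403,  # Forbidden: access explicitly denied
}

def may_ingest(status_code: int) -> bool:
    """Return True only if the response does not signal restricted access."""
    return status_code not in RESTRICTED_STATUSES
```

A real crawler would combine such a check with honoring authentication walls that are enforced client-side, which cannot be detected from status codes alone.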

Moreover, Signatories are required to exclude “piracy domains,” which are internet sources that profit from providing copyright-infringing materials.
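A domain exclusion of this kind can be enforced with a small filter applied before any URL is fetched. The sketch below assumes a provider-maintained blocklist; the domains shown are placeholders, and in practice a curated list (for example, one based on published piracy watch lists) would be loaded instead.

```python
from urllib.parse import urlsplit

# Illustrative blocklist; a real deployment would load a curated list of
# piracy domains rather than hard-code examples.
PIRACY_DOMAINS = {"pirated-books.example", "warez.example"}

def is_excluded(url: str) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in PIRACY_DOMAINS)

def filter_urls(urls):
    """Drop URLs pointing at excluded domains before crawling."""
    return [u for u in urls if not is_excluded(u)]
```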

Identifying and Complying with TDM Opt-Outs for Web-Crawled Content

Signatories must ensure that web crawlers identify and adhere to a TDM opt-out declared by rightsholders. While TDM is generally permitted under EU copyright law, rightsholders have the option to declare an opt-out. For web content, this opt-out must be machine-readable. The 3rd Draft explicitly outlines that web crawlers should comply with the widely used robots.txt protocol, in addition to other relevant machine-readable TDM opt-outs established as industry standards.
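In practice, honoring a robots.txt opt-out comes down to checking each URL against the site's robots.txt rules before fetching it. Python's standard-library `urllib.robotparser` implements the protocol; the crawler names (`ExampleAIBot`, `GenericBot`) and the sample rules below are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; rightsholders can use rules like the
# first entry to declare a TDM opt-out against a specific AI crawler.
robots_txt = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# The AI crawler is opted out entirely; a generic crawler is only barred
# from the /private/ section.
print(parser.can_fetch("ExampleAIBot", "https://example.org/article"))  # False
print(parser.can_fetch("GenericBot", "https://example.org/article"))    # True
print(parser.can_fetch("GenericBot", "https://example.org/private/x"))  # False
```

Note that robots.txt is only one of the machine-readable opt-out mechanisms the 3rd Draft refers to; other industry-standard signals would need separate handling.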

Signatories are expected to take reasonable measures to inform rightsholders about the web crawlers in use and how these crawlers interact with robots.txt files. Information can be disseminated via web feeds. Notably, the 3rd Draft no longer includes an obligation to publish this information.

Identifying and Complying with a TDM Opt-Out for Non-Web-Crawled Content

GPAI model providers may also source datasets from third parties instead of crawling the web themselves. While the 2nd Draft required copyright due diligence for third-party datasets, the 3rd Draft only requires reasonable efforts to ascertain whether the web crawlers used to gather the data complied with robots.txt.

Mitigating Risk to Prevent the Production of Copyright-Infringing Output

One risk associated with AI usage is the generation of output that infringes copyrights, for example by reproducing protected code or images. Signatories must make reasonable efforts to mitigate this risk. This is a welcome adjustment: the 2nd Draft mandated specific measures to avoid “overfitting,” whereas the current draft is more technology-neutral and simply calls for reasonable efforts.

Additionally, Signatories must include a clause in their terms and conditions (or similar documents) to prohibit copyright-infringing uses of their GPAI model by providers of downstream AI systems.

Designating a Point of Contact

Signatories are required to provide a point of contact for rightsholders and establish a mechanism to allow them to submit complaints regarding copyright infringements. Under the 3rd Draft, Signatories may decline to process complaints deemed unfounded or excessive.

Conclusion and Recommendations for Businesses

The 3rd Draft introduces sensible changes compared to the 2nd Draft that ease compliance, making the Code of Practice a more practical instrument for demonstrating compliance with the AI Act.

However, it is crucial to recognize that the Code of Practice remains a draft and may undergo substantial changes. The approval of the final Code of Practice by the EU Commission is likely but not guaranteed.

The working groups will accept feedback from stakeholders until March 30, 2025, and aim to present a final version in May 2025.
