Data Governance Essentials in the EU AI Act

Understanding Data and Data Governance in the EU AI Act

The European Union’s Artificial Intelligence Act (EU AI Act) establishes a framework to regulate AI, particularly “high-risk” systems, meaning those that could impact health, safety, or fundamental rights. A crucial element of this framework is Article 10, which focuses on data and data governance. It sets strict standards for the datasets used to train, validate, and test high-risk AI systems, with the aim of preventing bias, errors, and discrimination.

Understanding Article 10 is vital for AI providers and for anyone following how AI regulation treats data. This post outlines the data and data governance requirements set out in the Act. We’ll explore what data governance means, its key elements, and its significance for compliance.

What is Data Governance in the Context of AI?

Data governance refers to the set of practices, policies, and processes that ensure data is handled accurately and in line with ethical and legal standards. For high-risk AI systems, poor data practices can amplify biases or produce unreliable outcomes, which is why the AI Act relies on governance to mitigate risks and ensure systems perform as intended.

Think of data governance as a conceptual framework:

  • It covers everything from how data is collected and prepared to how biases are detected and corrected.
  • The goal is to make AI systems not just functional, but also fair and compliant with related regulations such as the General Data Protection Regulation (GDPR).
  • In Article 10, this governance applies specifically to training, validation, and testing datasets, ensuring they’re suitable for the AI’s purpose and free from flaws that could harm users.

The Five Pillars of Data Governance

The substantive requirements of Article 10 can be grouped into five pillars, each building on the last to create a robust data management framework. These pillars apply to the datasets of high-risk AI systems, with a narrower scope for systems that are not developed by training models on data. Let’s explore each one.

1. Data Governance and Management Practices (Article 10(2))

Datasets must undergo appropriate governance and management practices tailored to the AI system’s intended purpose. It’s not a one-size-fits-all approach; practices should reflect the system’s design and real-world application. Key elements include:

  • Design Choices: Strategic decisions during development align the AI with its goals.
  • Data Collection Processes: Document the origins of data and how it was gathered to build trust.
  • Data Preparation Operations: Maintain high quality through tasks like annotation, cleaning, and updating.
  • Formulation of Assumptions: Clearly define what the data represents to avoid errors.
  • Assessment of Data Suitability: Evaluate if datasets are available and fit for purpose.
  • Bias Examination: Scrutinize data for biases that could affect fundamental rights.
  • Bias Mitigation: Implement measures to detect and correct biases.
  • Addressing Data Gaps and Shortcomings: Identify deficiencies that could hinder compliance.
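
One way to keep track of these eight elements is a per-dataset record that flags anything still undocumented. The following is a minimal Python sketch, not a format prescribed by the Act: the class name GovernanceRecord, its field names, the missing_items helper, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class GovernanceRecord:
    """Hypothetical per-dataset record: one free-text note per element listed above."""
    design_choices: str = ""
    collection_process: str = ""
    preparation_operations: str = ""
    assumptions: str = ""
    suitability_assessment: str = ""
    bias_examination: str = ""
    bias_mitigation: str = ""
    gaps_and_shortcomings: str = ""

    def missing_items(self) -> list[str]:
        """Return the names of elements that have no documentation yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative values only; a real record would hold far more detail.
record = GovernanceRecord(
    design_choices="Tabular model for loan pre-screening",
    collection_process="Exported from internal CRM, 2019-2023",
)
print(record.missing_items())
```

Running the sketch prints the six elements that still need documentation, which could feed a review checklist before a dataset is released for training.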

2. Dataset Characteristics (Article 10(3))

Once governance practices are in place, the datasets themselves must meet quality benchmarks. They need to be:

  • Relevant and Sufficiently Representative: Mirror real-world scenarios to avoid skewed results.
  • Free of Errors and Complete, to the Best Extent Possible: Minimize inaccuracies and missing values to ensure reliability.
  • Statistically Appropriate: Ensure the data’s statistical properties align with the target population.
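
Two of these characteristics lend themselves to simple automated checks: completeness (how many values are missing) and representativeness (how group shares compare with the target population). The sketch below is one possible approach in plain Python, under stated assumptions: the function names, the 0.05 tolerance, the column names, and the reference shares are all made up for illustration, and the Act does not prescribe any particular metric or threshold.

```python
from collections import Counter


def missing_rate(rows, column):
    """Share of rows where `column` is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def representation_gaps(rows, column, reference_shares, tolerance=0.05):
    """Groups whose dataset share differs from the reference share by more than `tolerance`.

    Returns {group: observed_share - expected_share} for the flagged groups.
    """
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if abs(observed - expected) > tolerance:
            gaps[group] = round(observed - expected, 3)
    return gaps


# Toy dataset and a made-up target population, for illustration only.
rows = [
    {"age_band": "18-39", "income": 31000},
    {"age_band": "18-39", "income": None},
    {"age_band": "18-39", "income": 45000},
    {"age_band": "40-64", "income": 52000},
]
reference = {"18-39": 0.5, "40-64": 0.5}

print(missing_rate(rows, "income"))
print(representation_gaps(rows, "age_band", reference))
```

A positive gap means a group is over-represented relative to the reference, and a negative gap means it is under-represented. Checks like these only surface candidates for review; deciding whether a dataset is sufficiently representative still depends on the system’s intended purpose.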

3. Contextual Considerations (Article 10(4))

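Data doesn’t exist in a vacuum. This paragraph requires datasets to take into account, to the extent the intended purpose requires, the characteristics of the specific geographical, contextual, behavioral, or functional setting in which the AI system will be used. The benefits include: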

  • Promotes Fairness and Non-Discrimination: Representative data reduces biases that could disadvantage certain groups.
  • Enhances Accuracy and Integrity: Tailored data improves completeness and reliability.
  • Aligns with Legal Standards: Complies with GDPR principles.
  • Reduces Risks: Matches data to operational contexts, avoiding mismatches that could lead to failures.

In practice, the compliance workflow is to assess the AI’s purpose, curate data relevant to its deployment context, and document decisions to support ongoing bias mitigation.

4. Processing Special Categories of Personal Data (Article 10(5))

Special categories of personal data, such as health records, biometric data, or data revealing racial or ethnic origin, are highly sensitive. Providers may process them only exceptionally, and only to the extent strictly necessary for detecting and correcting bias in high-risk AI systems. Strict conditions must be met, including:

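  • Bias detection and correction cannot be achieved effectively with other data, including synthetic or anonymized data.
  • The data is subject to technical limitations on reuse and to state-of-the-art security and privacy-preserving measures, such as pseudonymization.
  • Access is restricted to authorized persons, with strict controls and documentation of that access.
  • The data is not transmitted, transferred, or otherwise accessed by other parties.
  • The data is deleted once the bias has been corrected or the retention period ends, whichever comes first.
  • Processing records explain why special-category data was strictly necessary and why other data could not achieve the same goal.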

These safeguards protect fundamental rights while allowing limited use for critical improvements.
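
Because every condition must hold at once, the conditions can be expressed as a gate that reports which ones a given processing request fails. The following Python sketch is illustrative only and is no substitute for legal review: the SpecialDataRequest class, its field names, the purpose string, and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SpecialDataRequest:
    """Hypothetical record of a request to process special-category data."""
    purpose: str
    no_alternative_data: bool
    reuse_limited_with_privacy_measures: bool
    access_controlled_and_documented: bool
    shared_with_other_parties: bool
    necessity_recorded: bool
    retention_end: date


def unmet_conditions(req):
    """Return a message for each condition the request does not satisfy."""
    checks = [
        (req.purpose == "bias_detection_and_correction",
         "purpose is not bias detection and correction"),
        (req.no_alternative_data, "alternative data has not been ruled out"),
        (req.reuse_limited_with_privacy_measures,
         "reuse limits or privacy-preserving measures are missing"),
        (req.access_controlled_and_documented,
         "access controls or access documentation are missing"),
        (not req.shared_with_other_parties, "data is shared with other parties"),
        (req.necessity_recorded, "necessity is not recorded"),
    ]
    return [message for ok, message in checks if not ok]


def deletion_due(req, bias_corrected, today):
    """Deletion is due once bias is corrected or retention has ended, whichever is first."""
    return bias_corrected or today >= req.retention_end


# Illustrative request that fails exactly one condition.
request = SpecialDataRequest(
    purpose="bias_detection_and_correction",
    no_alternative_data=True,
    reuse_limited_with_privacy_measures=True,
    access_controlled_and_documented=True,
    shared_with_other_parties=True,
    necessity_recorded=True,
    retention_end=date(2026, 6, 30),
)
print(unmet_conditions(request))
print(deletion_due(request, bias_corrected=False, today=date(2026, 1, 15)))
```

Returning the list of failed conditions, rather than a single yes or no, gives reviewers a concrete record of what must change before processing can start.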

5. Testing Datasets for Non-Training Systems (Article 10(6))

Not all high-risk AI systems rely on machine learning models that “train” on data. For those that don’t, the requirements described above apply only to testing datasets. This narrows the compliance scope while still holding the data used for evaluation to the same standard.

Why Does This Matter? The Bigger Picture

Article 10 isn’t just regulatory fine print; it’s a blueprint for compliance. By enforcing rigorous data governance, the EU AI Act helps prevent AI from perpetuating inequalities or causing unintended harm. For providers, compliance means investing in robust processes—resulting in AI that is more innovative, trustworthy, and market-ready.

If you’re building AI, start auditing your data practices against these pillars. As AI integrates deeper into society, remember: Great AI starts with great data governance.
