Data Governance Essentials in the EU AI Act

Understanding Data and Data Governance in the EU AI Act

The European Union’s Artificial Intelligence Act (EU AI Act) establishes a framework to regulate AI, particularly for “high-risk” systems—those that could impact health, safety, or fundamental rights. A crucial element of this framework is Article 10, which focuses on data and data governance. This article mandates strict standards for the datasets used in training, validating, and testing high-risk AI systems to prevent issues like bias, errors, or discrimination.

Understanding Article 10 is vital for AI providers and stakeholders interested in how AI regulation treats data and data governance. This article breaks down the data and data governance requirements as outlined in the Act. We’ll explore what data governance means, its key elements, and its significance for compliance.

What is Data Governance in the Context of AI?

Data governance refers to the set of practices, policies, and processes that ensure data is handled accurately and in line with ethical and legal standards. For high-risk AI systems, poor data practices can lead to amplified biases or unreliable outcomes, which is why the AI Act emphasizes governance to mitigate risks and ensure systems perform as intended.

Think of data governance as a conceptual framework:

  • It covers everything from how data is collected and prepared to how biases are detected and corrected.
  • The goal is to make AI systems not just functional, but also fair and compliant with regulations like the General Data Protection Regulation (GDPR) and others.
  • In Article 10, this governance applies specifically to training, validation, and testing datasets, ensuring they’re suitable for the AI’s purpose and free from flaws that could harm users.

The Five Pillars of Data Governance

Article 10 is structured around five pillars, drawn from paragraphs 2 through 6 of the Article, each building on the last to create a robust data management ecosystem. These pillars apply to datasets for high-risk AI systems, with some exceptions for non-training-based systems. Let’s explore each one.

1. Data Governance and Management Practices (Article 10(2))

Datasets must undergo appropriate governance and management practices tailored to the AI system’s intended purpose. It’s not a one-size-fits-all approach; practices should reflect the system’s design and real-world application. Key elements include:

  • Design Choices: Strategic decisions during development align the AI with its goals.
  • Data Collection Processes: Document the origins of data and how it was gathered to build trust.
  • Data Preparation Operations: Maintain high quality through tasks like annotation, cleaning, and updating.
  • Formulation of Assumptions: Clearly define what the data represents to avoid errors.
  • Assessment of Data Suitability: Evaluate if datasets are available and fit for purpose.
  • Bias Examination: Scrutinize data for biases that could affect fundamental rights.
  • Bias Mitigation: Implement measures to detect and correct biases.
  • Addressing Data Gaps and Shortcomings: Identify deficiencies that could hinder compliance.
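Article 10(2) is, in practice, a documentation obligation: each practice above should leave a traceable record. As a minimal sketch, the practices could be tracked in a simple structure like the following. All class, field, and dataset names here are hypothetical illustrations, not anything prescribed by the Act.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetGovernanceRecord:
    """Hypothetical record mirroring the Article 10(2) practices for one dataset."""
    dataset_name: str
    intended_purpose: str
    collection_origin: str                                 # where and how the data was gathered
    preparation_steps: list = field(default_factory=list)  # e.g. cleaning, annotation, updating
    assumptions: list = field(default_factory=list)        # what the data is assumed to represent
    suitability_assessed: bool = False
    biases_examined: list = field(default_factory=list)    # biases scrutinized
    mitigations: list = field(default_factory=list)        # measures taken to correct them
    known_gaps: list = field(default_factory=list)         # remaining shortcomings

    def open_items(self):
        """Return checklist items still needing attention before deployment."""
        items = []
        if not self.suitability_assessed:
            items.append("assess data suitability")
        if not self.biases_examined:
            items.append("examine for biases")
        if self.biases_examined and not self.mitigations:
            items.append("document bias mitigation measures")
        return items

record = DatasetGovernanceRecord(
    dataset_name="loan-applications-2024",
    intended_purpose="credit-risk scoring",
    collection_origin="internal CRM export, 2020-2024",
    preparation_steps=["deduplication", "annotation of outcomes"],
    assumptions=["applicants are representative of the deployment market"],
)
print(record.open_items())  # ['assess data suitability', 'examine for biases']
```

The point of the sketch is that each bullet in Article 10(2) maps to an auditable field, so gaps in compliance become visible as open checklist items rather than staying implicit.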

2. Dataset Characteristics (Article 10(3))

Once governance practices are in place, the datasets themselves must meet quality benchmarks. They need to be:

  • Relevant and Sufficiently Representative: Mirror real-world scenarios to avoid skewed results.
  • Free of Errors and Complete: Minimize inaccuracies and missing values to ensure reliability.
  • Statistically Appropriate: Ensure the data’s statistical properties align with the target population.
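Two of these criteria, completeness and representativeness, lend themselves to straightforward measurement. The sketch below (assumed field names and thresholds, not anything mandated by the Act) compares observed group shares against known target-population shares and reports missing values:

```python
def dataset_quality_report(records, required_fields, target_shares, group_field, tolerance=0.05):
    """Illustrative checks for the Article 10(3) criteria: completeness of
    records and representativeness against a known target population."""
    n = len(records)
    # Completeness: share of records with every required field present
    complete = sum(all(r.get(f) is not None for f in required_fields) for r in records)
    # Representativeness: observed group shares vs. target shares
    observed = {g: sum(r.get(group_field) == g for r in records) / n for g in target_shares}
    skewed = {g: (observed[g], target_shares[g])
              for g in target_shares
              if abs(observed[g] - target_shares[g]) > tolerance}
    return {"completeness": complete / n, "skewed_groups": skewed}

data = [
    {"age": 34, "income": 50_000, "region": "north"},
    {"age": 29, "income": None,   "region": "north"},
    {"age": 41, "income": 72_000, "region": "south"},
    {"age": 55, "income": 38_000, "region": "north"},
]
report = dataset_quality_report(
    data,
    required_fields=["age", "income", "region"],
    target_shares={"north": 0.5, "south": 0.5},
    group_field="region",
)
print(report)
```

Here the report would flag both regions as skewed (75% north vs. a 50% target) and a 75% completeness rate; in a real compliance workflow, such findings would feed back into the bias-mitigation and data-gap steps of Article 10(2).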

3. Contextual Considerations (Article 10(4))

Data doesn’t exist in a vacuum. This paragraph requires datasets to be customized to the AI’s specific geographical, behavioral, functional, or contextual settings. The benefits include:

  • Promotes Fairness and Non-Discrimination: Representative data reduces biases that could disadvantage certain groups.
  • Enhances Accuracy and Integrity: Tailored data improves completeness and reliability.
  • Aligns with Legal Standards: Complies with GDPR principles.
  • Reduces Risks: Matches data to operational contexts, avoiding mismatches that could lead to failures.
  • Compliance Workflow: Assess the AI’s purpose, curate relevant data, and document decisions for ongoing bias mitigation.

4. Processing Special Categories of Personal Data (Article 10(5))

Special categories of personal data—such as health records, biometric information, or data revealing racial origin—are highly sensitive. Providers may process them only exceptionally, solely for bias detection and correction, and only when strictly necessary. Strict conditions must be met, including:

  • No viable alternative data exists for the task.
  • Technical limitations on reuse with privacy-preserving measures.
  • Effective access controls and full documentation.
  • Data must not be transferred or accessed by third parties.
  • Delete the data once the bias is fixed or the retention period ends.
  • Processing records must explain why special data was essential.
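The retention, justification, and deletion conditions above can be enforced programmatically rather than by policy alone. The following sketch is one possible pattern, with all names and the example justification hypothetical; it is not a prescribed implementation.

```python
from datetime import date, timedelta

class SpecialCategoryRecord:
    """Hypothetical wrapper enforcing Article 10(5)-style safeguards in code:
    a documented justification, a retention limit, and deletion once the
    bias-correction purpose is fulfilled."""

    def __init__(self, data, justification, retention_days):
        if not justification:
            raise ValueError("special-category data requires a documented justification")
        self._data = data
        self.justification = justification
        self.expires_on = date.today() + timedelta(days=retention_days)
        self.purpose_fulfilled = False

    def access(self, today=None):
        """Allow access only while the purpose is active and retention has not expired."""
        today = today or date.today()
        if self.purpose_fulfilled or today > self.expires_on:
            raise PermissionError("data must be deleted: purpose fulfilled or retention expired")
        return self._data

    def mark_bias_corrected(self):
        """Once the bias is corrected, the underlying data is deleted."""
        self.purpose_fulfilled = True
        self._data = None

rec = SpecialCategoryRecord(
    data={"health_status": "..."},
    justification="no viable alternative data for detecting selection bias",
    retention_days=90,
)
_ = rec.access()           # allowed while the purpose is active
rec.mark_bias_corrected()  # purpose fulfilled: data is deleted, further access fails
```

Encoding the lifecycle this way means the deletion obligation cannot be silently skipped: any access after the purpose is fulfilled raises an error instead of returning stale sensitive data.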

These safeguards protect fundamental rights while allowing limited use for critical improvements.

5. Testing Datasets for Non-Training Systems (Article 10(6))

Not all high-risk AI systems rely on machine learning models that “train” on data. For those that don’t, the requirements of the preceding paragraphs apply only to testing datasets. This streamlines compliance without compromising quality in the evaluation phase.

Why Does This Matter? The Bigger Picture

Article 10 isn’t just regulatory fine print; it’s a blueprint for compliance. By enforcing rigorous data governance, the EU AI Act helps prevent AI from perpetuating inequalities or causing unintended harm. For providers, compliance means investing in robust processes—resulting in AI that is more innovative, trustworthy, and market-ready.

If you’re building AI, start auditing your data practices against these pillars. As AI integrates deeper into society, remember: Great AI starts with great data governance.
