Data Governance Essentials in the EU AI Act

Understanding Data and Data Governance in the EU AI Act

The European Union’s Artificial Intelligence Act (EU AI Act) establishes a framework to regulate AI, with the strictest obligations falling on “high-risk” systems: those that could affect health, safety, or fundamental rights. A crucial element of this framework is Article 10, which focuses on data and data governance. It sets strict standards for the datasets used to train, validate, and test high-risk AI systems, with the aim of preventing bias, errors, and discrimination.

Understanding Article 10 is vital for AI providers and for anyone following how the Act regulates data. This post walks through the data and data governance requirements set out in the Act: what data governance means, its key elements, and why it matters for compliance.

What is Data Governance in the Context of AI?

Data governance refers to the set of practices, policies, and processes that ensure data is handled accurately and in line with ethical and legal standards. For high-risk AI systems, poor data practices can amplify biases or produce unreliable outcomes, which is why the AI Act relies on governance to mitigate risks and ensure systems perform as intended.

Think of data governance as a conceptual framework:

  • It covers everything from how data is collected and prepared to how biases are detected and corrected.
  • The goal is to make AI systems not just functional, but also fair and compliant with regulations like the General Data Protection Regulation (GDPR) and others.
  • In Article 10, this governance applies specifically to training, validation, and testing datasets, ensuring they’re suitable for the AI’s purpose and free from flaws that could harm users.

The Five Pillars of Data Governance

Article 10 is structured around five main paragraphs, each building on the last to create a robust data management ecosystem. These pillars apply to datasets for high-risk AI systems, with some exceptions for non-training-based systems. Let’s explore each one.

1. Data Governance and Management Practices (Article 10(2))

Datasets must undergo appropriate governance and management practices tailored to the AI system’s intended purpose. It’s not a one-size-fits-all approach; practices should reflect the system’s design and real-world application. Key elements include:

  • Design Choices: Strategic decisions during development align the AI with its goals.
  • Data Collection Processes: Document the origins of data and how it was gathered to build trust.
  • Data Preparation Operations: Maintain high quality through tasks like annotation, cleaning, and updating.
  • Formulation of Assumptions: Clearly define what the data represents to avoid errors.
  • Assessment of Data Suitability: Evaluate if datasets are available and fit for purpose.
  • Bias Examination: Scrutinize data for biases that could affect fundamental rights.
  • Bias Mitigation: Implement measures to detect and correct biases.
  • Addressing Data Gaps and Shortcomings: Identify deficiencies that could hinder compliance.
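One way to keep track of these eight elements is a simple documentation record per dataset. The following is a minimal sketch, not a format prescribed by the Act: the `GovernanceRecord` class, its field names, and the sample entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class GovernanceRecord:
    """Hypothetical per-dataset record; one free-text field per element listed above."""
    design_choices: str = ""
    collection_process: str = ""
    preparation_operations: str = ""
    assumptions: str = ""
    suitability_assessment: str = ""
    bias_examination: str = ""
    bias_mitigation: str = ""
    data_gaps: str = ""


def missing_items(record: GovernanceRecord) -> List[str]:
    """Return the names of elements that are still empty or whitespace-only."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Illustrative entries only; a real record would carry the provider's own documentation.
record = GovernanceRecord(
    design_choices="Tabular model; features limited to financial history.",
    collection_process="Collected from the provider's own application forms.",
    bias_examination="Approval rates compared across age bands.",
)

# Lists the elements that still need documentation before an internal review.
print(missing_items(record))
```

A check like this only shows that something has been written down for each element; whether the documented practice is adequate for the system’s intended purpose remains a matter for human review.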

2. Dataset Characteristics (Article 10(3))

Once governance practices are in place, the datasets themselves must meet quality benchmarks. They need to be:

  • Relevant and Sufficiently Representative: Mirror real-world scenarios to avoid skewed results.
  • Free of Errors and Complete: To the best extent possible, minimize inaccuracies and missing values to ensure reliability.
  • Statistically Appropriate: Ensure the data’s statistical properties align with the target population.
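These characteristics can be partly checked with simple statistics. The sketch below is one possible approach under stated assumptions: the function names, the sample groups, the reference shares, and the 0.15 tolerance are all hypothetical, and the Act itself does not prescribe any particular metric or threshold.

```python
from collections import Counter
from typing import Dict, List, Optional


def missing_rate(values: List[Optional[str]]) -> float:
    """Share of entries that are None (treated here as 'missing')."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def representation_gaps(groups: List[str], reference: Dict[str, float]) -> Dict[str, float]:
    """Dataset share minus reference-population share, per reference group."""
    if not groups:
        raise ValueError("groups must not be empty")
    counts = Counter(groups)
    total = len(groups)
    return {g: counts.get(g, 0) / total - share for g, share in reference.items()}


# Hypothetical dataset: age band of each record.
groups = ["18-34"] * 5 + ["35-64"] * 4 + ["65+"] * 1
# Hypothetical shares of the population the system is intended to serve.
reference = {"18-34": 0.3, "35-64": 0.5, "65+": 0.2}

gaps = representation_gaps(groups, reference)

# Assumed tolerance: flag groups whose share deviates by more than 15 points.
TOLERANCE = 0.15
flagged = [g for g, gap in gaps.items() if abs(gap) > TOLERANCE]
print(flagged)

print(missing_rate(["a", None, "b", None]))
```

Positive gaps indicate over-representation and negative gaps under-representation relative to the reference. Choosing the reference population and the tolerance is itself a governance decision that should be documented.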

3. Contextual Considerations (Article 10(4))

Data doesn’t exist in a vacuum. This paragraph requires datasets to reflect the specific geographical, contextual, behavioral, or functional setting in which the AI system is intended to be used. The benefits include:

  • Promotes Fairness and Non-Discrimination: Representative data reduces biases that could disadvantage certain groups.
  • Enhances Accuracy and Integrity: Tailored data improves completeness and reliability.
  • Aligns with Legal Standards: Complies with GDPR principles.
  • Reduces Risks: Matches data to operational contexts, avoiding mismatches that could lead to failures.

In practice, the compliance workflow is to assess the AI’s purpose, curate relevant data, and document decisions for ongoing bias mitigation.

4. Processing Special Categories of Personal Data (Article 10(5))

Special categories of personal data, such as health data, biometric data, or data revealing racial or ethnic origin, are highly sensitive. Providers may process them only exceptionally, and only to the extent strictly necessary for bias detection and correction. Strict conditions must be met, including:

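  • Bias detection and correction cannot be effectively achieved with other data, such as synthetic or anonymized data.
  • The data is subject to technical limitations on reuse and to state-of-the-art security and privacy-preserving measures, such as pseudonymization.
  • Access is strictly controlled and documented, so that only authorized persons can reach the data.
  • The data is not transmitted, transferred, or otherwise accessed by other parties.
  • The data is deleted once the bias has been corrected or the retention period ends, whichever comes first.
  • Records of processing explain why the special category data was strictly necessary and why other data could not achieve the same objective.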

These safeguards protect fundamental rights while allowing limited use for critical improvements.
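Two of these conditions, privacy-preserving measures and timely deletion, lend themselves to a short illustration. The sketch below assumes pseudonymization via a keyed hash as one example of a privacy-preserving measure, and uses a hypothetical `ProcessingRecord` class with made-up values; it is not a complete compliance mechanism, and key management and access control are out of scope.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import date


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).

    The same identifier and key always give the same token, so records can
    still be linked for bias analysis without exposing the raw identifier.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


@dataclass
class ProcessingRecord:
    """Hypothetical record of why and until when special category data is held."""
    purpose: str
    justification: str  # why other data (e.g. synthetic or anonymized) was not enough
    retention_until: date
    bias_corrected: bool = False

    def must_delete(self, today: date) -> bool:
        """Deletion is due once the bias is corrected or retention has ended."""
        return self.bias_corrected or today >= self.retention_until


# Illustrative values only; the key would come from a managed secret store.
token = pseudonymise("patient-123", b"demo-key-not-for-production")

record = ProcessingRecord(
    purpose="bias detection and correction",
    justification="Synthetic data did not reproduce the observed disparity.",
    retention_until=date(2026, 6, 30),
)

print(record.must_delete(date(2026, 1, 1)))   # still within retention, bias not yet corrected
record.bias_corrected = True
print(record.must_delete(date(2026, 1, 1)))   # bias corrected, so deletion is now due
```

Pseudonymized data is still personal data, so the remaining conditions continue to apply to it.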

5. Testing Datasets for Non-Training Systems (Article 10(6))

Not all high-risk AI systems rely on machine learning models that “train” on data. For those that don’t, the full governance requirements apply only to testing datasets. This streamlines compliance without compromising quality for evaluation phases.
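The scoping rule can be written down as a small helper. This is only a sketch of the rule as described above; the function name and the string labels are hypothetical.

```python
from typing import List


def datasets_in_scope(uses_model_training: bool) -> List[str]:
    """Return which dataset types the governance requirements apply to."""
    if uses_model_training:
        # Systems developed by training models: all three dataset types are covered.
        return ["training", "validation", "testing"]
    # Systems developed without model training: only testing datasets are covered.
    return ["testing"]


print(datasets_in_scope(True))
print(datasets_in_scope(False))
```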

Why Does This Matter? The Bigger Picture

Article 10 isn’t just regulatory fine print; it’s a blueprint for building AI on sound data. By enforcing rigorous data governance, the EU AI Act helps prevent AI from perpetuating inequalities or causing unintended harm. For providers, compliance means investing in robust processes, which in turn supports AI that is more trustworthy and market-ready.

If you’re building AI, start auditing your data practices against these pillars. As AI integrates deeper into society, remember: Great AI starts with great data governance.
