Key GDPR Considerations for Using Patient Data in AI-Driven Research

The integration of artificial intelligence (AI) into scientific research has opened new avenues for innovation, particularly in identifying potential new medicines for clinical trials. However, using patient data to train these AI systems raises significant legal and ethical questions, especially in light of the General Data Protection Regulation (GDPR).

Understanding the Applicability of GDPR

Where an organization develops its AI system internally without using patient data, the GDPR does not apply, because no personal data is processed (and the AI Act's exemption for systems developed solely for scientific research may also be available). The landscape shifts dramatically once patient data is involved: the GDPR then becomes applicable, requiring organizations to navigate a complex web of obligations.

Patient data can be sourced from various avenues, including:

  • Healthcare records collected by institutions, such as the NHS
  • Voluntary registries where patients consent to share their data for research
  • Data obtained from prior clinical trials

It is crucial to note that these datasets were originally collected for specific purposes, necessitating a thorough assessment to determine their compatibility for repurposing in AI training.

Assessing Compatibility for Secondary Use of Patient Data

The GDPR enshrines a purpose limitation principle, mandating that personal data must be collected for a specific, explicit, and legitimate purpose. If the data is to be reused for a different objective, a compatibility assessment is required. Several factors influence this assessment:

  • Link between the initial and secondary purposes: The closer the connection, the more likely it is to be deemed compatible.
  • The context and the reasonable expectations of data subjects: If data subjects were informed at the time of collection about potential secondary uses, reuse is more likely to be justified.
  • Nature of the data: The more sensitive the personal data (e.g., health data), the narrower the scope for compatibility.
  • Consequences for data subjects: Both positive and negative consequences must be evaluated.
  • Existence of appropriate safeguards: Measures like encryption, pseudonymization, transparency, and opt-out options should be considered.

The purpose limitation principle aims to maintain individuals’ control over their data and prevent unauthorized repurposing. Scientific research is generally considered a compatible secondary use, provided that appropriate safeguards are in place.
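Among the safeguards listed above, pseudonymization is the most concrete from an engineering standpoint. As a minimal illustrative sketch (the identifier format and keys are hypothetical, and the GDPR does not mandate any particular technique), a keyed hash keeps records linkable for research while separating them from direct identifiers, provided the key is stored apart from the research dataset:

```python
import hmac
import hashlib

def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).

    The mapping is repeatable (same ID -> same token), so records can
    still be linked across a dataset for research, but re-identification
    requires the secret key, which must be held separately from the data.
    """
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Illustrative identifiers and keys (not real values):
key_a = b"project-a-secret"
key_b = b"project-b-secret"

token_1 = pseudonymize("NHS-1234567", key_a)
token_2 = pseudonymize("NHS-1234567", key_a)
token_3 = pseudonymize("NHS-1234567", key_b)

assert token_1 == token_2   # stable within one project
assert token_1 != token_3   # unlinkable across projects with distinct keys
```

Using a distinct key per research project, as sketched here, also limits linkability across datasets, which supports the data minimization principle.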

Scientific Research and GDPR Compliance

While the GDPR does not explicitly define ‘scientific research’, Recital 159 suggests a broad interpretation, encompassing technological development, fundamental and applied research, and both privately and publicly funded studies. The European Data Protection Board (EDPB) advises that scientific research must adhere to established ethical and methodological standards. Both the discovery phase and clinical research typically follow strict methods or protocols and thus should qualify as scientific research.

Despite plans for the EDPB to issue guidance in 2021 on defining scientific research and appropriate safeguards, such clarity has not yet materialized. Therefore, organizations should not assume automatic compatibility but instead conduct a thorough compatibility assessment.

Compatibility Assessment Outcomes

If the secondary use is found to be incompatible with the initial collection, the data cannot be reused for the secondary purpose unless:

  • Such processing is based on the explicit consent of the data subject.
  • A Union or Member State law safeguards important objectives of general public interest (e.g., public health).

If the secondary use is compatible with the initial collection, organizations may rely on the legal basis used for the original data collection. Nonetheless, all other data protection principles must still be respected, including informing data subjects about the further processing and their rights, and conducting a data protection impact assessment (DPIA) where required. The compatibility assessment and the adopted measures must be documented to satisfy the accountability principle.
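The GDPR prescribes no particular format for documenting a compatibility assessment. As a purely illustrative sketch (the field names and example values are assumptions, not requirements of the Regulation), the factors listed earlier could be captured as a structured, serializable record for the accountability file:

```python
from dataclasses import dataclass, field, asdict
from datetime import date

@dataclass
class CompatibilityAssessment:
    """Illustrative record of a compatibility assessment under the GDPR's
    purpose limitation principle. Field names mirror the assessment
    factors; they are not prescribed by the Regulation."""
    initial_purpose: str
    secondary_purpose: str
    link_between_purposes: str
    data_subject_expectations: str
    nature_of_data: str
    consequences_for_subjects: str
    safeguards: list = field(default_factory=list)
    outcome: str = "pending"  # "compatible" or "incompatible"
    assessed_on: date = field(default_factory=date.today)

# Hypothetical example values for illustration only:
assessment = CompatibilityAssessment(
    initial_purpose="clinical care",
    secondary_purpose="training an AI model for drug discovery",
    link_between_purposes="both concern patient health outcomes",
    data_subject_expectations="patients informed of possible research use",
    nature_of_data="special-category health data (Art. 9)",
    consequences_for_subjects="no individual-level decisions; aggregate findings only",
    safeguards=["pseudonymization", "encryption at rest", "opt-out mechanism"],
    outcome="compatible",
)

record = asdict(assessment)  # plain dict, ready to serialize and archive
```

Keeping such records in a versioned, queryable form makes it straightforward to evidence the assessment if a supervisory authority asks for it.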

Exceptions Under Article 9 of the GDPR for Processing Health Data

Even when a secondary use is considered compatible, processing health data additionally requires an exception under Article 9 of the GDPR, since health data is a special category of personal data. Potential exceptions include:

  • Explicit consent (art. 9(2)(a)).
  • Reasons of public interest in the area of public health based on Union or Member State law (art. 9(2)(i)).
  • Scientific research based on Union or Member State law with the adoption of appropriate safeguards (art. 9(2)(j)).

Conclusion

The use of patient data for developing or training AI systems in scientific research necessitates careful regulatory consideration. When patient data is involved, compliance with the GDPR becomes paramount. Organizations must assess the compatibility of secondary data use while adhering to all GDPR principles and obligations to ensure ethical and legal integrity in their research practices.
