Consent-Centric Data Challenges for AI Development in India

In the realm of Artificial Intelligence (AI), data is the key driving force behind training advanced AI models. Advanced AI systems such as Large Language Models (LLMs) thrive on large volumes of high-quality data. However, the Indian Digital Personal Data Protection (DPDP) Act and its rules, grounded in express, informed, and continuing consent, raise ethical and practical challenges. This article examines the implications of the DPDP Act’s consent-centric design for AI development, especially in sectors that depend on curated, proprietary data.

The Consent-Centric Data Governance

India’s DPDP Act represents a significant milestone in the country’s approach to data protection. The DPDP Rules require that personal data be collected on the basis of consent from the data principal, with limited carve-outs such as certain publicly available data. Unlike the European Union’s (EU) General Data Protection Regulation (GDPR) and Brazil’s Lei Geral de Proteção de Dados (LGPD), the framework recognises consent as essentially the sole processing basis, omitting alternative legal grounds such as contractual necessity and legitimate interests that give leading international data protection regimes their processing flexibility.

With AI development unfolding rapidly, consent as the basis for data protection works at cross purposes with the prevailing mode of collecting data to train large AI models. While the DPDP Act aims to protect individual rights by making data collection practices more transparent and accountable, this regulatory development comes at a time when AI developers increasingly need data that is not easily accessible to the public. Ernst & Young’s (EY) reports on sectoral AI development highlight how essential high-quality, carefully curated datasets are for effective LLM training, a finding corroborated by analyses of the specific challenges involved in developing generative models. The focus on explicit, granular consent poses a significant conundrum in this context. How can a consent-centric framework for data protection be reconciled with the data requirements of AI innovation?

The Conundrum of Curated Data for Sector-Specific AI

The foundation of AI systems like LLMs rests entirely on their training data. In critical sectors such as healthcare, banking, and online advertising, data collection follows regulated protocols, often drawing from exclusive sources inaccessible to the general public. Within the DPDP Act framework, a Consent Manager is an entity registered with the Data Protection Board of India that provides a transparent, accessible, and interoperable platform through which data principals can grant, manage, review, and revoke their consent, serving as the primary intermediary between individuals and businesses.
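The grant/manage/review/revoke lifecycle described above can be sketched as a minimal interface. This is a hypothetical illustration of the duties the Act assigns to Consent Managers, not an official API; all class and method names are assumptions for the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ConsentRecord:
    principal_id: str                      # the data principal granting consent
    purpose: str                           # the specific processing purpose consented to
    granted_at: datetime
    revoked_at: Optional[datetime] = None  # None while consent remains live

class ConsentManager:
    """Hypothetical sketch of the grant/review/revoke lifecycle a
    registered Consent Manager administers for data principals."""

    def __init__(self) -> None:
        self._records: list[ConsentRecord] = []

    def grant(self, principal_id: str, purpose: str) -> ConsentRecord:
        record = ConsentRecord(principal_id, purpose, datetime.now(timezone.utc))
        self._records.append(record)
        return record

    def review(self, principal_id: str) -> list[ConsentRecord]:
        # A data principal can inspect every consent they have given.
        return [r for r in self._records if r.principal_id == principal_id]

    def revoke(self, principal_id: str, purpose: str) -> None:
        # Consent is continuous: it can be withdrawn at any time.
        for r in self._records:
            if (r.principal_id == principal_id and r.purpose == purpose
                    and r.revoked_at is None):
                r.revoked_at = datetime.now(timezone.utc)

    def is_active(self, principal_id: str, purpose: str) -> bool:
        # A data fiduciary would check this before each processing run.
        return any(
            r.principal_id == principal_id and r.purpose == purpose
            and r.revoked_at is None
            for r in self._records
        )
```

The key point the sketch makes concrete is that consent is purpose-specific and revocable, so a training pipeline cannot treat it as a one-time checkbox; it must be re-verified before every processing run.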

However, this consent-based approach creates a fundamental tension in AI development. Requiring case-by-case consent significantly reduces the volume of available training data, creating a complex challenge with multiple dimensions. While consent-centric frameworks aim to build trust and ensure data subjects maintain control, they also introduce new problems for AI innovation. For instance, an additional layer of complexity arises at the intersection of data protection and copyright law. Recent cases highlight the legal issues that develop when curated data protected by copyrights is used to train AI models.

Given that LLMs need vast datasets to function, whether they can negotiate consent for each data element, including copyrighted content, is an enduring question. The tension between the need for comprehensive datasets and the stringent requirements for consent illustrates the challenges that lie ahead for AI developers in India.

Global Perspectives on Privacy and Innovation

Outside the Indian context, similar findings offer a different perspective on the balance between privacy and innovation. Reports indicate that applying consent-based models may prove difficult in today’s data-intensive AI environment, and scholars have argued that privacy can no longer rest on individual consent alone, as earlier frameworks presupposed. The conflict between data protection and data utility has been highlighted in various studies, suggesting that even though the DPDP Act’s consent-centric approach is ethically sound, insufficient flexibility in its implementation may hinder technological advancement.

A flexible framework, as envisioned in best practices such as those embodied in the EU AI Act, outlines responsible data governance and management practices and alternative anonymisation techniques that take context into account alongside identifiers. These approaches underline the importance of situating privacy protection within comprehensive risk assessment frameworks that account for vulnerabilities arising from data linkages and the risk interfaces they create.

Balancing Innovation with Ethical Imperatives

The challenge, therefore, is to balance two opposing and equally important goals. On one hand, ethical and legal imperatives protect an individual’s privacy by ensuring that consent is given knowingly and can be withdrawn at any time. On the other hand, AI development demands large, well-organised datasets. The further development of AI in India depends on the availability of structured data, accessed through specific mechanisms that conform to the DPDP Act.

Technological solutions, such as Consent Managers, make consent management more efficient while maintaining proper records and audit trails, though they add an extra layer of compliance. Blockchain technology can also make consent records tamper-evident and transparent. Combined with anonymisation techniques, these tools can help protect individual identities during data analysis, creating a data environment that respects the rights of data principals while supporting the development of AI technologies.
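The tamper-evident property that blockchain-backed consent registries rely on can be illustrated without a full blockchain: it suffices to hash-chain each consent event to its predecessor, so that altering any historical record invalidates every hash after it. The following is a minimal sketch of that idea; the `ConsentLedger` class and its event format are assumptions for illustration, not a production design.

```python
import hashlib
import json

class ConsentLedger:
    """Illustrative append-only ledger: each entry embeds the hash of the
    previous entry, so any edit to a historical consent record breaks the
    chain and is detected on verification. A sketch, not a real blockchain."""

    GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, event: dict) -> str:
        # Chain this event to the hash of the previous entry.
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self.entries.append({"event": event, "prev": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        # Recompute every hash; any tampering anywhere breaks the chain.
        prev = self.GENESIS
        for e in self.entries:
            payload = json.dumps(e["event"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True
```

An auditor can thus confirm that a fiduciary’s full consent history, including revocations, has not been rewritten after the fact, which is the audit-trail guarantee the paragraph above describes.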

However, policy adaptations are also crucial. Standardised consent templates could reduce consent fatigue for data principals and researchers alike. Sector-specific exemptions and regulatory sandboxes may be needed in industries whose business depends on curated data. Regulations could permit limited data sharing under conditions that protect individuals’ privacy and consent while providing LLMs and other AI systems with the quality data they need.

Conclusion

India’s consent-based data protection regime, while protecting individual rights through informed consent mechanisms, might create operational challenges for AI innovation. The balance between privacy protection and technological innovation will depend on identifying effective solutions like responsive risk-based regulatory frameworks, including sandboxes and exemptions, and moving towards industry-led methods. This will help policymakers and industry leaders collaboratively design an ethical framework conducive to AI-driven progress, ensuring that India remains at the forefront of responsible technological evolution.
