Building AI for Everyone: The Essential Role of Inclusive Data

Designing More Inclusive AI Starts with Data Architecture

Artificial intelligence (AI) must be built on inclusive and representative data, and that starts with access. Modernizing data infrastructure is essential for scaling AI responsibly and securely. Inclusive AI requires collaboration, governance, and a long-term commitment to ethical design.

As AI reshapes the global economy, it is becoming increasingly clear that AI is only as powerful, fair, and insightful as the data it learns from. Today, that data tells an incomplete story. In a world where nearly 2.6 billion people remain offline, the datasets underpinning AI systems do not yet reflect the full diversity of human experience.

Take language as an example. There are over 7,000 languages spoken globally, yet most AI chatbots are trained on only around 100 of them. English, spoken by less than 20% of the world’s population, accounts for nearly two-thirds of all web content and remains the primary driver of large language models (LLMs). This disparity is not just an issue of inclusion; it is fundamentally a data problem. Because digitization is uneven, many underrepresented languages lack structured digital content, making it difficult for AI systems to learn from them.

This data problem is more than a technical oversight — it is a societal risk. Without intentional design, AI will continue to exclude large segments of the world, reinforcing existing inequalities and missing out on the rich perspectives of underrepresented communities.

Closing this gap requires a fundamental shift in how we build AI. It means developing diverse tools and models tailored to different linguistic ecosystems. Regionally trained LLMs, like Jais, and open-source models, like Falcon, are promising examples of how to reflect the contextual nuances of non-English-speaking populations and bring cultural relevance into the heart of AI design.

Connectivity is Only the Starting Line

Inclusive AI begins with access. Nearly one-third of the global population still lacks reliable internet access, making those communities invisible to the algorithms driving our economies. While global connectivity has reached 68%, with 5.5 billion people online and 5G coverage now reaching 51% of the world's population, deep gaps remain. In low-income countries, 5G penetration is just 4%.
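
The connectivity figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the two numbers quoted in this article (5.5 billion people online, 68% connectivity); the variable names are illustrative, and the implied world population is derived from those figures rather than taken from a separate source.

```python
# Figures quoted in the article.
online_people = 5.5e9   # people online
online_share = 0.68     # share of the global population that is online

# Implied world population and the number of people still offline.
world_population = online_people / online_share
offline_people = world_population - online_people
offline_share = 1.0 - online_share

print(f"Implied world population: {world_population / 1e9:.1f} billion")
print(f"People offline: {offline_people / 1e9:.1f} billion ({offline_share:.0%})")
```

Under these assumptions, the offline population comes to roughly 2.6 billion people, or 32% of the total, which matches both the "nearly 2.6 billion" and the "nearly one-third" figures used in this article.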

To help close this gap, significant investments in expanding affordable network access are crucial. But inclusion doesn’t end with connection. Developing robust data infrastructure is the critical next step to ensure that those who come online are seen, heard, and represented in AI systems.

From Fragmentation to Federated Models

The MENA region is at a turning point. Governments have launched ambitious AI strategies, invested in infrastructure, and nurtured innovation ecosystems. AI adoption is accelerating, but trust in digital infrastructure is faltering. This signals a pressing need to move from siloed legacy systems to scalable, AI-ready data architectures. Federated data models, for example, offer a path forward, allowing organizations to share insights across borders without compromising privacy or ownership.
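
To make the federated idea concrete, here is a minimal sketch, assuming each organization shares only an aggregate summary (a record count and a sum) and never its raw records. The names LocalSummary, summarize_locally, and combine are hypothetical and chosen for illustration; they do not describe any particular platform or standard.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LocalSummary:
    """The only thing a participant shares: aggregates, not raw records."""
    count: int
    total: float


def summarize_locally(values: Iterable[float]) -> LocalSummary:
    """Runs inside each organization; raw values never leave this function."""
    values = list(values)
    return LocalSummary(count=len(values), total=sum(values))


def combine(summaries: List[LocalSummary]) -> float:
    """Runs at the coordinator; computes a global mean from shared aggregates."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no records were summarized")
    return sum(s.total for s in summaries) / n


# Hypothetical example: two organizations in different jurisdictions.
org_a = summarize_locally([4.0, 6.0])   # stays within organization A
org_b = summarize_locally([10.0])       # stays within organization B
global_mean = combine([org_a, org_b])
print(global_mean)
```

This is a sketch of the sharing pattern only. Aggregates can still leak information about small groups, so real federated deployments typically layer on further protections such as secure aggregation or differential privacy, along with governance agreements on who may query what.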

Trust is Designed, Not Assumed

Data privacy and regulatory compliance are significant barriers to AI adoption. As AI becomes more embedded in public services, finance, healthcare, and education, trust is paramount. Fragmented regulations, outdated governance structures, and escalating cybersecurity threats can slow progress and erode public confidence. Responsible AI must be coded into systems from day one, not added later as an afterthought.

The Power of Strategic Public-Private Partnerships

No single actor can build inclusive AI. Meaningful progress will only come from collaborative ecosystems, and public-private partnerships are essential to scaling responsible innovation. By bringing people online and empowering them to participate in the digital economy, we ensure their inclusion in the datasets and systems shaping our future.

Investing in Data Foundations is Non-Negotiable

Modernizing data architecture is now a higher priority for business leaders than even generative AI itself. AI is only as powerful as the data it can access. Poor-quality, fragmented, incomplete, or biased data leads to flawed insights and unreliable automation, widening inequality and increasing risk. A unified enterprise-wide data strategy underpinned by governance is key to unlocking AI at scale.

Inclusive AI Must Be Intentional

The future will be defined by those who invest early, and wisely, in the foundations of inclusive technology. This involves designing systems that not only serve the connected but also actively include those still on the margins. Nearly 2.6 billion people are still offline, leaving their voices unheard and experiences unseen. As we look to the future, we all share a responsibility to consider whose reality our AI reflects.

To truly serve all humanity, we must be deliberate in how we build AI and explicitly code inclusivity into its core system from the start.