Perspective

The UN Global Dialogue on AI Governance Should Tackle the AI Language Gap

Christian Schlaepfer / Jul 1, 2026

The United Nations will convene the Global Dialogue on AI Governance on July 6-7 in Geneva. The gathering is open to all UN member states and hundreds of stakeholders. Heated consultations on the agenda of the Dialogue have produced a draft program that is predictably generic. Such thematic breadth is unavoidable and largely by design in UN processes.

Nevertheless, the Dialogue is an opportunity to zoom in on policy areas that receive insufficient attention elsewhere, and to generate real political momentum. As the Dialogue will have a uniquely high concentration of representatives of diverse language communities, one of these issues is multilingual artificial intelligence.

Why should policymakers care about multilingual AI?

Current AI systems are less accessible, less useful and less safe for users of so-called low-resource languages. These are languages which may well be spoken by many but for which little digitized data is available to power large language models. Market forces have not mobilized the necessary investment to address these shortcomings. Fierce economic and geopolitical competition funnels attention and resources into the development of a narrow set of frontier models, which are optimized for a small set of dominant and well-resourced languages.

This imbalance results in a global inequity that warrants policymakers’ attention in and of itself. But it also creates a cumulative advantage for those benefiting from unfettered access to more powerful AI systems over time. Simply put, the longer the gap exists, the wider it becomes. If AI generates a net positive impact on socio-economic development, as is widely assumed, addressing this imbalance is urgent.

AI models underperform in low-resource languages

The systemic weaknesses of AI models in low-resource languages are well documented in a growing body of academic research.

Access to frontier AI capacity is limited by a lack of infrastructure and high cost, especially in low-income countries. And even for those with technical access, language can be a constraint. Frontier models are developed and optimized only in a few well-resourced languages, resulting in a “language gap.” The development of localized models optimized for specific linguistic or cultural objectives has seen progress in recent years, and some commercial labs are scaling up relevant research, but low-resource languages remain largely underserved.

Models today can process prompts in many languages. But outputs in low-resource languages are of poorer quality, and multilingual capacity alone does not remove all access barriers. Researchers point to a so-called ‘token tax’: access to models is usually billed by the number of tokens, the units into which natural language is split for processing. Studies have shown that low-resource and morphologically complex languages systematically require more tokens than English to represent the same content, which drives up cost and latency. In other words, using an AI model optimized for English in a low-resource language is more expensive and slower.
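To make the arithmetic behind the ‘token tax’ concrete, here is a minimal sketch in Python. It assumes simple per-token billing; the token counts, the price and the names `Usage`, `cost_usd` and `token_premium` are all hypothetical and chosen for illustration only, not measurements from any real model or provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Tokens a model needs to encode one piece of content in a given language."""
    language: str
    tokens: int


def cost_usd(usage: Usage, price_per_1k_tokens: float) -> float:
    """Cost of processing the content under simple per-token billing."""
    return usage.tokens / 1000 * price_per_1k_tokens


def token_premium(usage: Usage, baseline: Usage) -> float:
    """How many times more tokens the same content needs than the baseline."""
    return usage.tokens / baseline.tokens


# Hypothetical numbers, for illustration only (not real measurements).
english = Usage("English", tokens=1000)
other = Usage("low-resource language (hypothetical)", tokens=2500)
PRICE = 0.01  # hypothetical price in USD per 1,000 tokens

premium = token_premium(other, english)
extra = cost_usd(other, PRICE) - cost_usd(english, PRICE)
print(f"token premium: {premium:.1f}x, extra cost per document: ${extra:.4f}")
```

Under these assumed figures, the same content costs two and a half times as much to process as its English equivalent. Because generation time also grows with the number of tokens, the same ratio gives a rough sense of the added latency.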

AI systems are less useful for low-resource language users. Their performance is significantly lower when prompted in those languages. Simply translating prompts into English cannot fully close this gap. Some researchers have even identified performance differences for specific tasks prompted by native versus non-native speakers of English.

Similarly, studies show that models generally reason more effectively in English than in other languages. Beyond accuracy, pivoting to English for internal reasoning also risks producing outputs that are biased towards the linguistic and cultural norms encoded in English. Poorer representation of a language in training data and, consequently, in the concept space underpinning a model leads directly to weaker cultural and linguistic alignment.

The language gap in AI systems is a safety risk. Researchers have shown that it is possible to jailbreak AI systems — that is, to breach their safeguards — by machine-translating prompts into low-resource languages. In addition to such vulnerabilities, there is also a broader safety gap: studies have demonstrated that translating malicious input into a low-resource language is more likely to generate unsafe content. Most safety research is conducted in English and a handful of high-resource languages, and the alignment stage requires manually annotated data, which is even harder to come by for low-resource languages than unlabeled data used in pretraining.

What difference can the Dialogue make?

Models do not perform worse in low-resource languages because these languages are somehow categorically less suitable for LLMs; rather, the weaker performance results from specific model design choices and the lack or limited availability of training datasets. Until now, market forces have not redirected resources to address this, and are unlikely to do so. Policymakers can correct course by building momentum and awareness and designing policy incentives. The upcoming Dialogue at the UN is a timely opportunity to tackle the issue. It should consider the following:

  • Generate political momentum by reframing the challenge: multilingual performance of AI systems is commonly framed as an inequity issue or is subsumed under more general labels such as ‘responsible and inclusive AI’. This does not adequately highlight the safety concerns and cumulative disadvantages users are exposed to over time when using AI in low-resource languages. Participants at the Dialogue should frame investment in multilingual capacity as an urgent issue of national or regional interest, on a par with efforts to bolster ‘sovereign’ AI. Many countries and regions have started investing in local models in an effort to increase ownership and control over AI systems. Models optimized for local languages would make it faster and cheaper to train on local data and context.
  • Amplify technical solutions and help prioritize targeted R&D investment: researchers point to technical solutions on at least three tracks: more multilingual datasets, more engagement of human annotators from diverse linguistic and cultural backgrounds and more priority and compute for multilingual LLM research, especially on safety. Simply calling for more investment in thousands of underserved languages, however, is hardly practical; and neither is a purely national approach where every country launches its own initiative. But the Dialogue can articulate top priorities by clustering, aggregating, and organizing needs, interests and resources across national or regional borders. It can identify principles or criteria to determine where to begin: to decide which investment in which language would generate the most benefit for the most people. And it should serve as a platform to promote co-creation and co-design approaches, to better involve the communities this technology is intended to serve in the relevant research and development processes.
  • Share policy levers and incentives to strengthen multilingual capacity: researchers have suggested requiring more transparency from model developers, who would articulate and document the language coverage of their products explicitly. As pre-deployment testing is being institutionalized more broadly, the Dialogue presents a timely opportunity to press for the inclusion of language-specific vulnerability checks in these processes. But more broadly, representatives of low-resource language communities could use the Dialogue to strategize on how to leverage access to their markets and data to effect change in frontier model development. Permission to process local datasets, especially when mediated through official institutions such as archives, public broadcasters or cultural institutions, could be tied to commitments by model developers to improve multilingual performance and safety. Licensing agreements between media companies and AI labs are an example of how access to data can be leveraged. Similarly, public procurement processes could require specific investment in multilingual capacity as a condition for government contracting. The Dialogue is not the place to negotiate such arrangements, but to build alliances and coalitions that lead to better bargaining positions. The languages may vary across regions, but the underlying demand is the same.

By delivering on a tangible issue such as multilingual artificial intelligence, the UN’s first Global Dialogue on AI Governance has a shot at creating genuine added value, and proving its usefulness as a new platform. Tackling the systemic disadvantage faced by users of AI systems in low-resource languages addresses a real need that is not being met elsewhere. And by prioritizing this issue now, the Dialogue can help close a gap in access, usefulness and safety that, left unchecked, grows wider the longer it exists.

A dialogue platform has real limitations. It's unrealistic to expect it to boost capacity across all, or even most, languages spoken globally. The Dialogue can, however, reframe the issue as an urgent policy challenge that goes well beyond equity, articulate and prioritize the most promising areas of research and development, and aggregate political will and bargaining power to enable representatives of low-resource language communities to accelerate progress by influencing frontier model development more effectively. That would be a useful and impactful contribution.


Authors

Christian Schlaepfer
Christian Schlaepfer is a former Swiss diplomat and negotiator of tech and AI policy at the United Nations. He is a guest at the Institute for Logic, Language and Computation at the University of Amsterdam and policy advisor at the UN-focused think tank Starling Institute.
