Low-Resource Languages: How NLLB and Aya Are Closing the Gap
Of the world’s approximately 7,000 languages, commercial translation services like Google Translate and DeepL adequately serve perhaps 30-50. Billions of people speak languages that AI translation handles poorly or ignores entirely.
Two major projects are working to change this: Meta’s NLLB (No Language Left Behind) and Cohere for AI’s Aya initiative. This article examines what they have accomplished, how they work, and how far there is still to go.
Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.
The Low-Resource Problem
A language is considered “low-resource” in the context of machine translation when there is insufficient parallel text (translated sentence pairs) to train a high-quality translation model. The threshold is fuzzy, but roughly:
- High-resource: 10M+ parallel sentences (English, French, Spanish, German, Chinese)
- Medium-resource: 1M-10M parallel sentences (Korean, Thai, Vietnamese, Swahili)
- Low-resource: 100K-1M parallel sentences (Yoruba, Igbo, Nepali, Khmer)
- Very low-resource: Under 100K parallel sentences (most indigenous languages, many African and Asian languages)
The consequences of being low-resource are severe: speakers of these languages are excluded from the information economy, cannot access services in their language, and face barriers in education, healthcare, and governance.
NLLB-200: No Language Left Behind
Overview
NLLB-200 is Meta’s open-source translation model released in 2022, with ongoing improvements. It supports over 200 languages, making it the widest-coverage open translation model available.
Key specs:
- Languages: 200+ (including many with fewer than 1 million speakers)
- Model sizes: 600M, 1.3B, 3.3B parameters
- Architecture: Encoder-decoder transformer (based on M2M-100)
- License: CC-BY-NC 4.0 (research use) / MIT (code)
- Training data: CCMatrix, CCAligned, OPUS, WikiMatrix, plus newly mined parallel data
How NLLB Works
NLLB uses several techniques to achieve broad language coverage:
1. Massively multilingual training: Rather than building separate models for each language pair, NLLB trains a single model on all 200+ languages simultaneously. This allows knowledge transfer — patterns learned from high-resource languages help improve translation for related low-resource languages.
2. Automated parallel data mining: NLLB’s team developed tools (LASER3, stopes) to automatically find parallel sentences across the web. By comparing sentence embeddings across languages, they identified translation pairs in web-crawled data that were previously undiscovered.
3. Language-specific data auditing: For each language, the team verified that training data was actually in the claimed language (a common problem with web-crawled data) and filtered out noise and misaligned pairs.
4. Spill-over prevention: In massively multilingual models, high-resource languages can dominate, degrading performance on low-resource languages. NLLB uses temperature-based sampling to balance training across languages.
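The temperature-based sampling in point 4 can be sketched in a few lines of Python. The language codes and sentence counts below are made-up illustrations, not NLLB's actual data mix: each language is sampled with probability proportional to its share of the data raised to the power 1/T, so higher temperatures flatten the distribution toward uniform and give low-resource languages more training exposure.

```python
# Temperature-based sampling: P(language l) ∝ (n_l / N) ** (1 / T).
# T = 1 reproduces the raw data proportions; larger T flattens the
# distribution, so low-resource languages are seen more often.
def sampling_probs(sentence_counts, temperature):
    total = sum(sentence_counts.values())
    weights = {lang: (n / total) ** (1.0 / temperature)
               for lang, n in sentence_counts.items()}
    z = sum(weights.values())
    return {lang: w / z for lang, w in weights.items()}

# Illustrative counts only (not real corpus sizes).
counts = {"fra_Latn": 10_000_000, "swh_Latn": 1_000_000, "yor_Latn": 100_000}

raw = sampling_probs(counts, temperature=1.0)        # ~0.901 / 0.090 / 0.009
flattened = sampling_probs(counts, temperature=5.0)  # much closer to uniform
```

With T = 1, Yoruba would be sampled less than 1% of the time; at T = 5 its share rises substantially, at the cost of slightly less exposure for French.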
NLLB Performance by Language Tier
| Tier | Example Languages | BLEU (EN→X) | Quality Assessment |
|---|---|---|---|
| High-resource | Spanish, French, German | 35-42 | Good but below DeepL/Google |
| Medium-resource | Swahili, Vietnamese, Ukrainian | 25-35 | Competitive with Google |
| Low-resource | Yoruba, Igbo, Lao | 15-25 | Best available option |
| Very low-resource | Twi, Mossi, Luganda | 10-18 | Functional but limited |
For high-resource languages, NLLB is behind commercial systems — but that is not its purpose. Its value is in the long tail of languages where no commercial system provides adequate coverage.
Practical Use of NLLB
NLLB is open-source and can be deployed locally:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")
# src_lang tells the tokenizer the input language (FLORES-200 codes)
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", src_lang="eng_Latn")

inputs = tokenizer("How are you today?", return_tensors="pt")
# Force the decoder to start in the target language (here Yoruba)
tokens = model.generate(**inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("yor_Latn"))
print(tokenizer.batch_decode(tokens, skip_special_tokens=True)[0])
```

The model can be run on a single GPU for the smaller variants, or on CPU with reduced throughput.
Aya: Multilingual LLM Approach
Overview
Aya is Cohere for AI’s open-science initiative to build multilingual language models. Unlike NLLB, which is a dedicated translation model, Aya is a general-purpose multilingual LLM that can perform translation alongside other tasks.
Key specs:
- Languages: 101 (Aya 101), 23 (Aya 23 and Aya Expanse)
- Architecture: Decoder-only transformer
- Training data: Aya Dataset (human-curated multilingual instruction data) + Aya Collection (automated multilingual data)
- Key innovation: Community-driven data collection involving 3,000+ contributors from 119 countries
How Aya Differs from NLLB
| Aspect | NLLB-200 | Aya |
|---|---|---|
| Primary purpose | Translation | General multilingual AI |
| Architecture | Encoder-decoder | Decoder-only |
| Languages | 200+ | 101 (Aya 101) |
| Translation approach | Direct translation model | Instruction-following LLM |
| Customization | Fine-tuning | Prompting + fine-tuning |
| Other capabilities | Translation only | QA, summarization, reasoning, etc. |
| Data approach | Automated mining | Community + automated |
Aya’s Strength: Contextual Translation
Because Aya is an instruction-following LLM, it can handle translation tasks that NLLB cannot:
- Translate with context: “Translate this legal term in the context of Nigerian law”
- Explain translations: “Translate this sentence and explain why you chose that word”
- Adapt register: “Translate this into informal Nigerian Pidgin”
- Handle ambiguity: “This sentence is ambiguous — provide translations for both interpretations”
For low-resource languages that Aya supports, this contextual capability can produce better translations than NLLB for complex or ambiguous content, even if NLLB’s raw translation quality is comparable on simple sentences.
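One way to see the difference in practice is in how the request is framed. The sketch below builds the kind of contextual instruction described above; the template is an illustrative assumption, not an official Aya prompt format, and the resulting string would be sent to whatever Aya serving setup you use.

```python
# Illustrative prompt builder for contextual translation requests to an
# instruction-following model such as Aya. The template is an assumption
# for illustration, not an official Aya prompt format.
def translation_prompt(text, target_lang, context=None, register=None):
    parts = [f"Translate the following text into {target_lang}."]
    if context:
        parts.append(f"Context: {context}.")
    if register:
        parts.append(f"Use a {register} register.")
    parts.append(f"Text: {text}")
    return "\n".join(parts)

prompt = translation_prompt(
    "The defendant waived his right to counsel.",
    "Yoruba",
    context="a Nigerian legal proceeding",
    register="formal",
)
```

A dedicated translation model like NLLB has no channel for the `context` or `register` lines; an instruction-following model can condition on them directly.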
Other Projects Closing the Gap
Masakhane
A grassroots research community focused on NLP for African languages. Masakhane has produced translation models, datasets, and benchmarks for dozens of African languages. Their community-driven approach ensures that language speakers are involved in data creation and evaluation.
AmericasNLP
A research workshop and community focused on NLP for indigenous languages of the Americas. They organize shared tasks for machine translation of languages like Quechua, Guarani, Aymara, and Nahuatl.
OPUS-MT / Helsinki-NLP
The University of Helsinki maintains OPUS-MT, a collection of open-source translation models covering over 1,000 language pairs. While individual model quality varies, the breadth of coverage is valuable for low-resource pairs.
Google’s 1,000-Language Initiative
Google has announced a goal of building AI models that support 1,000 languages. Their Universal Speech Model and PaLM 2 efforts have expanded language coverage, though much of this work remains proprietary.
Challenges That Remain
Data Quality vs. Quantity
For low-resource languages, the available parallel data is often noisy — misaligned sentences, incorrect language labels, and low-quality translations. Simply having more data does not help if the data is unreliable. NLLB’s data auditing efforts partially address this, but it remains a fundamental challenge.
Evaluation Difficulty
How do you know if a translation into Yoruba or Lao is good? Automated metrics like BLEU require reference translations, which are scarce for low-resource languages. Human evaluation requires native speakers with translation expertise, who may be difficult to find and compensate fairly.
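To see concretely why reference scarcity bites, here is the modified n-gram precision at the heart of BLEU in a minimal pure-Python sketch. Real evaluations should use a maintained implementation such as sacreBLEU, which combines precisions for n = 1 through 4 with a brevity penalty; the point here is only that the metric cannot be computed without a reference translation.

```python
from collections import Counter

# Modified n-gram precision: each candidate n-gram is credited at most
# as many times as it appears in the reference, so repeating a correct
# word cannot inflate the score.
def modified_precision(candidate, reference, n):
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate.split()), ngrams(reference.split())
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

score = modified_precision("the cat sat on the mat",
                           "the cat is on the mat", n=1)  # → 5/6
```

Without a trusted reference sentence to pass as the second argument, the computation is simply undefined — which is exactly the situation for most low-resource languages.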
Dialect and Variety
Many “languages” encompass significant dialectal variation. “Arabic” includes dozens of regional varieties. “Chinese” includes Mandarin, Cantonese, and many others. Most translation systems target the standard/written variety, leaving speakers of other varieties poorly served.
Script and Encoding Issues
Some low-resource languages use scripts with incomplete Unicode support, complex rendering requirements, or multiple orthographic conventions. These technical issues can cause problems in data processing, model training, and output rendering.
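A common instance of the encoding problem is combining characters: the same visible letter can be stored as one precomposed code point or as a base letter plus a combining mark, and the two representations compare unequal. The standard-library sketch below shows the mismatch and the usual fix, Unicode normalization, applied before any corpus processing.

```python
import unicodedata

# "é" stored two ways: precomposed (U+00E9) vs. "e" + combining acute
# accent (U+0301). They render identically but are different strings,
# which silently splits word counts and corpus statistics.
precomposed = "\u00e9"
decomposed = "e\u0301"
print(precomposed == decomposed)  # False

# Normalizing both to a single form (NFC here) removes the mismatch.
nfc_a = unicodedata.normalize("NFC", precomposed)
nfc_b = unicodedata.normalize("NFC", decomposed)
print(nfc_a == nfc_b)  # True
```

For scripts with multiple orthographic conventions the choices are harder than a one-line normalize call, but inconsistent normalization alone is enough to degrade training data.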
Sustainability
Research projects like NLLB and Aya produce models, but who maintains them? As languages evolve and new content types emerge, models need updating. Sustainable funding and community engagement are essential for long-term impact.
Ethical Concerns
There are legitimate concerns about AI systems for indigenous and minority languages:
- Who controls the data and the models?
- Are language communities consulted and compensated?
- Could translation systems be used for surveillance or cultural homogenization?
- Are errors in sensitive contexts (medical, legal) adequately communicated?
How to Use Low-Resource Translation Today
For Developers
- Start with NLLB-200 for the widest language coverage.
- Try Aya for languages it supports, especially when contextual understanding matters.
- Fall back to Google Translate for languages it covers but NLLB handles poorly.
- Always communicate quality expectations — let users know that translation quality varies by language.
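The fallback chain above can be sketched as a simple routing function. The language codes and coverage sets below are placeholders for illustration, not real support lists — check each system's documented language coverage before relying on it.

```python
# Illustrative coverage sets (placeholders, not actual support lists).
NLLB_LANGS = {"yor", "ibo", "lo", "sw", "fr"}
AYA_LANGS = {"yor", "sw", "fr"}
GOOGLE_LANGS = {"sw", "fr", "lo"}

def pick_backend(lang, needs_context=False):
    """Route a translation request per the guidance above."""
    if needs_context and lang in AYA_LANGS:
        return "aya"               # instruction-following, context-aware
    if lang in NLLB_LANGS:
        return "nllb-200"          # widest coverage
    if lang in GOOGLE_LANGS:
        return "google-translate"  # fallback for covered languages
    return None                    # surface "unsupported" to the user

pick_backend("yor")                      # → "nllb-200"
pick_backend("yor", needs_context=True)  # → "aya"
```

Returning `None` rather than a best-effort guess keeps the quality expectation explicit, per the last point above.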
For Organizations
- Identify your actual language needs — which low-resource languages do your users or customers speak?
- Test quality on representative content before deploying.
- Combine AI with human review for anything important.
- Contribute back — if you create quality translations, consider contributing them to open datasets.
For Researchers
- Contribute to data collection efforts through Masakhane, AmericasNLP, or the Aya initiative.
- Build evaluation resources — reference translations and human evaluation protocols for underserved languages.
- Focus on real-world impact — work with communities to understand their actual translation needs.
Key Takeaways
- NLLB-200 is the most comprehensive translation model for low-resource languages, covering 200+ languages with open-source availability. Its strength is breadth of coverage.
- Aya brings contextual, instruction-following capabilities to multilingual AI, covering 101 languages with the ability to handle nuanced translation tasks.
- Despite progress, translation quality for most low-resource languages remains significantly below what is available for major languages. Data scarcity is the fundamental bottleneck.
- Community-driven efforts (Masakhane, Aya contributors, AmericasNLP) are essential because they bring language expertise that no amount of engineering can replace.
- Ethical considerations — community consent, data ownership, fair compensation — must be central to low-resource language technology development.
Next Steps
- Try NLLB-200: Set it up locally with our How to Set Up NLLB-200 Locally tutorial.
- Compare NLLB with alternatives: Read NLLB-200 vs Google Translate: Accuracy by Language Pair for a detailed comparison.
- Explore the Aya model: See our Aya Model: 101-Language Translation Review for a comprehensive review.
- Find the best tool for rare languages: Check Best Translation AI for Rare/Low-Resource Languages for recommendations.
- See all language pair rankings: Visit the Translation Accuracy Leaderboard by Language Pair.