Aya Model: 101-Language Translation Review
Aya is Cohere for AI’s open-science initiative to build multilingual AI that works across 101 languages. Unlike dedicated translation models, Aya is a general-purpose multilingual LLM — it can translate, answer questions, summarize, and reason across its supported languages. This review evaluates its translation capabilities specifically.
Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.
What Is Aya?
Aya is a family of models built by a global community of over 3,000 contributors across 119 countries. Key releases include:
- Aya Dataset: 513,000 human-curated multilingual instruction-response pairs
- Aya Collection: Larger automated dataset covering 114 languages
- Aya 23: Models fine-tuned on 23 languages (8B and 35B parameters)
- Aya Expanse: Extended coverage model supporting 101 languages
Architecture
Aya models are decoder-only transformers (based on the Command R architecture) fine-tuned for multilingual instruction following. This is a fundamentally different approach from NLLB-200’s encoder-decoder translation architecture.
Supported Languages
Aya Expanse covers 101 languages, including many that are typically underserved: Yoruba, Igbo, Hausa, Swahili, Amharic, Somali, Zulu, Malagasy, Cebuano, Javanese, Sundanese, and many more.
Translation Quality Assessment
Comparison with NLLB-200
| Language Pair | Aya Expanse (BLEU) | NLLB-200 3.3B (BLEU) | Winner |
|---|---|---|---|
| EN → ES | 40.8 | 39.7 | Aya (+1.1) |
| EN → FR | 40.2 | 39.4 | Aya (+0.8) |
| EN → DE | 37.1 | 36.4 | Aya (+0.7) |
| EN → SW | 23.1 | 22.5 | Aya (+0.6) |
| EN → YO | 16.8 | 17.3 | NLLB (+0.5) |
| EN → IG | 15.2 | 15.9 | NLLB (+0.7) |
| EN → HA | 18.5 | 17.9 | Aya (+0.6) |
Pattern: Aya slightly outperforms NLLB on medium-resource and some high-resource languages. NLLB has a small edge on very low-resource languages, likely due to its broader language set and translation-specific optimization.
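The BLEU scores above measure n-gram overlap between a system translation and a human reference. Published scores use standardized tools such as sacreBLEU; the sketch below is a deliberately simplified single-sentence version (modified n-gram precision with a brevity penalty, no smoothing) meant only to show what the metric rewards.

```python
import math
from collections import Counter

def simple_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU: geometric mean of modified
    n-gram precisions (n = 1..max_n) times a brevity penalty.
    Illustrative only -- real evaluations use sacreBLEU with smoothing."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        # Clip each n-gram's count by its count in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # floor avoids log(0)
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n) * 100
```

A perfect match scores 100; a candidate sharing no n-grams with the reference scores near 0. Note that BLEU rewards surface overlap, which is one reason it can undervalue the freer, more natural phrasings an LLM like Aya produces.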
Comparison with Commercial Systems
| Language Pair | Aya Expanse | Google Translate | GPT-4 |
|---|---|---|---|
| EN → ES | 40.8 | 42.3 | 43.5 |
| EN → SW | 23.1 | 23.2 | 24.8 |
| EN → YO | 16.8 | 14.8 | 15.2 |
Pattern: Aya is behind commercial systems for high-resource languages but competitive or ahead for low-resource African languages — particularly versus Google Translate.
Aya’s Unique Strengths
1. Contextual Translation
As an instruction-following LLM, Aya can handle contextual translation that pure translation models cannot:
> Translate the following Yoruba proverb into English,
> preserving its cultural meaning rather than translating literally:
> "Àgbà kì í wà lójà, kí orí ọmọ títún wó."
Aya can provide both a translation and a cultural explanation, something NLLB-200 cannot do.
2. Multi-Task Multilingual
Aya can translate, summarize, answer questions, and generate content in any of its 101 languages. For applications that need more than just translation, Aya provides a single model rather than requiring separate systems.
3. Community-Curated Data
Aya’s training data was partly curated by native speakers from 119 countries, reducing the bias toward English-centric or web-crawled data that affects many models. This community approach helps capture natural language patterns that automated data collection misses.
4. Cultural Sensitivity
The community curation process included cultural context that helps Aya handle culturally sensitive topics more appropriately across different language communities.
Aya’s Limitations
1. Fewer Languages Than NLLB
Aya covers 101 languages compared to NLLB’s 200+. If you need a language in NLLB’s set but not Aya’s, NLLB is the only option.
2. Not Translation-Optimized
Aya is a general-purpose model. For pure, high-volume translation, a dedicated translation model will typically be more efficient in terms of speed and compute cost.
3. Hallucination Risk
Like any LLM, Aya can occasionally add information not present in the source text or produce plausible-sounding but incorrect translations. This risk is lower in dedicated translation models like NLLB.
4. Larger Compute Requirements
Aya’s models are larger than NLLB’s smallest variants, requiring more GPU memory and compute for deployment.
5. Instruction Sensitivity
Translation quality depends on how you prompt Aya. Poorly worded prompts can produce suboptimal results. NLLB, by contrast, takes source text and language codes — no prompt engineering required.
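To make the contrast concrete: NLLB's interface is just a pair of FLORES-200 language codes plus the source text. The snippet below shows a small subset of those codes (the real set covers 200+ languages) and a minimal request-building helper; the helper function itself is a hypothetical sketch, not part of the NLLB tooling.

```python
# A small subset of the FLORES-200 language codes NLLB uses.
NLLB_CODES = {
    "English": "eng_Latn",
    "Yoruba": "yor_Latn",
    "Igbo": "ibo_Latn",
    "Hausa": "hau_Latn",
    "Swahili": "swh_Latn",
}

def nllb_request(text: str, src: str, tgt: str) -> tuple[str, str, str]:
    """Build the (source_code, target_code, text) triple an NLLB-style
    system consumes. Raises KeyError for unsupported languages -- there
    is no prompt wording to tune, only codes to look up."""
    return NLLB_CODES[src], NLLB_CODES[tgt], text
```

With Aya, by contrast, the entire prompt is free text, so small wording changes can shift output quality.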
When to Use Aya
- Mid-resource languages where NLLB quality is mediocre: Aya’s LLM architecture can sometimes produce more natural translations for languages with moderate training data.
- Context-dependent translation: When you need to provide domain context, audience information, or translation instructions.
- Multi-task multilingual applications: When your application needs translation plus other NLP capabilities in the same languages.
- Community-supported languages: For languages where Aya’s community curation provides better training data than automated methods.
When to Use NLLB Instead
- Maximum language coverage: 200+ languages vs 101.
- High-volume translation: NLLB is faster and more efficient for pure translation.
- Consistency: NLLB produces deterministic output for a given input. As an LLM, Aya's output can vary between runs unless sampling is disabled.
- Very low-resource languages: NLLB has a slight edge for the most underserved languages.
- Simple integration: No prompt engineering needed; NLLB takes only source text and language codes.
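The two checklists above reduce to a few rules of thumb. The picker below encodes this review's guidance as a hypothetical helper (the flag names and tie-breaking order are ours, not an official decision procedure):

```python
def recommend_model(needs_context: bool = False, multi_task: bool = False,
                    language_in_aya: bool = True, high_volume: bool = False) -> str:
    """Rule-of-thumb model picker encoding this review's guidance.
    Illustrative only; real deployments should also weigh compute
    budget, latency targets, and per-language quality checks."""
    if not language_in_aya:
        return "NLLB-200"      # only NLLB covers the language (200+ vs 101)
    if needs_context or multi_task:
        return "Aya Expanse"   # instruction-following flexibility wins
    if high_volume:
        return "NLLB-200"      # faster, cheaper for pure translation
    return "either"            # comparable quality; decide on infrastructure
```

For instance, a high-volume English-to-Swahili pipeline with no contextual requirements lands on NLLB-200, while a chatbot that translates and summarizes in the same languages lands on Aya Expanse.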
Key Takeaways
- Aya brings the flexibility of an instruction-following LLM to multilingual AI, enabling contextual and multi-task capabilities that dedicated translation models lack.
- For pure translation quality, Aya is competitive with NLLB-200 for medium-resource languages and slightly behind for very low-resource ones.
- Aya’s community-driven data curation is a genuine differentiator, producing training data that reflects natural language use across diverse cultures.
- Choose Aya when you need contextual translation or multi-task multilingual capabilities. Choose NLLB when you need maximum language coverage or efficient high-volume translation.
Next Steps
- Compare with NLLB-200: Read NLLB-200 vs Google Translate: Accuracy by Language Pair for NLLB’s positioning.
- Explore low-resource translation: See Low-Resource Languages: How NLLB and Aya Are Closing the Gap.
- Find the best tool for rare languages: Check Best Translation AI for Rare/Low-Resource Languages.
- Try translation models: Use the Translation AI Playground: Compare Models Side-by-Side.
- Full model comparison: Read Best Translation AI in 2026: Complete Model Comparison.