English to Yoruba: AI Translation Comparison

Yoruba is spoken by over 45 million people, primarily in southwestern Nigeria, Benin, and Togo, with significant diaspora communities worldwide. It is a tonal language with three lexical tones (high, mid, low) that are essential for meaning: the same sequence of consonants and vowels can produce entirely different words depending on tone. Demand for English-to-Yoruba translation is driven by Nigerian government services, education, media, Nollywood film subtitling, and digital content localization for Nigeria’s large online population.

This comparison evaluates five leading AI translation systems on English-to-Yoruba accuracy, naturalness, and suitability for different use cases.

Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.

Accuracy Comparison Table

System	BLEU Score	COMET Score	Editorial Rating (1-10)	Best For
Google Translate	18.3	0.724	5.4	General-purpose, broadest data
DeepL	14.6	0.691	4.4	Very limited Yoruba support
GPT-4	20.9	0.742	5.9	Contextual accuracy, tonal awareness
Claude	18.8	0.728	5.5	Long-form content
NLLB-200	22.1	0.753	6.2	Best Yoruba coverage, self-hosted
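One quick sanity check on a table like this is whether the three measures agree on the ordering of systems. The sketch below copies the scores from the table and ranks the systems by each metric; the variable and function names are illustrative only, not part of any evaluation toolkit.

```python
# Scores copied from the table above: (BLEU, COMET, editorial rating).
SCORES = {
    "Google Translate": (18.3, 0.724, 5.4),
    "DeepL": (14.6, 0.691, 4.4),
    "GPT-4": (20.9, 0.742, 5.9),
    "Claude": (18.8, 0.728, 5.5),
    "NLLB-200": (22.1, 0.753, 6.2),
}
METRICS = ("BLEU", "COMET", "Editorial")


def ranking(metric_index):
    """Return system names sorted from best to worst on one metric."""
    return sorted(SCORES, key=lambda name: SCORES[name][metric_index], reverse=True)


# Map each metric name to its best-to-worst ordering of systems.
rankings = {metric: ranking(i) for i, metric in enumerate(METRICS)}

for metric, order in rankings.items():
    print(f"{metric:>9}: {' > '.join(order)}")

# True when every metric produces the same ordering.
metrics_agree = len({tuple(order) for order in rankings.values()}) == 1
print("All metrics agree on the ranking:", metrics_agree)
```

For these figures the three orderings coincide, so the choice of metric does not change which system comes out ahead; it only changes the size of the gaps.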

See also: Translation Quality Metrics: BLEU, COMET, and Human Evaluation Explained

Example Translations

Formal Business Email

Source: “We are pleased to inform you that your application has been approved. Please find the relevant documentation attached.”

System	Translation
Google	A ni idunnu lati fi to yin leti pe ohun elo yin ti je itewogba. Ejowo, e wa awon iwe afikun ti a so po.
DeepL	A ni idunnu lati so fun yin pe ohun elo yin ti gba ifọwọsi. Jọwọ wa awọn iwe ti o yẹ ti a so mọ.
GPT-4	A ni inudidun lati fi to yin leti pe iwe ibere yin ti gba ifọwọsi. Ẹ jọwọ wo awọn iwe afikun ti a so pọ mọ lẹta yii.
Claude	A ni inudidun lati fi to yin leti pe ohun elo yin ti je itẹwọgba. Ẹ jọwọ, ẹ wo awọn iwe to yẹ ti a fi kun.
NLLB-200	A ni inudidun lati fi to yin leti pe iwe ibere yin ti gba ifọwọsi. Ẹ jọwọ wo awọn iwe afikun ti a so pọ mọ.

Assessment: NLLB-200 and GPT-4 produce the most natural formal Yoruba and include subdots (ẹ, ọ). Google’s output omits diacritics entirely, which significantly impairs readability and can change word meanings. None of the sample outputs shown here carries tone marks, so even the strongest of them is only partially marked. The presence of proper Yoruba diacritics (ẹ, ọ, ṣ, plus tone marks) is a basic quality indicator for this language.
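The diacritic check described above can be roughly automated. The following is a minimal sketch, assuming that counting combining marks after Unicode NFD normalization is a good enough proxy: U+0323 for the subdot, and U+0300 / U+0301 for low and high tone marks. The function name `diacritic_counts` is hypothetical, and the two sample strings are the Google and GPT-4 outputs from the table.

```python
import unicodedata

DOT_BELOW = "\u0323"               # combining dot below, as in ẹ, ọ, ṣ
TONE_MARKS = {"\u0300", "\u0301"}  # combining grave (low tone) and acute (high tone)


def diacritic_counts(text):
    """Count base letters, subdots, and tone marks in a string."""
    # NFD splits precomposed characters such as ọ into "o" + U+0323,
    # so the marks can be counted independently of how the text was typed.
    decomposed = unicodedata.normalize("NFD", text)
    return {
        "letters": sum(1 for ch in decomposed if ch.isalpha()),
        "subdots": decomposed.count(DOT_BELOW),
        "tone_marks": sum(1 for ch in decomposed if ch in TONE_MARKS),
    }


google_output = (
    "A ni idunnu lati fi to yin leti pe ohun elo yin ti je itewogba. "
    "Ejowo, e wa awon iwe afikun ti a so po."
)
gpt4_output = (
    "A ni inudidun lati fi to yin leti pe iwe ibere yin ti gba ifọwọsi. "
    "Ẹ jọwọ wo awọn iwe afikun ti a so pọ mọ lẹta yii."
)

for name, sample in [("Google", google_output), ("GPT-4", gpt4_output)]:
    print(name, diacritic_counts(sample))
```

A count of zero for either kind of mark on a text of any length is a reasonable trigger for human review. The check says nothing about whether the marks that are present are the right ones.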

Casual Conversation

Source: “Hey, I was thinking we could grab some food later. What do you feel like eating?”

System	Translation
Google	Hey, mo n ro pe a le lo mu ounje nigbamii. Kini o fe je?
DeepL	Hey, mo n ro pe a le gba ounje nigbamii. Kini o fe je?
GPT-4	Ẹ n lẹ, mo n ro pe a le lọ jẹun nigbamii. Kini o fẹ jẹ?
Claude	Bawo, mo n ro pe a le lọ mu ounjẹ nigbamii. Kini o fẹ jẹ?
NLLB-200	Mo n ro pe a le lọ mu ounjẹ nigbamii. Kini o fẹ jẹ?

Assessment: GPT-4 and Claude render the greeting in Yoruba instead of leaving “Hey” untranslated. Claude’s “Bawo” is a natural casual greeting. Google and DeepL keep “Hey” as is, which is in fact common in modern Nigerian Yoruba and reflects English influence. NLLB-200 drops the greeting altogether. All systems struggle to fully capture casual Nigerian Yoruba, which is heavily influenced by code-switching with English.

Technical Content

Source: “The API endpoint accepts POST requests with a JSON body containing the source text and target language code.”

System	Translation
Google	API endpoint gba awon ibere POST pelu JSON body ti o ni orisun ọrọ ati koodu ede ibi-afẹde.
DeepL	Aaye ipari API gba awọn ibeere POST pẹlu ara JSON ti o ni ọrọ orisun ati koodu ede ibi-afẹde.
GPT-4	API endpoint n gba awọn POST requests pẹlu JSON body ti o ni source text ati target language code ninu.
Claude	API endpoint n gba awọn ibeere POST pẹlu JSON body ti o ni ọrọ orisun ati koodu ede ti a n wa.
NLLB-200	Aaye ipari API n gba awọn ibeere POST pẹlu ara JSON ti o ni ọrọ orisun ati koodu ede afojusun ninu.

Assessment: GPT-4 keeps English technical terms intact, reflecting how Nigerian Yoruba developers actually communicate. Google and Claude take a middle path, keeping “API endpoint” and “JSON body” in English while translating the rest. DeepL and NLLB-200 translate “endpoint” as “aaye ipari” (end point), which has no established usage in Yoruba tech communities.

See also: Best Translation AI for Technical Documentation
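The term-handling differences in the technical example can be made concrete by checking which English terms survive in each output. This is an illustrative sketch only: the term list is an assumption drawn from the source sentence, `retained_terms` is a hypothetical helper, and whole-word matching is a crude stand-in for a real glossary check.

```python
import re

# English terms from the source sentence that a team might want left untranslated.
# The list is an assumption for this example, not an established glossary.
TERMS = ["API", "endpoint", "POST", "JSON", "source text", "target language code"]


def retained_terms(translation, terms=TERMS):
    """Return the terms that appear verbatim (case-insensitive) in the translation."""
    lowered = translation.lower()
    return [
        term for term in terms
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
    ]


gpt4_output = (
    "API endpoint n gba awọn POST requests pẹlu JSON body ti o ni "
    "source text ati target language code ninu."
)
nllb_output = (
    "Aaye ipari API n gba awọn ibeere POST pẹlu ara JSON ti o ni "
    "ọrọ orisun ati koodu ede afojusun ninu."
)

print("GPT-4   :", retained_terms(gpt4_output))
print("NLLB-200:", retained_terms(nllb_output))
```

On these two samples, the GPT-4 output retains all six terms, while the NLLB-200 output retains only the acronyms and the HTTP method. That matches the over-translation pattern noted in the assessment.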

Strengths and Weaknesses

Google Translate

Strengths: Accessible and free. Benefits from Nigerian web content. Weaknesses: Frequently omits Yoruba diacritics (subdots and tone marks), which can change word meanings entirely. Scores below GPT-4, Claude, and NLLB-200 on all three measures for this pair, though above DeepL.

DeepL

Strengths: Basic grammatical structure is usually correct. Weaknesses: Very limited Yoruba support. Lowest quality overall. Poor diacritical mark handling. Unnatural word choices.

GPT-4

Strengths: Best contextual understanding. Better diacritical mark usage than Google. Can handle English-Yoruba code-switching naturally. Best tonal awareness in text output. Weaknesses: Expensive. Not always consistent with tone marks across longer texts.

Claude

Strengths: Consistent output for long documents. Reasonable diacritical mark handling. Weaknesses: Less natural than GPT-4 for idiomatic Yoruba. Limited casual register capability.

NLLB-200

Strengths: Best free option for Yoruba by a significant margin. Meta’s NLLB project made Yoruba a priority language. Consistent diacritical mark usage. Self-hostable. Weaknesses: No register control. Over-translates English terms in technical content. Cannot handle code-switching.
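For readers weighing the self-hosted route, here is a minimal sketch of running NLLB-200 locally with the Hugging Face `transformers` library. It assumes `transformers` and PyTorch are installed and that the `facebook/nllb-200-distilled-600M` checkpoint is an acceptable starting point. The checkpoint choice, the function names, and the generation settings are illustrative, and this is not necessarily the setup used for the scores in this comparison.

```python
MODEL_NAME = "facebook/nllb-200-distilled-600M"  # assumed checkpoint; larger variants exist
SOURCE_LANG = "eng_Latn"  # FLORES-200 code for English
TARGET_LANG = "yor_Latn"  # FLORES-200 code for Yoruba


def load_translator(model_name=MODEL_NAME):
    """Download (on first use) and load the tokenizer and model."""
    # Imported lazily so this module can be loaded without the heavy dependency.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang=SOURCE_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return tokenizer, model


def translate(text, tokenizer, model, max_new_tokens=128):
    """Translate one English string into Yoruba."""
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # NLLB selects the output language via the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TARGET_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


# Example usage (downloads the model weights on first run):
#   tokenizer, model = load_translator()
#   print(translate("Your application has been approved.", tokenizer, model))
```

There is no parameter here for formality or register, which is consistent with the "no register control" weakness listed above. Any such control would have to come from pre- or post-processing.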

Recommendations

Use Case	Recommended System
Quick personal translation	Google Translate (free, but verify diacritics)
Government / official documents	GPT-4 with human review
Educational material	NLLB-200
Media / Nollywood subtitles	GPT-4
Technical documentation	GPT-4
High-volume, cost-sensitive	NLLB-200 (self-hosted)
Long-form content	Claude


Key Takeaways

NLLB-200 posts the highest BLEU, COMET, and editorial scores in the table and is the best free option, outperforming Google Translate on Yoruba thanks to Meta’s targeted investment in African languages. GPT-4 offers the strongest contextual handling, at a cost.
Diacritical marks are not optional in Yoruba: they distinguish different words. Systems that omit subdots and tone marks produce ambiguous or incorrect text.
Code-switching between English and Yoruba is extremely common in Nigerian communication. GPT-4 handles this best; NMT systems struggle with mixed-language input.
All systems produce lower quality for Yoruba than for high-resource languages. Human review is essential for published content.

Next Steps

Try it yourself: Compare these systems on your own text in the Translation AI Playground: Compare Models Side-by-Side.
Low-resource languages: Learn more in Low-Resource Languages: Where NLLB and Aya Shine.
Check the leaderboard: Browse our full Translation Accuracy Leaderboard by Language Pair.
Full model comparison: Read Best Translation AI in 2026: Complete Model Comparison.