Pashto to Arabic: AI Translation Comparison

Pashto is spoken by approximately 50 million people, primarily in Afghanistan and the northwestern regions of Pakistan, with significant diaspora communities in the Gulf states, Iran, and Western countries. Arabic has over 400 million speakers across more than 20 countries. Both languages are written in the Arabic script (Pashto uses a modified Arabic alphabet with additional letters), and Pashto has absorbed extensive Islamic religious vocabulary from Classical Arabic. Structurally the two are unrelated: Pashto is an Eastern Iranian language (Indo-European family) with SOV word order, grammatical gender, and a complex verb system, while Arabic is Semitic (Afro-Asiatic) with VSO default word order and a root-and-pattern morphological system. Translation demand is driven by religious scholarship, migrant worker communications (large Afghan and Pashtun communities in Gulf states), humanitarian aid documentation, and Islamic educational materials.

This comparison evaluates five leading AI translation systems on Pashto-to-Arabic accuracy, naturalness, and suitability for different use cases.

Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.

Accuracy Comparison Table

System	BLEU Score	COMET Score	Editorial Rating (1-10)	Best For
Google Translate	20.6	0.719	5.0	General-purpose, free access
DeepL	16.3	0.681	4.2	Not recommended (no native Pashto support)
GPT-4	25.4	0.764	6.2	Contextual accuracy, religious content
Claude	23.1	0.745	5.8	Long-form content, humanitarian texts
NLLB-200	26.8	0.779	6.5	Direct translation path, self-hosted
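
The three quality columns in the table can be cross-checked against one another. The sketch below is a minimal illustration that re-enters the figures from the table, ranks the systems by each metric, and checks whether the orderings agree. The names `SCORES`, `METRICS`, and `rank_by` are illustrative and not part of any evaluation toolkit.

```python
# Scores copied from the accuracy table: (BLEU, COMET, editorial rating).
SCORES = {
    "Google Translate": (20.6, 0.719, 5.0),
    "DeepL": (16.3, 0.681, 4.2),
    "GPT-4": (25.4, 0.764, 6.2),
    "Claude": (23.1, 0.745, 5.8),
    "NLLB-200": (26.8, 0.779, 6.5),
}
METRICS = ("BLEU", "COMET", "Editorial")


def rank_by(metric: str) -> list:
    """Return system names ordered best-first for the given metric."""
    idx = METRICS.index(metric)
    return sorted(SCORES, key=lambda name: SCORES[name][idx], reverse=True)


# One ranking per metric, then a check that all rankings are identical.
rankings = {metric: rank_by(metric) for metric in METRICS}
all_agree = len({tuple(order) for order in rankings.values()}) == 1

for metric, order in rankings.items():
    print(f"{metric:>9}: {' > '.join(order)}")
print("Rankings agree across metrics:", all_agree)
```

On these figures all three columns produce the same ordering, with NLLB-200 first and DeepL last.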

For background on these metrics, see Translation Quality Metrics: BLEU, COMET, and Human Evaluation Explained.

Example Translations

Formal Humanitarian Communication

Source (Pashto): “د افغانستان د بیا رغونې لپاره نړیوالې مرستې ته اړتیا ده. د روغتیایي خدمتونو ته لاسرسی او د زده کړو بیا پیل کول لومړیتوبونه دي.”

System	Translation
Google	هناك حاجة إلى المساعدات الدولية لإعادة إعمار أفغانستان. الوصول إلى الخدمات الصحية واستئناف التعليم من الأولويات.
DeepL	هناك حاجة للمساعدات الدولية لإعادة بناء أفغانستان. الحصول على الخدمات الصحية وإعادة بدء التعليم من الأولويات.
GPT-4	تحتاج أفغانستان إلى دعم المجتمع الدولي من أجل إعادة الإعمار. ويُعدّ ضمان الوصول إلى الخدمات الصحية واستئناف العملية التعليمية من أبرز الأولويات الراهنة.
Claude	هناك حاجة إلى المساعدات الدولية لإعادة إعمار أفغانستان. يعد الوصول إلى الخدمات الصحية واستئناف التعليم من الأولويات.
NLLB-200	هناك حاجة إلى المساعدة الدولية لإعادة إعمار أفغانستان. الوصول إلى الخدمات الصحية واستئناف التعليم من الأولويات.

Assessment: GPT-4 produces the most polished humanitarian Arabic with “دعم المجتمع الدولي” (support of the international community), “ضمان الوصول” (ensuring access), and “العملية التعليمية” (educational process), all standard UN-style Arabic humanitarian language. NLLB-200 provides a clean, accurate direct translation, benefiting from dedicated Pashto-Arabic training data. DeepL routes through English as a pivot, and its wording is less idiomatic as a result: “إعادة بدء التعليم” (restarting education) where the other systems use “استئناف التعليم” (resuming education), and “إعادة بناء” (rebuilding) where they use “إعادة إعمار” (reconstruction).

Casual Conversation

Source (Pashto): “نن ډېره ګرمي ده پشاور کې! راځه چای وڅښو او کباب وخورو. د نمرې لاندې ناست يو تر ماښامه.”

System	Translation
Google	الجو حار جداً في بيشاور اليوم! تعال نشرب الشاي ونأكل الكباب. نجلس تحت الشمس حتى المساء.
DeepL	الطقس حار جداً في بيشاور اليوم! هيا نشرب الشاي ونتناول الكباب. نجلس تحت أشعة الشمس حتى المساء.
GPT-4	الجو حار جداً اليوم في بيشاور! يلا نروح نشرب شاي ونأكل كباب. نقعد في الظل لين المغرب.
Claude	الجو حار جداً في بيشاور اليوم! تعال نشرب الشاي ونأكل الكباب. نجلس تحت الشمس حتى المساء.
NLLB-200	الجو حار جداً في بيشاور اليوم! تعال نشرب الشاي ونأكل الكباب. نجلس تحت الشمس حتى المساء.

Assessment: GPT-4 captures the casual register with Gulf-dialect-influenced Arabic (“يلا نروح,” “نقعد,” “لين المغرب”), which is contextually appropriate since many Pashto speakers in the Arab world live in Gulf countries. GPT-4 also adapts the cultural logic, giving “نقعد في الظل” (we sit in the shade) rather than “تحت الشمس” (under the sun), on the reasoning that people seek shade in extreme heat. Note that the Pashto “د نمرې لاندې” literally means “under the sun,” so this is an interpretive adaptation rather than a literal rendering, and it should be reviewed where fidelity to the source matters. The other systems translate the phrase literally.

Religious Text

Source (Pashto): “د صبر او شکر لار مو تعقیب کړئ. په نېکۍ او تقوا کې یو له بل سره مرسته وکړئ. الله تعالی هر عمل وینې.”

System	Translation
Google	اتبعوا طريق الصبر والشكر. ساعدوا بعضكم البعض في الخير والتقوى. الله تعالى يرى كل عمل.
DeepL	اسلكوا طريق الصبر والامتنان. ساعدوا بعضكم البعض في فعل الخير والتقوى. الله تعالى يرى كل فعل.
GPT-4	اسلكوا سبيل الصبر والشكر. وتعاونوا على البرّ والتقوى. إن الله تعالى بصير بكل عمل تعملونه.
Claude	اتبعوا طريق الصبر والشكر. ساعدوا بعضكم البعض في الخير والتقوى. الله تعالى يرى كل عمل.
NLLB-200	اتبعوا طريق الصبر والشكر. ساعدوا بعضكم في الخير والتقوى. الله تعالى يرى كل عمل.

Assessment: GPT-4 excels with Quranic-style Arabic: “وتعاونوا على البرّ والتقوى” directly echoes Quran 5:2, using the same vocabulary and structure. “إن الله تعالى بصير بكل عمل تعملونه” uses the divine attribute “بصير” (All-Seeing), standard in Islamic discourse. DeepL’s “الامتنان” (gratitude) instead of “الشكر” (thankfulness/gratitude in Islamic context) misses the religious register. The shared Islamic vocabulary between Pashto and Arabic makes religious translation a strong domain for GPT-4 and NLLB-200.

Strengths and Weaknesses

Google Translate

Strengths: Free and accessible. Adequate for general content. Handles basic religious vocabulary.
Weaknesses: Produces overly literal translations. Misses cultural context. Limited Pashto training data affects quality.

DeepL

Strengths: Clean Arabic output structure.
Weaknesses: No native Pashto support. English pivot loses Islamic scholarly vocabulary. Translation errors on religious terms. Not recommended for this pair.

GPT-4

Strengths: Best contextual understanding. Excellent Islamic religious vocabulary. Handles humanitarian discourse well. Can match Arabic dialectal registers.
Weaknesses: Higher cost. Occasional over-elaboration of source text.

Claude

Strengths: Consistent quality for long documents. Reliable for humanitarian reports. Balanced output.
Weaknesses: Less natural than GPT-4 for religious content. Conservative approach to cultural adaptation.

NLLB-200

Strengths: Best automated metric scores due to dedicated Pashto-Arabic training path. Free and self-hostable. Strong direct translation without English pivot.
Weaknesses: Limited register flexibility. No contextual reasoning. Basic output without scholarly depth.
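
For readers who want to try the self-hosted, no-pivot path, the following is a minimal sketch using the Hugging Face `transformers` library. It assumes `transformers` and PyTorch are installed and uses the distilled 600M checkpoint as a stand-in; the checkpoint size, the `max_length` value, and the helper names `load_nllb` and `translate` are assumptions for illustration, not the setup used to produce the scores in this comparison.

```python
# Sketch of direct Pashto-to-Arabic translation with a self-hosted NLLB-200 checkpoint.
# Assumption: this checkpoint is a stand-in for whichever NLLB-200 size you deploy.
MODEL_NAME = "facebook/nllb-200-distilled-600M"
SRC_LANG = "pbt_Arab"  # FLORES-200 code: Southern Pashto, Arabic script
TGT_LANG = "arb_Arab"  # FLORES-200 code: Modern Standard Arabic


def load_nllb(model_name: str = MODEL_NAME):
    """Load tokenizer and model (imported lazily; downloads weights on first use)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return tokenizer, model


def translate(text: str, tokenizer, model, max_length: int = 256) -> str:
    """Translate one Pashto string to Arabic without an English pivot."""
    inputs = tokenizer(text, return_tensors="pt")
    # Forcing the first generated token to the target language code selects Arabic output.
    target_id = tokenizer.convert_tokens_to_ids(TGT_LANG)
    output_ids = model.generate(
        **inputs, forced_bos_token_id=target_id, max_length=max_length
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


# Example usage (not run here because it downloads the model):
# tokenizer, model = load_nllb()
# print(translate("نن ډېره ګرمي ده پشاور کې!", tokenizer, model))
```

The source and target are set by language code alone, so the same two functions cover other NLLB-200 pairs by changing `SRC_LANG` and `TGT_LANG`.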

Recommendations

Use Case	Recommended System
Islamic religious texts	GPT-4
Humanitarian aid documents	Claude or GPT-4
Migrant worker communications	NLLB-200 or Google Translate
Legal / asylum documents	Claude
High-volume translation	NLLB-200 (self-hosted)
Quick personal translation	Google Translate (free)
Educational materials	GPT-4
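
If you route documents to systems programmatically, the recommendations above can be expressed as a simple lookup. This is a sketch only: the `RECOMMENDED` mapping copies the table, while the `pick_system` helper and the choice to try systems in the order listed are assumptions about how you might apply it.

```python
# Recommendations copied from the table; each list keeps the order given there.
RECOMMENDED = {
    "Islamic religious texts": ["GPT-4"],
    "Humanitarian aid documents": ["Claude", "GPT-4"],
    "Migrant worker communications": ["NLLB-200", "Google Translate"],
    "Legal / asylum documents": ["Claude"],
    "High-volume translation": ["NLLB-200"],
    "Quick personal translation": ["Google Translate"],
    "Educational materials": ["GPT-4"],
}


def pick_system(use_case: str, available: set):
    """Return the first recommended system that is available, or None.

    Assumption: the table's left-to-right order is treated as a preference order.
    """
    for system in RECOMMENDED.get(use_case, []):
        if system in available:
            return system
    return None


# Example: only GPT-4 and NLLB-200 are deployed in this hypothetical setup.
deployed = {"GPT-4", "NLLB-200"}
print(pick_system("Humanitarian aid documents", deployed))
print(pick_system("Legal / asylum documents", deployed))
```

A `None` result signals that none of the recommended systems is available for that use case, which is a cue to fall back to human translation or review rather than an arbitrary substitute.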

Key Takeaways

NLLB-200 achieves the highest scores in the accuracy table (BLEU, COMET, and editorial rating) for Pashto-to-Arabic due to dedicated direct training data, while GPT-4 stands out in the example translations for contextual quality, particularly for religious and humanitarian content.
The shared Arabic script and extensive Islamic vocabulary create a strong lexical bridge between Pashto and Arabic, making religious and formal content translation more accurate than colloquial speech.
DeepL’s lack of native Pashto support and reliance on English pivoting make it unsuitable for this language pair, where the direct Islamic scholarly connection between source and target languages is critical.
Large Pashtun diaspora communities in Gulf states and ongoing humanitarian needs in Afghanistan drive significant and growing demand for reliable Pashto-Arabic translation tools.

Next Steps

Try it yourself: Compare these systems on your own text in the Translation AI Playground: Compare Models Side-by-Side.
Related pair: See how systems handle Persian to Arabic translation.
Check the leaderboard: Browse our full Translation Accuracy Leaderboard by Language Pair.
Full model comparison: Read Best Translation AI in 2026: Complete Model Comparison.