Iraqi Arabic call transcription

Mesopotamian Arabic — distinct from Levantine, with a large GCC diaspora that matters for compliance and legal buyers.

Last updated: April 2026

Most ASR systems treat Iraqi Arabic, the Mesopotamian dialect cluster, as a Levantine sub-variety, which is wrong. Mesopotamian Arabic is a distinct dialect group with its own phonological inventory, lexical strata from Aramaic, Persian, and Turkish, and morphological patterns that Levantine-tuned models decode poorly. For GCC operations, Iraqi Arabic matters because the Iraqi diaspora across the UAE, Saudi Arabia, and Qatar, and in neighbouring Jordan, is large enough to register on operational call traffic, particularly in oil and gas services, legal compliance, and humanitarian/government-services flows. CallScribe ships an Iraqi-specific acoustic adapter rather than misrouting Baghdadi callers to the Levantine model.

Mesopotamian phonology — gilit vs. qeltu

Iraqi Arabic splits along the classical gilit/qeltu isogloss. Central and southern Iraqi (Baghdad, Basra, the marshlands) belongs to the gilit group, named after the realisation of "I said" as "gilit" rather than MSA "qultu". This group typically realises MSA ق (qāf) as [ɡ]: "gāl" not "qāl", "gabel" not "qabla". Northern Iraqi (Mosul, parts of Kirkuk, the Christian and Jewish dialects historically associated with these cities) belongs to the qeltu group, preserving [q] and using "qeltu" for "I said". A single Iraqi-tagged project may receive audio from both groups, and the acoustic model needs to handle both without forcing a choice upfront.
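As a rough illustration of how the isogloss surfaces in transcripts, the sketch below tallies gilit-type and qeltu-type forms in a romanised token list. It is a toy heuristic for after-the-fact analysis, not a description of how CallScribe's acoustic model works. The word sets contain only the forms cited above, and the names `tally_isogloss` and `guess_group` are hypothetical.

```python
from collections import Counter

# Diagnostic forms taken from the paragraph above (romanised).
# A real lexicon would be far larger; these sets are assumptions for the demo.
GILIT_FORMS = {"gilit", "gāl", "gabel"}
QELTU_FORMS = {"qeltu", "qāl"}


def tally_isogloss(tokens):
    """Count gilit-type and qeltu-type diagnostic forms in a token list."""
    counts = Counter()
    for token in tokens:
        word = token.lower()
        if word in GILIT_FORMS:
            counts["gilit"] += 1
        elif word in QELTU_FORMS:
            counts["qeltu"] += 1
    return counts


def guess_group(tokens):
    """Return 'gilit', 'qeltu', 'mixed', or 'unknown' for a transcript."""
    counts = tally_isogloss(tokens)
    if not counts:
        return "unknown"
    if counts["gilit"] and counts["qeltu"]:
        return "mixed"
    return "gilit" if counts["gilit"] else "qeltu"
```

A "mixed" result is expected on multi-party calls, which is one reason to avoid forcing a sub-regional choice before decoding.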

Other diagnostic features: ك (kāf) → [t͡ʃ] in front-vowel environments is shared with Khaleeji (compare [/dialects/khaleeji](/dialects/khaleeji)) and is one reason Iraqi audio is sometimes mistaken for Gulf rather than Levantine. The emphatics (ص ض ط ظ) are pharyngealised more strongly in Iraqi than in other Mashriqi varieties, with broader colouring effects on neighbouring vowels. Iraqi also preserves the ث/ذ/ظ interdental series in many words where Levantine merges them with [t/d/z], a small but operationally meaningful difference for transcription orthography.

Lexical strata: Aramaic, Persian, Turkish, English

Iraqi Arabic carries an Aramaic substrate denser than any other Mashriqi variety — Mesopotamia was Aramaic-speaking for over a millennium before Arabization, and substrate vocabulary persists in everyday domains (kinship, agriculture, religious terminology). Above that sits a Persian superstrate from the Sassanid and later Safavid periods (especially in Basra and southern Iraq) and a Turkish superstrate from Ottoman administration (especially in Mosul and Baghdad). Words like "khōsh" (good, from Persian), "bābūj" (slipper, from Turkish), "qaṣṣāb" (butcher, broader Mashriqi but with Iraqi-specific frequency) appear at rates that distinguish Iraqi from neighbouring Levantine.

Modern Iraqi Arabic also code-switches with English at high rates among educated and diaspora speakers — particularly Iraqis who left for the UK, US, or Gulf states after 2003 and now form the operational diaspora cohort GCC contact centres encounter. The English code-switching pattern differs from Lebanese-French — it is technical and professional vocabulary rather than social-register markers — and CallScribe's Iraqi model is bilingual-ready at the segment level with Arabic in Arabic script and English in Latin script.
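To make "segment level" concrete, here is a minimal sketch that splits an already-transcribed line into runs of Arabic-script and Latin-script words. It uses only Python's standard `unicodedata` module. The function names `script_of` and `segment_by_script` are hypothetical, and the sketch illustrates the output shape rather than CallScribe's actual implementation.

```python
import unicodedata


def script_of(word):
    """Classify a word as 'arabic', 'latin', or 'other' by its first letter."""
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name.startswith("ARABIC"):
                return "arabic"
            if name.startswith("LATIN"):
                return "latin"
            return "other"
    # No letters at all (digits, punctuation).
    return "other"


def segment_by_script(text):
    """Group consecutive words of the same script into (script, text) pairs."""
    segments = []
    for word in text.split():
        script = script_of(word)
        # Digits and punctuation ('other') attach to the current segment.
        if segments and (script == "other" or segments[-1][0] == script):
            prev_script, prev_text = segments[-1]
            segments[-1] = (prev_script, prev_text + " " + word)
        else:
            segments.append((script, word))
    return segments
```

For example, `segment_by_script("أرسلت the invoice اليوم")` yields three segments: Arabic, Latin, Arabic.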

Morphology that breaks Levantine-tuned models

Iraqi marks the future with رح (raḥ), as Levantine does, but more often with the prefix حـ (ḥa-). The present continuous takes the prefix داـ (da-), as in "da-aktib" (I am writing), a marker that Levantine and Khaleeji do not use. Iraqi negation uses ما (mā) for verbs and مو (mū) for nominal predicates, similar to Khaleeji rather than Levantine "miš". Demonstratives include "hāða", "haay", "ðōl", and "ðiči"; the set as a whole differs from Levantine "hayda/hayde", although "hāða" itself is shared with Khaleeji.

Object-pronoun cliticisation in Iraqi is denser than in any other Mashriqi variety: "shifit-ak-yāha" (I saw it for you) compresses three morphemes; "ʕaṭē-ni-yāh" (he gave it to me) is routine. An acoustic model that has not seen these compressions in training tends to mis-segment them or drop the clitics altogether. CallScribe's Iraqi adapter is trained on Iraqi telephony specifically, including diaspora speech where compression is preserved even after years abroad.

Why Iraqi matters for GCC compliance and legal buyers

The UAE hosts a substantial Iraqi expatriate community — government estimates and embassy figures place the Iraqi resident population in the UAE in the range of 500,000 across Dubai, Abu Dhabi, and Sharjah, with concentrations in oil-services, construction, and trading sectors. Saudi Arabia, Jordan, and Qatar host smaller but significant Iraqi communities. For compliance buyers — the audience for [/industries/legal](/industries/legal) and our compliance use-cases — Iraqi audio is non-trivial: AML-related call reviews, immigration/visa interviews, dispute-resolution proceedings under DIFC and ADGM frameworks, and humanitarian-organisation hotlines all surface Iraqi-dialect content that mainstream Arabic ASR mishandles.

Legal-services firms in the UAE and Bahrain handling Iraqi-corporate matters (oil-services contracts, sanctions-compliance reviews, cross-border arbitration involving Iraqi parties) generate Arabic-language audio that needs accurate transcription for evidentiary purposes. Generic Arabic ASR producing 25-35% WER on Iraqi audio is not fit for legal use. CallScribe's Iraqi adapter brings WER into the 12-18% range on clear telephony, which we treat as the point at which reviewing a transcript becomes faster than re-listening to the audio.

CallScribe accuracy and limits on Iraqi

Word-error rate on our internal Iraqi telephony benchmark is 12-18% for clear Baghdadi or Basran (gilit) audio with single-speaker channels, 14-20% for Mosuli (qeltu) audio where training data is sparser, and 18-26% for noisy multi-party calls with English code-switching. Diaspora speakers (UAE/UK/US-based Iraqis) are well-represented in our training data because they are operationally relevant; in-Iraq speakers from less-recorded regions (Anbar, southern marshlands rural varieties) sit at the higher end of the WER range. The Iraqi country tag (IQ) biases the lexicon toward Iraqi-specific vocabulary; sub-regional tags are not currently exposed but may be added based on customer demand.
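For readers who want to reproduce this kind of figure on their own audio, WER is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below implements that definition with a word-level edit distance. It is a minimal version that assumes whitespace tokenisation and skips the text normalisation (diacritics, numerals, punctuation) that a real benchmark would apply first. The function name `word_error_rate` is our own.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100% on very poor output, so it is best read as an error ratio rather than a percentage of words wrong.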

At a glance

  • 12-18% WER on clear Iraqi telephony
  • Both gilit (Baghdad/Basra) and qeltu (Mosul) groups handled
  • Diaspora speech well-represented in training
  • Aramaic substrate and Persian/Turkish loan vocabulary supported
  • Suitable for legal/compliance evidentiary review, with human review recommended for primary evidence

FAQs

Is Iraqi Arabic actually distinct from Levantine?

Yes. It is a separate Arabic dialect group (Mesopotamian) with its own phonology, lexicon, and morphology. Routing Iraqi audio to a Levantine model produces 25-35% WER versus 12-18% on a properly tuned Iraqi adapter. The frequent assumption that "all Mashriqi Arabic is similar" breaks down operationally on Iraqi content. See [/dialects/levantine](/dialects/levantine) for the varieties our Levantine model does cover.

Does CallScribe distinguish Baghdadi from Mosuli audio?

The acoustic model handles both gilit (central and southern, Baghdad/Basra) and qeltu (northern, Mosul) varieties without requiring an upfront sub-regional tag. The Iraqi country tag (IQ) biases the language model lexicon; if you have predominantly Mosuli traffic, contact us for sub-regional tuning.

How does CallScribe handle Iraqi-English code-switching for diaspora speakers?

Diaspora Iraqis frequently code-switch with English, particularly on professional and technical topics. CallScribe handles the switch at the segment level — Arabic in Arabic script, English in Latin script — with no transliteration loss. Numbers are transcribed in the language spoken with optional downstream normalisation.

Is Iraqi suitable for legal evidentiary transcripts?

For UAE, DIFC, ADGM, and similar legal contexts, our Iraqi adapter produces transcripts of sufficient fidelity for evidentiary review when audio quality is reasonable. We recommend human review for any transcript intended as primary evidence rather than as a search aid; our output includes per-segment confidence scores so reviewers can prioritise low-confidence passages. See [/industries/legal](/industries/legal) for the broader workflow.
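One way a reviewer might use per-segment confidence is to build a review queue of the least certain passages. The sketch below assumes each segment carries a start time, text, and a confidence score between 0 and 1. The `Segment` fields, the `review_queue` name, and the 0.80 cut-off are all illustrative assumptions, not CallScribe's documented output schema.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start_s: float     # segment start time in seconds (assumed field)
    text: str          # transcribed text (assumed field)
    confidence: float  # assumed to lie in [0, 1]


def review_queue(segments, threshold=0.80):
    """Return segments below the threshold, least confident first."""
    flagged = [s for s in segments if s.confidence < threshold]
    return sorted(flagged, key=lambda s: s.confidence)
```

The threshold is a judgment call: a lower value shortens the queue but risks leaving errors unreviewed in a transcript intended as evidence.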

Does CallScribe handle Iraqi sentiment differently from Khaleeji?

Yes — the Iraqi sentiment lexicon recognises Iraqi-specific intensifiers, religious-register frustration markers, and prosodic patterns that differ from Gulf norms. Generic English-trained sentiment scoring misclassifies frustrated Iraqi speakers as neutral at high rates; our dialect-specific scoring corrects for this.

Can I lock a project to Iraqi to skip dialect detection?

Yes — set the project dialect to "iraqi" with country tag IQ, and the dialect-ID probe is bypassed. This is recommended for projects with predominantly Iraqi traffic (Iraqi-corporate legal review, Iraqi-diaspora outreach campaigns, humanitarian-org hotlines). Mixed-traffic projects should leave auto-detection on.
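The routing logic described here can be sketched as follows. Only the values "iraqi" and "IQ" come from the text above. The dictionary keys, the `resolve_dialect` function, and the `run_dialect_probe` callable are hypothetical stand-ins, so check the product documentation for the real configuration names.

```python
def resolve_dialect(project, run_dialect_probe):
    """Return (dialect, country, probed) for a call in this project.

    `project` is a plain dict with hypothetical keys "dialect" and "country".
    `run_dialect_probe` is a hypothetical callable standing in for
    auto-detection; it returns a (dialect, country) pair.
    """
    locked = project.get("dialect")
    if locked is not None:
        # Locked project: the probe is never invoked.
        return locked, project.get("country"), False
    detected_dialect, detected_country = run_dialect_probe()
    return detected_dialect, detected_country, True


locked_project = {"dialect": "iraqi", "country": "IQ"}
mixed_project = {}  # no lock, so auto-detection stays on
```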

Try CallScribe free →

5 min/mo free · No credit card · 8-12% WER on Khaleeji
