Tatar language

татар теле
Native to Russia, other post-Soviet states
Ethnicity Tatars
Native speakers
5.2 million (2015)[1]
(may include some L2 speakers)
Official status
Official language in Tatarstan (Russia)
Regulated by Institute of Language, Literature and Arts of the Academy of Sciences of the Republic of Tatarstan
Language codes
ISO 639-1 tt
ISO 639-2 tat
ISO 639-3 tat (inclusive code)
Individual code:
sty (Siberian Tatar)
Glottolog tata1255[2]
Linguasphere 44-AAB-be

The Tatar language (Tatar: татар теле, татарча) is a Turkic language spoken by Tatars, who live mainly in modern Tatarstan and Bashkortostan (European Russia) as well as in Siberia. It should not be confused with the Crimean Tatar language, which is closely related but belongs to a different subgroup of the Kipchak languages, the Cuman subgroup.

Geographic distribution

The Tatar language is spoken in Russia (about 5.3 million people), Ukraine, China, Finland, Turkey, Uzbekistan, the United States of America, Romania, Azerbaijan, Israel, Kazakhstan, Georgia, Lithuania, Latvia, and other countries. There are more than 7 million speakers of Tatar in the world.

Tatar is also the native language of several thousand Maris. The Qaratay group of Mordvins also speaks a variant of Kazan Tatar.

In the 2010 census, 69% of Russian Tatars who responded to the question about language ability claimed a knowledge of the Tatar language.[3] In Tatarstan, 93% of Tatars and 3.6% of Russians did so. In neighbouring Bashkortostan, 67% of Tatars, 27% of Bashkirs, and 1.3% of Russians did.[4]

Official status

Tatar, along with Russian, is an official language of the Republic of Tatarstan. The official script of the Tatar language is based on the Cyrillic script, with some additional letters. The Republic of Tatarstan passed a law in 1999, which came into force in 2001, establishing an official Tatar Latin alphabet. A Russian federal law overrode it in 2002, and Cyrillic has been the sole official script in Tatarstan since then. Unofficially, other scripts are used as well, mostly Latin and Arabic. All official sources in Tatarstan must use Cyrillic on their websites and in publishing. Elsewhere, where Tatar has no official status, the choice of alphabet depends on the preference of the author.

The Tatar language was made a de facto official language in Russia in 1917, but only within the Tatar Autonomous Soviet Socialist Republic. Tatar is also considered to have been the official language in the short-lived Idel-Ural State, briefly formed during the Russian Civil War.

The usage of Tatar declined during the 20th century. By the 1980s, the study and teaching of Tatar in the public education system was limited to rural schools, and Tatar-speaking pupils had little chance of entering university because higher education was available almost exclusively in Russian.

As of 2001, Tatar was considered a potentially endangered language, while Crimean Tatar and Siberian Tatar received "endangered" and "seriously endangered" statuses, respectively.[5] Higher education in Tatar is available only in Tatarstan and is restricted to the humanities. In other regions Tatar is primarily a spoken language, and the number of speakers as well as their proficiency tend to decrease. Tatar is popular as a written language only in Tatar-speaking areas where schools with Tatar language lessons are situated. On the other hand, Tatar is the only language in use in rural districts of Tatarstan.

Since 2017, Tatar language classes have no longer been mandatory in the schools of Tatarstan.[6] According to opponents of this change, it will further endanger the Tatar language and violates the Tatarstan Constitution, which stipulates the equality of the Russian and Tatar languages in the republic.[7][8]

Dialects of Tatar

There are two main dialects of Tatar:

  • Central or Middle (Kazan)
  • Western (Mişär or Mishar)

Both dialects also have subdivisions. Significant contributions to the study of the Tatar language and its dialects were made by the scholar Gabdulkhay Akhatov, who is considered the founder of the modern Tatar dialectological school.

The spoken idioms of the Siberian Tatars, which differ significantly from the two dialects above, are considered a third dialect group of Tatar by some and an independent language by others.

Central or Middle

The Central or Middle dialectal group is spoken in Kazan and most of Tatarstan and is the basis of the standard literary Tatar language.


Western (Mişär)

In the Western (Mişär) dialect, ç is pronounced [tʃ] (southern or Lambir Mişärs) or [ts] (northern Mişärs or Nizhgars), and c is pronounced [dʒ]. The Mişär dialect does not distinguish v from w, q from k, or g from ğ. (The Cyrillic alphabet has no special letters for q, ğ and w, so Mişär speakers have no difficulty reading Tatar written in Cyrillic.)

This is the dialect spoken by the Tatar minority of Finland.

Siberian Tatar

Two main isoglosses that characterize Siberian Tatar are ç as [ts] and c as [j], corresponding to standard [ɕ] and [ʑ]. There are also grammatical differences within the dialect, scattered across Siberia.[9]

Many linguists claim the origins of Siberian Tatar dialects are actually independent of Volga–Ural Tatar; these dialects are quite remote both from Standard Tatar and from each other, often preventing mutual comprehension. The claim that this language is part of the modern Tatar language is typically supported by linguists in Kazan and denounced by Siberian Tatars.

Over time, some of these dialects were given distinct names and recognized as separate languages (e.g. the Chulym language) after detailed linguistic study. A brief linguistic analysis shows that many of these dialects exhibit features which are quite different from the Volga–Ural Tatar varieties, and should be classified as Turkic varieties belonging to several sub-groups of the Turkic languages, distinct from Kipchak languages to which Volga–Ural Tatar belongs.



Phonology

Vowels

There exist several interpretations of the Tatar vowel phoneme inventory. In total, Tatar has nine or ten native vowels and three or four borrowed vowels (mainly in Russian loanwords).[10][11]

According to Baskakov (1988) Tatar has only two vowel heights, high and low. There are two low vowels, front and back, while there are eight high vowels: front and back, round (R+) and unround (R-), normal and short (or reduced).[10]

                Front       Back
                R-    R+    R-    R+
High   Normal   i     ü     ï     u
       Short    e     ö     ë     o
Low             ä           a

Poppe (1963) proposed a similar yet slightly different scheme with a third, higher mid, height, and with nine vowels.[10]

              Front       Back
              R-    R+    R-    R+
High          i     ü           u
Higher Mid    e     ö     ï     o
Low           ä           a

According to Makhmutova (1969) Tatar has three vowel heights: high, mid and low, and four tongue positions: front, front-central, front-back and back.[10]

        Front       Central                 Back
                    Front       Back
        R-    R+    R-    R+    R-    R+    R-    R+
High    i     ü                             ï     u
Mid                 e     ö     ë     o
Low     ä                                   a

The mid back unrounded vowel ë is usually transcribed as ı, though it differs from the corresponding Turkish vowel.

The tenth vowel ï is realized as the diphthong ëy (IPA: [ɯɪ]), which only occurs word-finally, but it has been argued to be an independent phoneme.[10][11]

The native vowels are approximately as follows (each given as its Cyrillic letter followed by the usual Latin romanization):

        Front               Back
        R-        R+        R-        R+
High    и i       ү ü       ый ıy     у u
Mid     э, е e    ө ö       ы ı       о o
Low     ә ä                 а a

In polysyllabic words, the front-back distinction is lost in reduced vowels: all become mid-central.[10] The mid reduced vowels in an unstressed position are frequently elided, as in кеше keşe [kĕˈʃĕ] > [kʃĕ] 'person', or кышы qışı [qɤ̆ˈʃɤ̆] > [qʃɤ̆] '(his) winter'.[11] Low back /ɑ/ is rounded to [ɒ] in the first syllable and after [ɒ], but not in the last syllable, as in бала bala [bɒˈlɑ] 'child', балаларга balalarğa [bɒlɒlɒrˈʁɑ] 'to children'.[11] In Russian loans there are also [ɨ], [ɛ], [ɔ], and [ä], written the same as the native vowels: ы, е/э, о, а respectively.[11]

Historical shifts

Historically, the Old Turkic mid vowels have raised from mid to high, whereas the Old Turkic high vowels have become the Tatar reduced mid series. (The same shifts have also happened in Bashkir.)[12]

Vowel   Old Turkic   Turkish   Kazakh   Tatar         Bashkir       Gloss
*e      *et          et        et       it            it            'meat'
*ö      *söz         söz       söz      süz           hüź [hyθ]     'word'
*o      *sol         sol       sol      sul           hul           'left'
*i      *it          it        it       et            et            'dog'
*ï      *qïz         kız       qız      qëz [qɤ̆z]     qëź [qɤ̆θ]     'girl'
*u      *qum         kum       qum      qom           qom           'sand'
*ü      *kül         kül       kül      köl           köl           'ash'
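The correspondences in the table above form a chain shift, so they have to be applied simultaneously rather than one after another (otherwise *e > i would be turned back into e by the *i > e rule). The following is a minimal sketch of that idea, not a model of actual sound change: the function name `shift_vowels` and the mapping are illustrative assumptions, and the sketch only covers the seven vowels and the monosyllabic examples shown in the table.

```python
import unicodedata

# Old Turkic vowel -> Tatar reflex, taken from the table above.
# Hypothetical helper data; real sound change has conditions and exceptions.
OLD_TURKIC_TO_TATAR = {
    "e": "i", "ö": "ü", "o": "u",             # mid vowels raise to high
    "i": "e", "ï": "ë", "u": "o", "ü": "ö",   # high vowels become reduced mid
}


def shift_vowels(old_turkic_word: str) -> str:
    """Map every vowel in a single pass, so no output vowel is shifted twice."""
    word = unicodedata.normalize("NFC", old_turkic_word)
    return "".join(OLD_TURKIC_TO_TATAR.get(ch, ch) for ch in word)


if __name__ == "__main__":
    for word in ["et", "söz", "sol", "it", "qïz", "qum", "kül"]:
        print(f"*{word} > {shift_vowels(word)}")
```

Looking up each character once in a dictionary is what keeps the shift simultaneous; chained string replacements would undo the first rule with the fourth.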


Consonants

The consonants of Tatar[11]
                       Labial   Labio-dental  Dental  Post-alveolar  Palatal  Velar  Uvular  Glottal
Nasals                 м m                    н n                             ң ñ
Plosives    Voiceless  п p                    т t                             к k    къ q
            Voiced     б b                    д d                             г g
Affricates  Voiceless                         ц ts    ч ç
            Voiced                                    җ c
Fricatives  Voiceless           ф f           с s     ш ş            ч ç             х x     һ h
            Voiced              в v           з z     ж j            җ c             гъ ğ
Trill                                         р r
Approximants           у/ү/в w                л l                    й y
* The phonemes /v/, /ts/, /tʃ/, /ʒ/, /h/, /ʔ/ are found only in loanwords. /f/ occurs more commonly in loanwords but is also found in native words, e.g. yafraq 'leaf'.[11] Some Tatars substitute the corresponding native consonants /w/, /s/, /ɕ/, /ʑ/ for /v/, /ts/, /tʃ/, /ʒ/.
† /dʒ/ and /tʃ/ are the dialectal Western (Mişär) pronunciations of җ c /ʑ/ and ч ç /ɕ/; the latter are used in the literary standard and in the Central (Kazan) dialect. /ts/ is the variant of ч ç /ɕ/ pronounced in the Eastern (Siberian) dialects and some Western (Mişär) dialects. Both /tʃ/ and /ts/ are also used in Russian loanwords (the latter written ц).
‡ /q/ and /ʁ/ are usually considered allophones of /k/ and /ɡ/ in the environment of back vowels, so they are never written in the Tatar Cyrillic orthography in native words, and only rarely in loanwords, with къ and гъ. However, /q/ and /ʁ/ also appear before front /æ/ in Perso-Arabic loanwords, which may indicate that these uvular consonants are phonemic.


Tatar consonants usually undergo slight palatalization before front vowels. However, this allophony is not significant and is not phonemic. This differs from Russian, where palatalized consonants are not allophones but phonemes in their own right. A number of Russian loanwords have palatalized consonants in Russian and are therefore written the same way in Tatar (often with the "soft sign" ь). Standard Tatar pronunciation also requires palatalization in such loanwords; however, some Tatars may pronounce them without palatalization.


In native words there are six types of syllables (Consonant, Vowel, Sonorant):

  • V (ı-lıs, u-ra, ö-rä)
  • VC (at-law, el-geç, ir-kä)
  • CV (qa-la, ki-ä, su-la)
  • CVC (bar-sa, sız-law, köç-le, qoş-çıq)
  • VSC (ant-lar, äyt-te, ilt-kän)
  • CVSC (tört-te, qart-lar, qayt-qan)

Loanwords allow other types: CSV (gra-mota), CSVC (käs-trül), etc.
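The six native syllable shapes listed above can be checked mechanically. The sketch below is only an illustration under stated assumptions: it works on the Latin romanization used in this article, the vowel and sonorant sets are assumed for the example, and the names `syllable_shape` and `is_native_shape` are hypothetical. A sonorant is marked as S only when it is the first member of a two-consonant coda, which is the only position where the list above distinguishes it.

```python
import unicodedata

VOWELS = set("aäeiıoöuü")
SONORANTS = set("mnñlryw")  # assumed sonorant set for this sketch
NATIVE_SHAPES = {"V", "VC", "CV", "CVC", "VSC", "CVSC"}


def syllable_shape(syllable: str) -> str:
    """Return the C/V/S pattern of one romanized syllable."""
    s = unicodedata.normalize("NFC", syllable.lower())
    shape = "".join("V" if ch in VOWELS else "C" for ch in s)
    # Relabel the first consonant of a two-consonant coda when it is a sonorant.
    if shape.endswith("VCC") and s[-2] in SONORANTS:
        shape = shape[:-2] + "SC"
    return shape


def is_native_shape(syllable: str) -> bool:
    """True if the syllable has one of the six shapes listed for native words."""
    return syllable_shape(syllable) in NATIVE_SHAPES


if __name__ == "__main__":
    for syl in ["ı", "at", "qa", "bar", "ant", "tört", "gra", "trül"]:
        print(syl, syllable_shape(syl), is_native_shape(syl))
```

Syllables with an onset cluster, such as gra and trül from the loanword examples, come out as CCV and CCVC in this sketch and are therefore reported as non-native.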


Stress is usually on the final syllable. However, some suffixes cannot be stressed, so the stress shifts to the syllable before that suffix, even if the stressed syllable is then the third or fourth from the end. A number of Tatar words and grammatical forms have their natural stress on the first syllable. Loanwords, mainly from Russian, usually preserve their original stress (unless the original stress is on the last syllable, in which case the stress in Tatar shifts to suffixes as usual, e.g. sovét > sovetlár > sovetlarğá).

Phonetic alterations

Tatar phonotactics dictate many pronunciation changes which are not reflected in the orthography.

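  • Unrounded vowels ı and e become rounded after o or ö:
коры/qorı > [qoro]
борын/borın > [boron]
көзге/közge > [közgö]
соры/sorı > [soro]
  • Nasals are assimilated to the following stops (this is not reflected in writing):
унбер/unber > [umber]
менгеч/mengeç > [meñgeç]
  • Stops are assimilated to the preceding nasals (this is reflected in writing):
урманнар/urmannar ( < urman + lar)
комнар/komnar ( < kom + lar)
  • Voicing may also be assimilated:
күзсез/küzsez > [küssez]
  • Unstressed vowels may be dropped:
урыны/urını > [urnı]
килене/kilene > [kilne]
  • Vowels may be elided at word boundaries:
кара урман/qara urman > [qarurman]
килә иде/kilä ide > [kiläyde]
туры урам/turı uram > [tururam]
була алмын/bula almım > [bulalmım]
  • An epenthetic vowel may be added to a consonant cluster:
банк/bank > [bañqı]
  • Final consonant clusters are simplified:
артист/artist > [artis]
  • Final consonants may be devoiced:
табиб/tabib > [tabip]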
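The first rule in the list above, rounding of ı and e after o or ö, is regular enough to sketch in code. This is a hedged illustration rather than a full phonological model: it operates on the Latin romanization, the function name `apply_rounding` is hypothetical, and it assumes that rounding is triggered by the immediately preceding vowel after any rewriting, so it would carry on through a run of reduced vowels. None of the other alternations in the list are handled.

```python
import unicodedata

VOWELS = set("aäeiıoöuü")
ROUNDED_MID = {"o", "ö"}          # vowels that trigger rounding
ROUNDING = {"ı": "o", "e": "ö"}   # unrounded vowel -> rounded counterpart


def apply_rounding(word: str) -> str:
    """Round ı and e when the preceding vowel is o or ö (romanized input)."""
    result = []
    previous_vowel = None
    for ch in unicodedata.normalize("NFC", word):
        if ch in VOWELS:
            if previous_vowel in ROUNDED_MID and ch in ROUNDING:
                ch = ROUNDING[ch]
            previous_vowel = ch  # track the surface vowel (assumption)
        result.append(ch)
    return "".join(result)


if __name__ == "__main__":
    for word in ["qorı", "borın", "közge", "sorı", "bala"]:
        print(word, ">", apply_rounding(word))
```

A word with no o or ö, such as bala, passes through unchanged.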


Grammar

Like other Turkic languages, Tatar is an agglutinative language.

Grammatical case:



The plural suffix has four forms, depending on the final sound of the stem and on whether the stem has back ("hard") or front ("soft") vowels:

  • After vowels and non-nasal consonants, hard stems: -lar (bala-lar, abí-lar, kitap-lar, qaz-lar, malay-lar, qar-lar, ağaç-lar)
  • After vowels and non-nasal consonants, soft stems: -lär (äni-lär, sölge-lär, däftär-lär, kibet-lär, süz-lär, bäbkä-lär, mäktäp-lär, xäref-lär)
  • After nasals, hard stems: -nar (uram-nar, urman-nar, tolım-nar, moñ-nar, tañ-nar, şalqan-nar)
  • After nasals, soft stems: -när (ülän-när, keläm-när, çräm-när, iñ-när, ciñ-när, isem-när)
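The choice among the four plural variants listed above can be sketched as two independent decisions: n after a nasal and l otherwise, then a for hard (back-vowel) stems and ä for soft (front-vowel) stems. The following is a minimal sketch under stated assumptions, not a complete morphological rule: it works on the Latin romanization, the function name `pluralize` is hypothetical, and the hard or soft class is guessed from the last vowel of the stem, which will not be right for every word.

```python
import unicodedata

BACK_VOWELS = set("aıou")    # "hard" stems
FRONT_VOWELS = set("äeiöü")  # "soft" stems
NASALS = set("mnñ")


def pluralize(stem: str) -> str:
    """Attach -lar/-lär/-nar/-när to a romanized stem, hyphenated as in the list."""
    s = unicodedata.normalize("NFC", stem)
    lowered = s.lower()
    # Assumption: the last vowel of the stem decides the harmony class.
    last_vowel = next(
        (ch for ch in reversed(lowered) if ch in BACK_VOWELS or ch in FRONT_VOWELS),
        None,
    )
    if last_vowel is None:
        raise ValueError(f"no vowel found in stem: {stem!r}")
    suffix_vowel = "a" if last_vowel in BACK_VOWELS else "ä"
    suffix_onset = "n" if lowered[-1] in NASALS else "l"
    return f"{s}-{suffix_onset}{suffix_vowel}r"


if __name__ == "__main__":
    for stem in ["bala", "kitap", "äni", "mäktäp", "uram", "moñ", "ülän", "isem"]:
        print(pluralize(stem))
```

Keeping the onset and the vowel as separate choices mirrors the way the list is organized and avoids a four-way lookup table.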

Declension of pronouns

Personal pronouns
Case         I               you (sg.)       he/she/it       we                you (pl.)         they
Nominative   мин min         син sin         ул ul           без bez           сез sez           алар alar
Genitive     минем minem     синең sineñ     аның anıñ       безнең bezneñ     сезнең sezneñ     аларның alarnıñ
Dative       миңа miña       сиңа siña       аңа aña         безгә bezgä       сезгә sezgä       аларга alarğa
Accusative   мине mine       сине sine       аны anı         безне bezne       сезне sezne       аларны alarnı
Locative     миндә mindä     синдә sindä     анда anda       бездә bezdä       сездә sezdä       аларда alarda
Ablative     миннән minnän   синнән sinnän   аннан annan     бездән bezdän     сездән sezdän     алардан alardan

Demonstrative pronouns
Case         this              that              these                 those
Nominative   бу bu             шул şul           болар bolar           шулар şular
Genitive     моның monıñ       шуның şunıñ       боларның bolarnıñ     шуларның şularnıñ
Dative       моңа moña         шуңа şuña         боларга bolarğa       шуларга şularğa
Accusative   моны monı         шуны şunı         боларны bolarnı       шуларны şularnı
Locative     монда monda       шунда şunda       боларда bolarda       шуларда şularda
Ablative     моннан monnan     шуннан şunnan     болардан bolardan     шулардан şulardan

Interrogative pronouns
Case         who               what
Nominative   кем kem           нәрсә närsä
Genitive     кемнең kemneñ     нәрсәнең närsäneñ
Dative       кемгә kemgä       нәрсәгә närsägä
Accusative   кемне kemne       нәрсәне närsäne
Locative     кемдә kemdä       нәрсәдә närsädä
Ablative     кемнән kemnän     нәрсәдән närsädän


Writing system

During its history Tatar has been written in Arabic, Latin and Cyrillic scripts.

Before 1928, Tatar was mostly written in the Arabic script (Иске имля/İske imlâ, "Old orthography", to 1920; Яңа имла/Yaña imlâ, "New orthography", 1920–1928).

During the 19th century, the Russian Christian missionary Nikolay Ilminsky devised the first Cyrillic alphabet for Tatar. This alphabet is still used by Christian Tatars (Kryashens).

In the Soviet Union after 1928, Tatar was written with a Latin alphabet called Jaᶇalif.

In 1939, in Tatarstan and all other parts of the Soviet Union a Cyrillic script was adopted and is still used to write Tatar. It is also used in Kazakhstan.

The Republic of Tatarstan passed a law in 1999 that came into force in 2001 establishing an official Tatar Latin alphabet. A Russian federal law overrode it in 2002, making Cyrillic the sole official script in Tatarstan since. In 2004, an attempt to introduce a Latin-based alphabet for Tatar was further abandoned when the Constitutional Court ruled that the federal law of 15 November 2002 mandating the use of Cyrillic for the state languages of the republics of the Russian Federation[14] does not contradict the Russian constitution.[15] In accordance with this Constitutional Court ruling, on 28 December 2004, the Tatar Supreme Court overturned the Tatarstani law that made the Latin alphabet official.[16]

In 2012 the Tatarstan government adopted a new Latin alphabet, but with limited usage (mostly for romanization).

  • Tatar Cyrillic alphabet (1939; the letter order adopted in 1997):
А а Ә ә Б б В в Г г Д д Е е Ё ё
Ж ж Җ җ З з И и Й й К к Л л М м
Н н Ң ң О о Ө ө П п Р р С с Т т
У у Ү ү Ф ф Х х Һ һ Ц ц Ч ч Ш ш
Щ щ Ъ ъ Ы ы Ь ь Э э Ю ю Я я
  • Tatar Old Cyrillic alphabet (by Nikolay Ilminsky, 1861; the letters in parentheses are not used in modern publications):
А а Ӓ ӓ Б б В в Г г Д д Е е Ё ё Ж ж З з
И и (Іі) Й й К к Л л М м Н н Ҥ ҥ О о Ӧ ӧ
П п Р р С с Т т У у Ӱ ӱ Ф ф Х х Ц ц Ч ч
Ш ш Щ щ Ъ ъ Ы ы Ь ь (Ѣѣ) Э э Ю ю Я я (Ѳѳ)
  • 1999 Tatar Latin alphabet, made official by a law adopted by Tatarstani authorities but annulled by the Tatar Supreme Court in 2004:[16]
A a Ə ə B b C c Ç ç D d E e F f
G g Ğ ğ H h I ı İ i J j K k Q q
L l M m N n Ꞑ ꞑ O o Ɵ ɵ P p R r
S s Ş ş T t U u Ü ü V v W w X x
Y y Z z
  • 2012 Tatar Latin alphabet
A a Ä ä B b C c Ç ç D d E e F f
G g Ğ ğ H h I ı İ i J j K k Q q
L l M m N n Ñ ñ O o Ö ö P p R r
S s Ş ş T t U u Ü ü V v W w X x
Y y Z z
  • Tatar Arabic alphabet (before 1928):
آ ا ب پ ت ث ج چ
ح خ د ذ ر ز ژ س
ش ص ض ط ظ ع غ ف
ق ك گ نك ل م ن ه
و ۇ ڤ ی ئ


Tatar's ancestors are the extinct Bulgar and Kipchak languages.

The literary Tatar language is based on the Middle Tatar dialect and on the Old Tatar language (İske Tatar Tele). Both are members of the Volga-Ural subgroup of the Kipchak group of Turkic languages, although they also partly derive from the ancient Volga Bulgar language.

Most of the Uralic languages in the Volga River area have strongly influenced the Tatar language,[17] as have the Arabic, Persian and Russian languages.[18]

Crimean Tatar, although similar in name, belongs to another subgroup of the Kipchak languages, usually called Pontic, Cuman or Polovtsian. Unlike Kazan Tatar, Crimean Tatar is heavily influenced by Turkish.


Universal Declaration of Human Rights, Article 1:


Барлык кешеләр дә азат һәм үз абруйлары һәм хокуклары ягыннан тиң булып туалар. Аларга акыл һәм вөҗдан бирелгән һәм бер-берсенә карата туганнарча мөнасәбәттә булырга тиешләр.


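Barlıq keşelär dä azat häm üz abruyları häm xoquqları yağınnan tiñ bulıp tualar. Alarğa aqıl häm wöcdan birelgän häm ber-bersenä qarata tuğannarça mönasäbättä bulırğa tieşlär.

English translation: All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.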

