The creation of the National Corpus of the Crimean Tatar Language continues

Published on 5 March 2023 at 12:30

The Ministry of Reintegration has initiated the creation of the National Corpus of the Crimean Tatar Language as part of the implementation of the Crimean Tatar Language Development Strategy for 2022-2032. The corpus is an online platform for language research that will draw on data from texts written in Crimean Tatar.

The collection of printed and online sources in the Crimean Tatar language for the National Corpus has been under way for four months.

During this time, 675 materials by more than 180 authors have been processed and included in the catalogue. This amounts to more than 50,000 printed pages (40 million characters). The materials include works by famous authors, newspapers, magazines, textbooks, scientific articles, international legal documents, and more.

Currently, the oldest work in the catalogue dates back to the 13th century, and the most recent is from 2023.

The catalogue already contains materials in all four writing systems that have been used for the Crimean Tatar language: Arabic script, pre-war Latin, Cyrillic, and modern Latin.

Anyone can join the creation of the online database of texts by following the link.

The project is being implemented with the support of the Ministry of Reintegration, the Swiss-Ukrainian EGAP Program implemented by the East Europe Foundation, and Taras Shevchenko National University of Kyiv.