Year 31 No. 2 (2023): Issue 2/2023

A Project for a Spoken Learner Corpus of Russian (Проект устного учебного корпуса русского языка)

Tatsiana Maiko
Università degli Studi di Milano

Published 10/25/2023


Keywords

  • language learning research
  • spoken learner corpus
  • longitudinal corpus
  • L2 Russian

How to Cite

Maiko, T. (2023). Проект устного учебного корпуса русского языка. L’Analisi Linguistica E Letteraria, 31(2), 25–38. Retrieved from


Abstract

This article presents a new resource for language learning research, the Russian Spoken Learner Corpus, created by a research team at the Department of Languages, Literatures, Cultures and Mediations of the University of Milan. The corpus contains longitudinal and quasi-longitudinal oral data produced by Italian learners of Russian at levels ranging from A0→1 to C1. In the longitudinal part of the project, data are collected twice a year from the same group of students throughout their three- or five-year program of study. The quasi-longitudinal subcorpus includes data produced by students from the first to the fifth year of study. Alongside the learner data, two comparable reference subcorpora are being compiled: one containing interviews with native speakers of Russian and one containing interviews with bilingual (Italian-Russian) speakers. The interviews are transcribed according to a set of explicit conventions. Each sound file and its transcript are linked to a questionnaire that contains metadata about the interviewee.