Published 10/25/2023
Keywords
- language learning research
- spoken learner corpus
- longitudinal corpus
- L2 Russian
Copyright (c) 2023 L'Analisi Linguistica e Letteraria
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Abstract
This article presents a new resource for language learning research, the Russian Spoken Learner Corpus, created by a research team at the Department of Languages, Literatures, Cultures and Mediations of the University of Milan. The corpus contains longitudinal and quasi-longitudinal oral data produced by Italian learners of Russian from the A0→1 to the C1 level. In the longitudinal part of the project, data are collected twice a year from the same group of students throughout their three- or five-year program of study. The quasi-longitudinal subcorpus includes data produced by students from the first to the fifth year of study. Alongside the learner data, two comparable reference subcorpora are being compiled: one containing interviews with native speakers of Russian and one consisting of interviews with bilingual (Italian-Russian) speakers. The interviews are transcribed according to a set of explicit conventions. Each sound file and its transcript are linked to a questionnaire that contains metadata about the interviewee.