Psycholinguists build eye-tracking database on reading in Russian
Researchers from the Higher School of Economics, St. Petersburg State University, and the University of Potsdam have created the first-ever database of eye-tracking data collected during reading in Russian. The results are openly available and can be used not only in linguistics but also, for example, in the diagnosis and correction of speech disorders. The research was published in the journal Behavior Research Methods.
It is well known that while reading, the eyes jump from word to word, pausing for short fixations of roughly 220 ms. Visual information is processed only during fixations. If a word is short or predictable, the eye recognizes it using peripheral vision and skips it; when reading in alphabetic languages, the eyes skip nearly 30% of words. Interestingly, reading sentences with the exact same meaning takes a similar amount of time in different languages. The pattern of fixations differs, however: texts in a language like Finnish require many short fixations, while texts in Chinese require fewer fixations that last longer.
The latest psycholinguistic research has shown that the process of reading does not differ fundamentally across languages. Researchers are nonetheless interested in the characteristics of individual languages. The special properties of eye movements while reading texts in Russian have not been studied before.
'Until now we did not know anything about reading in Russian, even though it is the sixth most widely spoken language in the world. This is why we carried out this basic, yet necessary, work. We took a set of different sentences from existing texts and recorded how native Russian speakers read them,' explains Anna Laurinavichyute, a research fellow in the HSE Centre for Language and the Brain. 'This is a benchmark for comparison with other languages, on the one hand, and on the other for comparison with children who are just learning to read, Russian Sign Language speakers, bilinguals, the elderly, and patients with aphasia (a speech disorder caused by stroke or head trauma)'.
To carry out the experiment, researchers used an eye tracker that can record 1,000 frames per second. The 96 subjects read the same set of sentences randomly chosen from the Russian National Corpus, which is the most representative online database of Russian texts used by linguists. It was important for the researchers to observe how reading speed changes depending on a word's characteristics. This is why each word in the sentences was annotated for stress, part of speech, polysemy, usage frequency, length, and predictability.
The database can also be used to study how information is perceived in different languages. For example, the results of the study confirm the hypothesis that verbs are read more slowly than nouns. In this case, the difference is not explained by the length of the words, their predictability, or other parameters. When encountering a noun, the reader has to match a word to an object, but things are more complicated with verbs. First, one has to understand what action the word signifies before recalling who or what is performing the action. Only then does the reader determine whether the action has an object and, if it does, exactly which grammatical case the object takes.
Eye-tracking data has great potential. Knowing how a healthy individual reads makes it possible to create a system that diagnoses and corrects dyslexia or helps patients regain speech after a head injury. Another area where this could be useful is in determining a person's fluency in Russian by observing eye movements while reading. Linguists abroad are carrying out projects like this.