Authors
- Svetlana Yu. Toldova
(National Research University Higher School of Economics)
- Elizaveta I. Ivtushok
(National Research University Higher School of Economics)
- Kira M. Shulgina
(National Research University Higher School of Economics)
- Mira B. Bergelson
(National Research University Higher School of Economics)
- Mariya V. Khudyakova
(National Research University Higher School of Economics)
Abstract
This work examines the distribution of referential devices in spoken discourse produced by healthy speakers and people with aphasia, and compares it with written discourse. We discuss annotation issues specific to the corpus of Pear film retellings (Russian CliPS) by Russian speakers with aphasia (PWA), with right hemisphere damage (RHD), and healthy speakers (HP, for healthy people). The study summarizes the comprehensive annotation schema developed for this task and preliminary corpus-based research on the features of referential choice. Comparing retellings with written texts, we found a significant difference in the use of basic coreferential expressions. First, the distribution of basic NP types differs significantly: speakers use reduced devices such as zero anaphora or bare nouns more frequently in retellings than in written texts. There are also differences in the distribution of more fine-grained features such as the word order within an NP, the use of anaphoric and reduced expressions (demonstratives or zero NPs) for the first mention of an entity, and the inclusion of epistemic markers in NPs. We also found that the retellings produced by PWA and HP do not differ much in the distribution of basic NP types. However, a detailed analysis within individual NP types that takes various disfluencies into account reveals some prominent differences between the two populations. These include differences in zero subject distribution, the frequency of non-referential NP links, and the frequency of coreference errors. While adapting the initial coreference annotation scheme, we concluded that besides referential ambiguity, which is normally taken into account in spoken discourse analysis, and the basic taxonomy of referential devices (full NP vs. anaphoric pronoun vs. anaphoric zero), other features need to be considered.
Suggested Citation
Svetlana Yu. Toldova & Elizaveta I. Ivtushok & Kira M. Shulgina & Mira B. Bergelson & Mariya V. Khudyakova, 2016.
"Coreference Annotation in the Russian Clinical Pear Stories Corpus: Annotation Features and Preliminary Results,"
HSE Working papers
WP BRP 50/LNG/2016, National Research University Higher School of Economics.
Handle:
RePEc:hig:wpaper:50/lng/2016