Lexical Models to Identify Unmarked Discourse Relations: Does WordNet help?
Abstract
In this paper, we address the task of automatically determining which discourse relation holds between two text spans. We focus on relations that are not explicitly signalled by a discourse marker such as "but". While lexical models have been found useful for the task, they are also prone to data sparseness problems, which is a serious drawback given the scarcity of discourse-annotated data. We therefore investigate whether lexical-semantic resources, such as WordNet, can be exploited to back off to a more general representation of lexical information in cases where data are sparse. We compare such a semantic back-off strategy to morphological generalisations over word forms, such as stemming and lemmatising.
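The contrast between the two kinds of generalisation can be illustrated with a minimal sketch. It is not the paper's implementation: the hypernym table `TOY_HYPERNYMS`, the suffix stripper `naive_stem`, the frequency threshold `min_count`, and the toy training tokens are all assumptions made for illustration, with the table standing in for a real WordNet lookup and the stripper for a real stemmer.

```python
from collections import Counter

# Hypothetical toy hypernym table; a stand-in for a real WordNet lookup.
TOY_HYPERNYMS = {
    "poodle": "dog",
    "terrier": "dog",
    "sedan": "car",
    "hatchback": "car",
}


def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def semantic(word):
    """Map a word to its (toy) hypernym, or leave it unchanged."""
    return TOY_HYPERNYMS.get(word, word)


def back_off(tokens, counts, min_count, generalise):
    """Keep tokens seen at least min_count times; generalise the rest."""
    return [t if counts[t] >= min_count else generalise(t) for t in tokens]


# Toy training data (assumed): "poodle" and "sedans" are rare.
training_tokens = (
    "dog dog dog car car car barked barked barked poodle sedans".split()
)
counts = Counter(training_tokens)

span = ["poodle", "barked", "sedans"]
morph = back_off(span, counts, min_count=2, generalise=naive_stem)
sem = back_off(span, counts, min_count=2, generalise=semantic)

print(morph)
print(sem)
```

In this toy setting the morphological back-off only normalises the inflected form "sedans", while the semantic back-off only generalises "poodle", which suggests why the two strategies are worth comparing rather than treating as interchangeable.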
Similar resources
Harald Lüngen, Alexander Mehler and Angelika Storrer: Lexical-Semantic Resources in Automated Discourse Analysis
In this paper, we address the task of automatically determining which discourse relation holds between two text spans. We focus on relations that are not explicitly signalled by a discourse marker like but. While lexical models have been found useful for the task, they are also prone to data sparseness problems, which is a big drawback given the scarcity of discourse annotated data. We therefor...
Automatic Identification of AltLexes using Monolingual Parallel Corpora
The automatic identification of discourse relations is still a challenging task in natural language processing. Discourse connectives, such as since or but, are the most informative cues to identify explicit relations; however discourse parsers typically use a closed inventory of such connectives. As a result, discourse relations signaled by markers outside these inventories (i.e. AltLexes) are...
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
Lexical Marking And The Recovery Of Discourse Structure
In the theory presented here, discourse relations are equated with coherence relations. The relata are taken to be sets of events or entities introduced into the discourse, as in SDRT (Asher, 1993). Our empirical studies of commentary, narrative and news texts have shown that coherence relations are frequently signaled syntactically or semantically rather than lexically. In a full natural langu...
A Predictive Approach to the Analysis of Intonation in Discourse
This paper presents an approach to the analysis of prosody in French discourse, based on a prediction of the unmarked default intonation matching the lexical and syntactic properties of the sequence. By confronting this default intonation with the actual intonation used by the speaker, all marked intonation elements can be identified. The prediction of default intonation takes into account word...
Journal: JLCL
Volume: 23
Publication date: 2008