Exploiting Discourse Relations between Sentences for Text Clustering
Authors
Abstract
Over the years, discourse relations have been shown to enhance many applications such as text summarization, question answering, and natural language generation. This paper proposes an approach that extends the benefit of discourse relations for natural language processing in a different direction. We exploit the discourse relations that exist between sentences to generate clusters of similar sentences from document sets. We first examined and defined the types of discourse relations that are useful for retrieving sentences with identical content. We then assigned these relations to each sentence pair using a machine learning method. Finally, we applied a discourse relation-based clustering algorithm to generate clusters of similar sentences. We evaluated our method by measuring the cohesion and separation of the clusters and compared it with a well-recognized clustering method. The experimental results show that our method performed significantly well, demonstrating that discourse relations between sentences can be exploited for text clustering.
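As a rough illustration of the final step described in the abstract, the sketch below groups sentences whose pairwise relation labels indicate similar content. It is not the paper's algorithm: the relation names (`identity`, `paraphrase`, `elaboration`), the function `cluster_by_relations`, and the use of union-find to merge linked sentences are all assumptions made for this example, and the relation labels are taken as given rather than predicted by a classifier.

```python
from collections import defaultdict

# Hypothetical relation labels that signal "same content".
# The abstract does not name the relation types it uses.
SIMILARITY_RELATIONS = {"identity", "paraphrase"}


def cluster_by_relations(num_sentences, labeled_pairs):
    """Group sentences that are linked by similarity-indicating relations.

    labeled_pairs is an iterable of (i, j, relation) tuples, where i and j
    are sentence indices and relation is the label assigned to that pair.
    """
    parent = list(range(num_sentences))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j, relation in labeled_pairs:
        if relation in SIMILARITY_RELATIONS:
            # Merge the two sentences' clusters.
            parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i in range(num_sentences):
        clusters[find(i)].append(i)
    return sorted(clusters.values())


# Toy input: five sentences with hand-assigned relation labels.
pairs = [
    (0, 1, "identity"),
    (1, 2, "paraphrase"),
    (2, 3, "elaboration"),  # not treated as a similarity relation here
    (3, 4, "identity"),
]
clusters = cluster_by_relations(5, pairs)
print(clusters)
```

With this toy input, sentences 0, 1, and 2 end up in one cluster and sentences 3 and 4 in another, because the `elaboration` link between 2 and 3 is ignored.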
Similar resources
The Prosody of Discourse Structure and Content in the Production of Persian EFL Learners
The present research addressed the prosodic realization of global and local text structure and content in the spoken discourse data produced by Persian EFL learners. Two newspaper articles were analyzed using Rhetorical Structure Theory. Based on these analyses, the global structure in terms of hierarchical level, the local structure in terms of the relative importance of text segments and the ...
Hybrid Approach to PDTB-styled Discourse Parsing for CoNLL-2015
This paper describes our end-to-end PDTB-styled discourse parser for the CoNLL-2015 shared task. We employed a machine learning-based approach to identify discourse relations between text spans for both explicit and implicit relations and employed a rule-based approach to extract arguments of the discourse relations. In particular, we focus on improving the implicit discourse relation identifica...
Exploiting Rhetorical Relations to Multiple Documents Text Summarization
Much previous research has shown that rhetorical relations can enhance many applications such as text summarization, question answering, and natural language generation. This work proposes an approach that extends the benefit of rhetorical relations to address the redundancy problem in cluster-based text summarization of multiple documents. We exploited rhetorical relati...
Uncovering Discourse Relations to Insert Connectives between the Sentences of an Automatic Summary
This paper presents a machine learning approach to find and classify discourse relations between two unseen sentences. It describes the process of training a classifier that aims to determine (i) whether there is any discourse relation between two sentences and, if a relation is found, (ii) which relation it is. The final goal of this task is to insert discourse connectives between sentences seekin...
Exploiting Discourse Analysis for Article-Wide Temporal Classification
In this paper we classify the temporal relations between pairs of events on an article-wide basis. This is in contrast to much of the existing literature, which focuses only on event pairs found within the same or adjacent sentences. To achieve this, we leverage discourse analysis, as we believe that it provides more useful semantic information than typical lexico-syntactic features....
Journal title:
Volume, issue:
Pages: -
Publication date: 2013