Centrality Measures for Non-Contextual Graph-Based Unsupervised Single Document Keyword Extraction
نویسنده
چکیده
The manner in which keywords fulfill the role of being central to a document is frustratingly still an open question. In this paper, we hope to shed some light on the essence of keywords in scientific articles and thereby motivate the graph-based approach to keyword extraction. We identify the document model captured by the text graph generated as input to a number of centrality metrics, and overview what these metrics say about keywords. In doing so, we achieve state-of-the-art results in unsupervised non-contextual single document keyword extraction.
منابع مشابه
Effects of Graph Generation for Unsupervised Non-Contextual Single Document Keyword Extraction
Abstract. This paper presents an exhaustive study on the generation of graph input to unsupervised graph-based non-contextual single document keyword extraction systems. A concrete hypothesis on concept coordination for documents that are scientific articles is put forward, consistent with two separate graph models : one which is based on word adjacency in the linear text–an approach forming th...
متن کاملKeyword and Keyphrase Extraction Using Centrality Measures on Collocation Networks
Keyword and keyphrase extraction is an important problem in natural language processing, with applications ranging from summarization to semantic search to document clustering. Graph-based approaches to keyword and keyphrase extraction avoid the problem of acquiring a large in-domain training corpus by applying variants of PageRank algorithm on a network of words. Although graph-based approache...
متن کاملA Graph Degeneracy-based Approach to Keyword Extraction
We operate a change of paradigm and hypothesize that keywords are more likely to be found among influential nodes of a graph-ofwords rather than among its nodes high on eigenvector-related centrality measures. To test this hypothesis, we introduce unsupervised techniques that capitalize on graph degeneracy. Our methods strongly and significantly outperform all baselines on two datasets (short a...
متن کاملKeyword Extraction from a Single Document Using Centrality Measures
Keywords characterize the topics discussed in a document. Extracting a small set of keywords from a single document is an important problem in text mining. We propose a hybrid structural and statistical approach to extract keywords. We represent the given document as an undirected graph, whose vertices are words in the document and the edges are labeled with a dissimilarity measure between two ...
متن کاملGraph-Based Keyword Extraction for Single-Document Summarization
In this paper, we introduce and compare between two novel approaches, supervised and unsupervised, for identifying the keywords to be used in extractive summarization of text documents. Both our approaches are based on the graph-based syntactic representation of text and web documents, which enhances the traditional vector-space model by taking into account some structural document features. In...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014